Overview of GauHuman
GauHuman is a project developed by Shoukang Hu and Ziwei Liu at S-Lab, Nanyang Technological University, and accepted to CVPR 2024. It reconstructs articulated Gaussian splatting models of humans from monocular videos, combining fast optimization with real-time rendering: training takes roughly one to two minutes, and rendering runs at up to 189 frames per second (FPS). Visual demonstrations are available on the project's official page.
Highlights and Recent Updates
The project was updated in December 2023 with the release of training and inference code for the ZJU-MoCap_refine and MonoCap datasets, reflecting its ongoing development and alignment with current research in the field.
Technical Requirements
GauHuman requires NVIDIA GPUs due to its computational demands. To simplify environment setup, the developers recommend managing the Python environment with Anaconda. The key installation steps are:
- Create a new Anaconda environment dedicated to GauHuman.
- Install matching versions of PyTorch, Torchvision, and Torchaudio with NVIDIA CUDA support.
- Build and install the bundled submodules diff-gaussian-rasterization and simple-knn, as well as an upgraded version of KNN_CUDA.
- Install the remaining dependencies listed in the requirement.txt file.
The developers provide scripts and instructions that keep this setup manageable; the sketch below walks through the sequence.
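As a minimal sketch of these steps, the commands might look like the following. The environment name, Python and CUDA versions, and submodule paths are assumptions chosen for illustration; defer to the exact commands in the official README.

```bash
# Create and activate a dedicated environment (name and Python version assumed).
conda create -n gauhuman python=3.8 -y
conda activate gauhuman

# Install PyTorch, Torchvision, and Torchaudio with CUDA support
# (the CUDA 11.8 wheel index is only an example).
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Build and install the bundled rasterization and nearest-neighbour submodules.
pip install submodules/diff-gaussian-rasterization
pip install submodules/simple-knn

# Upgraded KNN_CUDA wheel (install path taken from the KNN_CUDA project).
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl

# Remaining dependencies.
pip install -r requirement.txt
```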
Dataset Preparation
GauHuman is trained and evaluated on the ZJU-MoCap_refine and MonoCap datasets. Users are directed to the dataset setup instructions in the Instant-NVR repository, which walk through downloading and preprocessing the necessary data.
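For orientation, the prepared data might end up arranged roughly as sketched below. The data/ root and the sequence names are assumptions based on Instant-NVR conventions, not paths confirmed by the GauHuman repository.

```bash
# Hypothetical layout after following the Instant-NVR preparation steps:
#
#   data/
#   ├── zju_mocap_refine/   # e.g. my_377, my_386, ... (sequence names assumed)
#   └── monocap/            # e.g. lan_images620_1300, ... (names assumed)
mkdir -p data/zju_mocap_refine data/monocap
```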
SMPL Models
GauHuman requires the SMPL (Skinned Multi-Person Linear) body models, which can be downloaded after registering on the official SMPL platform. Only the neutral model is needed, and it should be placed within the project's directory as outlined in the instructions.
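A hypothetical placement might look like this; the downloaded file name and the target path are both assumptions for illustration, so follow the directory layout given in the repository.

```bash
# Copy the registered, downloaded neutral SMPL model into the project tree.
# Both the source file name and the destination path below are assumptions.
cp basicModel_neutral_lbs_10_207_0_v1.0.0.pkl assets/SMPL_NEUTRAL.pkl
```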
Training Procedures
GauHuman provides scripted commands so that training can be launched efficiently:
- Training on the ZJU-MoCap_refine dataset is started with a designated bash script.
- The MonoCap dataset likewise has its own dedicated training script.
These one-line commands keep a complex pipeline approachable; both invocations are sketched below.
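As an illustration, the invocations likely resemble the following; the script names are assumptions, so use the scripts actually shipped with the repository.

```bash
# Train on ZJU-MoCap_refine (script name assumed).
bash run_zju_mocap_refine.sh

# Train on MonoCap (script name assumed).
bash run_monocap.sh
```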
Evaluation Methods
Evaluation of trained GauHuman models is likewise streamlined through commands tailored to each dataset:
- Evaluation on ZJU-MoCap_refine is initiated with its own bash script.
- The MonoCap dataset follows suit with a matching evaluation script.
These tools help researchers assess the performance and accuracy of their trained models; illustrative invocations follow.
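Again as a sketch, with the script names assumed rather than confirmed:

```bash
# Evaluate on ZJU-MoCap_refine (script name assumed).
bash eval_zju_mocap_refine.sh

# Evaluate on MonoCap (script name assumed).
bash eval_monocap.sh
```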
Citation and License
Researchers and developers who find GauHuman useful are encouraged to cite it with the BibTeX entry provided in the repository. The project is released under the S-Lab License; further details can be found in the project's LICENSE file.
Acknowledgements
GauHuman builds upon and draws inspiration from preceding work, including source code from Gaussian-Splatting, HumanNeRF, and Animatable NeRF. This acknowledgment reflects the collaborative nature of advances in machine learning and computer vision.
By distilling monocular video into articulated Gaussian splatting representations, GauHuman advances how human motion is reconstructed and rendered from simple video feeds, setting a new standard for training efficiency and real-time performance.