Introduction to RobustCap
RobustCap is a real-time human motion capture method presented at SIGGRAPH Asia 2023. It fuses monocular images with sparse IMU (Inertial Measurement Unit) signals to achieve accurate and efficient motion tracking, which benefits applications such as sports analytics and virtual reality.
System Overview
The implementation and evaluation code for RobustCap are provided in the project repository. The associated research article is available on arXiv, and further information can be found on the project page. The system maintains real-time performance even under challenging conditions such as occlusion, fast sports movements, and varied lighting.
Installation Guide
To get started with RobustCap, users need to set up a specific Python environment using Conda:
conda create -n RobustCap python=3.8
conda activate RobustCap
pip install -r requirements.txt
Users also need to install the PyTorch build matching their CUDA version, following the instructions on the official PyTorch website.
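As an illustration only (the CUDA tag below is an assumption; pick the command matching your driver from pytorch.org, and note the repository may pin a specific PyTorch version):
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118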
Data Requirements
RobustCap requires specific data components for operation:
- SMPL Files: Users need to download these files from either the specified Google Drive link or the official SMPL website and place them in the models/ directory.
- Pre-trained Models and Test Data: These can be obtained from the provided Google Drive link and should be placed in the data/ directory.
- Evaluation Data: For AIST++ evaluations, users should download the non-aligned files and place them under data/dataset_work/AIST.
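Assuming the repository root as the working directory, the resulting layout should look roughly like this (the individual file names inside each folder depend on the downloaded archives):
RobustCap/
├── models/              # SMPL model files
└── data/                # pre-trained models and test data
    └── dataset_work/
        └── AIST/        # non-aligned AIST++ evaluation files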
Evaluation Process
The system includes evaluation scripts covering several datasets: AIST++, TotalCapture, 3DPW, and 3DPW-OCC. To evaluate the system, run:
python evaluate.py
Note that the results might slightly differ from the published paper due to optimization randomness.
Visualization Options
RobustCap offers several options for visualizing motion capture results:
Using Open3D or Overlay
- The view_aist function within evaluate.py allows users to visualize specific sequences and camera settings.
- Setting vis=True shows overlay results, which requires additional data downloads.
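A minimal sketch of such a call (the parameter names seq_idx and cam_idx are assumptions; check the actual signature in evaluate.py):
from evaluate import view_aist

# Hypothetical arguments: visualize sequence 0 from camera 0.
# vis=True renders the overlay and requires the extra data downloads.
view_aist(seq_idx=0, cam_idx=0, vis=True)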
Using Unity
- Users can utilize the view_aist_unity function in evaluate.py for visualization.
- This involves downloading Unity assets, creating a 3D project, and using the provided Unity scripts to play back the motion data at a set frame rate.
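Analogously, a hedged sketch (the argument name is an assumption):
from evaluate import view_aist_unity

# Hypothetical argument: export sequence 0 for playback in the Unity project.
view_aist_unity(seq_idx=0)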
Live Demonstration
For those interested in live demos, the system can integrate with six Xsens Dot IMUs and a monocular webcam:
- Users need to configure appropriate IMU and camera parameters.
- Camera calibration is necessary, and a script is provided for this purpose.
- By connecting the IMUs and running provided scripts, users can render real-time results in Unity, providing an interactive experience.
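The calibration script itself ships with the repository; for orientation only, the following is a generic OpenCV checkerboard calibration sketch (board dimensions, square size, and the image glob are assumptions, and this is not the bundled script):
import glob
import cv2
import numpy as np

BOARD = (9, 6)     # inner corners per chessboard row and column (assumption)
SQUARE = 0.025     # square edge length in metres (assumption)

# 3D coordinates of the board corners in the board's own frame.
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
image_size = None
for path in glob.glob('calib_images/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        obj_points.append(objp)
        img_points.append(corners)
        image_size = gray.shape[::-1]  # (width, height)

# Solve for the intrinsic matrix K and the lens distortion coefficients.
_, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, image_size, None, None)
print('intrinsics:\n', K)
print('distortion:', dist.ravel())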
Training the Model
To train the model, users can execute the script net/sig_mp.py.
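From the repository root, that is:
python net/sig_mp.py
Any training configuration the script accepts is defined within the script itself.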
Citation Information
For academic purposes, users are encouraged to cite RobustCap using the following reference format:
@inproceedings{pan2023fusing,
title={Fusing Monocular Images and Sparse IMU Signals for Real-time Human Motion Capture},
author={Pan, Shaohua and Ma, Qi and Yi, Xinyu and Hu, Weifeng and Wang, Xiong and Zhou, Xingkang and Li, Jijunnan and Xu, Feng},
booktitle={SIGGRAPH Asia 2023 Conference Papers},
pages={1--11},
year={2023}
}
In summary, RobustCap is an advanced solution for accurate real-time human motion capture, making it a valuable tool for research and applications in dynamic scenarios.