Introduction to the SegAnyGAussians (SAGA) Project
The SegAnyGAussians (SAGA) project tackles interactive segmentation of 3D Gaussian representations. Introduced in the SAGA paper (Segment Any 3D Gaussians), it lets you segment any object in a pre-trained 3D Gaussian Splatting scene, strengthening 3D object recognition and scene understanding. The implementation builds directly on 3D Gaussian Splatting.
Installation Guide
To get started, clone the repository from GitHub via SSH or HTTPS:
git clone git@github.com:Jumpat/SegAnyGAussians.git
or
git clone https://github.com/Jumpat/SegAnyGAussians.git
Next, install the necessary dependencies:
conda env create --file environment.yml
conda activate gaussian_splatting
You'll also need the pre-trained ViT-H SAM checkpoint (sam_vit_h_4b8939.pth), available from the official segment-anything repository. Place it under ./third_party/segment-anything/sam_ckpt.
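For example, you can fetch the checkpoint directly (the URL is the official SAM ViT-H download link; the target directory follows the layout above):
mkdir -p ./third_party/segment-anything/sam_ckpt
wget -P ./third_party/segment-anything/sam_ckpt https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth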
Preparing the Data
SAGA is demonstrated on datasets such as 360_v2 (Mip-NeRF 360), nerf_llff_data, and LERF. Keep each scene under a common data directory so the scripts can locate the images and camera data.
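As an illustration (the exact hierarchy is defined in the repository README; this sketch only shows the general shape, with scene names as examples):
./data
  /360_v2
    /garden
      /images
  /nerf_llff_data
    /fern
      /images
  /lerf_data
    /figurines
      /images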
The first step is to pre-train the 3D Gaussians with the standard 3DGS pipeline:
python train_scene.py -s <path to COLMAP or NeRF Synthetic dataset>
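For example, for the garden scene from 360_v2 (the path is illustrative and depends on where you placed the data):
python train_scene.py -s ./data/360_v2/garden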
Processing Data
After pre-training, generate the SAM masks:
python extract_segment_everything_masks.py --image_root <path to the scene data> --sam_checkpoint_path <path to the pre-trained SAM model> --downsample <1/2/4/8>
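For instance, extracting masks for the garden scene at 4x downsampling (a hypothetical invocation; heavier downsampling is mainly useful for high-resolution captures):
python extract_segment_everything_masks.py --image_root ./data/360_v2/garden --sam_checkpoint_path ./third_party/segment-anything/sam_ckpt/sam_vit_h_4b8939.pth --downsample 4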
Then compute the corresponding mask scales:
python get_scale.py --image_root <path to the scene data> --model_path <path to the pre-trained 3DGS model>
If you want to experiment with open-vocabulary segmentation, also extract CLIP features:
python get_clip_features.py --image_root <path to the scene data>
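Continuing the illustrative garden example, and assuming the pre-training run above wrote its 3DGS model to ./output/garden:
python get_scale.py --image_root ./data/360_v2/garden --model_path ./output/garden
python get_clip_features.py --image_root ./data/360_v2/garden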
Training and Segmentation
Once your data is prepared, train the 3D Gaussian affinity features:
python train_contrastive_feature.py -m <path to the pre-trained 3DGS model> --iterations 10000 --num_sampled_rays 1000
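This step distills the 2D SAM masks into per-Gaussian affinity features through contrastive training. With the running example it becomes (paths illustrative):
python train_contrastive_feature.py -m ./output/garden --iterations 10000 --num_sampled_rays 1000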
SAGA also provides a graphical user interface (GUI) for interactive 3D segmentation, as well as a Jupyter Notebook with step-by-step guidance. To launch the GUI:
python saga_gui.py --model_path <path to the pre-trained 3DGS model>
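For the running example (path illustrative; the GUI expects the affinity features trained in the previous step to be available for this model):
python saga_gui.py --model_path ./output/garden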
Using the GUI
The GUI features user-friendly controls for manipulating viewpoints and segmentations:
- Viewpoint control:
  - Left drag to rotate
  - Middle drag to pan
  - Right click to place input prompts
- Segmentation control:
  - Adjust parameters such as the scale and score thresholds
  - Visualize RGB, PCA decomposition, similarity maps, and 3D clustering
The segmentation and clustering modes provide tools for working with single or multiple point prompts and for adjusting the visual feedback during segmentation.
Rendering
To render a segmented model, save your results through the GUI and run:
python render.py -m <path to the pre-trained 3DGS model> --precomputed_mask <path to the segmentation results> --target scene --segment
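As a hypothetical example, if the GUI saved its segmentation to ./segmentation_res/final_mask.pt (the file name is an assumption; use whatever path the GUI reports):
python render.py -m ./output/garden --precomputed_mask ./segmentation_res/final_mask.pt --target scene --segment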
Conclusion
The SAGA project is a robust platform for 3D Gaussian segmentation, with straightforward installation and data preparation, making it a valuable tool for research in 3D modeling and computational graphics. The authors acknowledge prior works such as GARField, OmniSeg3D, and Gaussian Splatting for their contributions. If you find SAGA instrumental in your research, consider citing the paper:
@article{cen2023saga,
title={Segment Any 3D Gaussians},
author={Jiazhong Cen and Jiemin Fang and Chen Yang and Lingxi Xie and Xiaopeng Zhang and Wei Shen and Qi Tian},
year={2023},
journal={arXiv preprint arXiv:2312.00860},
}