Uni3D
Uni3D is a scalable 3D pretraining framework for large-scale representation learning, with models scaled up to one billion parameters. It uses a ViT initialized from 2D pretrained weights and trains it to align 3D point cloud features with the features of image-text models. By reusing 2D pretrained models and image-text alignment, Uni3D carries the capabilities of 2D models over to 3D, setting new records on a range of 3D tasks. The open-source project includes tools for semantic coherence, model weights, evaluation code, and more, and encourages community collaboration and progress in multimodal intelligence.
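The alignment idea described above can be illustrated with a small sketch. It assumes a CLIP-style symmetric contrastive objective, which this description does not specify, and it is not taken from the Uni3D code. The function names (`alignment_loss`, `cross_entropy_diagonal`) and the temperature value are hypothetical, and the feature vectors stand in for the outputs of a point cloud encoder and a frozen image-text model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity_matrix(a_batch, b_batch):
    """Entry [i][j] is the cosine similarity between a_batch[i] and b_batch[j]."""
    a_norm = [l2_normalize(v) for v in a_batch]
    b_norm = [l2_normalize(v) for v in b_batch]
    return [[sum(x * y for x, y in zip(a, b)) for b in b_norm] for a in a_norm]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def alignment_loss(point_feats, target_feats, temperature=0.07):
    """Symmetric contrastive loss between paired feature batches.

    point_feats[i] is assumed to come from a 3D point cloud encoder, and
    target_feats[i] from a frozen image or text encoder for the same object.
    The temperature of 0.07 is an illustrative assumption.
    """
    sims = cosine_similarity_matrix(point_feats, target_feats)
    logits = [[s / temperature for s in row] for row in sims]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_transposed)
    )


if __name__ == "__main__":
    point_feats = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[2.0, 0.0], [0.0, 3.0]]      # same directions as the point features
    mismatched = [[0.0, 3.0], [2.0, 0.0]]   # pairs swapped
    print("matched loss:   ", alignment_loss(point_feats, matched))
    print("mismatched loss:", alignment_loss(point_feats, mismatched))
```

The loss is low when each point cloud feature points in the same direction as its paired image or text feature, and high when the pairs are swapped. Minimizing an objective of this kind is one way a 3D encoder could be pulled into an existing image-text embedding space.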