#OpenMMLab
mmocr
An open-source toolkit for text detection, text recognition, and information extraction, built on PyTorch and MMDetection. It covers a wide range of text processing tasks with state-of-the-art models and lets users customize core components such as optimizers and preprocessors. Features include visualization tools, validation utilities, and data converters. Aimed at researchers and developers, it supports a variety of datasets, and version 1.0.0 adds new datasets and improved documentation.
mmengine
MMEngine is a foundational library for training PyTorch-based deep learning models, with integrations for frameworks such as ColossalAI and DeepSpeed. It offers multiple training strategies, a flexible configuration system, and support for monitoring platforms including TensorBoard and MLflow. It serves as the training core for OpenMMLab projects and can also be used in other research areas and applications. The latest updates include customizable MLflow artifact locations and finer control over DeepSpeed parameters.
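To make the idea of a configuration system concrete, here is a minimal plain-Python sketch of a config-driven builder, where a dictionary names a component by a `type` key and a registry instantiates it. The `Registry` class, the `OPTIMIZERS` instance, and the toy `SGD` class are hypothetical stand-ins written for this illustration; they are not MMEngine's actual API, whose details should be taken from its documentation.

```python
# Illustrative only: a tiny registry plus config-driven builder in plain Python.
# All names here are hypothetical stand-ins, not MMEngine's real classes.


class Registry:
    """Maps a string name to a class so configs can refer to classes by name."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        # Used as a decorator: stores the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)  # copy so the caller's config is not mutated
        type_name = cfg.pop("type")
        if type_name not in self._modules:
            raise KeyError(f"{type_name!r} is not registered in {self.name!r}")
        # Remaining keys become constructor keyword arguments.
        return self._modules[type_name](**cfg)


OPTIMIZERS = Registry("optimizers")


@OPTIMIZERS.register
class SGD:
    """Toy optimizer that only records its hyper-parameters."""

    def __init__(self, lr, momentum=0.0):
        self.lr = lr
        self.momentum = momentum


config = {"type": "SGD", "lr": 0.01, "momentum": 0.9}
optimizer = OPTIMIZERS.build(config)
print(type(optimizer).__name__, optimizer.lr, optimizer.momentum)
```

The point of the pattern is that swapping a component means editing the dictionary rather than the training code.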
mmcv
MMCV provides essential tools for computer vision research, including image and video processing, annotation, visualization, and support for various CNN architectures. It offers high-performance CPU and CUDA operators and runs on Linux, Windows, and macOS. It is compatible with PyTorch 1.8 to 2.0 and requires Python 3.7+. Version 2.0.0 introduces data transformation modules and a new naming scheme for easier use. Choose 'mmcv' for the full suite or 'mmcv-lite' for a streamlined install. The library is designed for researchers who train deep learning models.
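The version requirements above can be checked before installing. The sketch below is a hypothetical helper (the names `parse_version` and `environment_ok` are invented for this example, not part of MMCV) that encodes only the bounds stated in this listing: Python 3.7+ and PyTorch 1.8 to 2.0. The official installation guide remains the authoritative source for supported versions.

```python
import itertools
import sys


def parse_version(text):
    """Return (major, minor) from a version string such as '2.0.1+cu118'."""
    core = text.split("+")[0]  # drop local build suffixes like '+cu118'
    numbers = []
    for piece in core.split(".")[:2]:
        # Keep only the leading digits, so '0rc1' is read as 0.
        leading = "".join(itertools.takewhile(str.isdigit, piece))
        numbers.append(int(leading) if leading else 0)
    return tuple(numbers)


def environment_ok(python_version, torch_version):
    """Check the bounds quoted in the listing (an assumption, not an official rule)."""
    python_ok = tuple(python_version[:2]) >= (3, 7)
    torch_ok = (1, 8) <= parse_version(torch_version) <= (2, 0)
    return python_ok and torch_ok


# The PyTorch version string is passed in by hand so the sketch runs without torch installed.
print(environment_ok(sys.version_info, "2.0.1+cu118"))
```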
mmagic
The toolkit supports generative AI for a range of image and video editing tasks and is built on the OpenMMLab 2.0 framework. It integrates state-of-the-art models for text-to-image diffusion and 3D generation, alongside established techniques such as GANs and CNNs. Aimed at AIGC research, it runs on Python 3.9+ and PyTorch 2.0+.
mmyolo
MMYOLO is an open-source toolbox in the OpenMMLab ecosystem for implementing YOLO-series algorithms. It supports features such as YOLOv5 instance segmentation and the real-time object detector RTMDet, and targets both speed and accuracy in object detection, rotated object detection, and instance segmentation. Built on PyTorch and MMDetection, it has extensive documentation and a modular design. Training is reported to be 2.6 times faster than in earlier versions. The toolkit suits developers building high-performance computer vision solutions.
mmdeploy
MMDeploy is an open-source toolset for deploying deep learning models efficiently. It supports over 2,300 AI models across backends and formats such as ONNX, NCNN, TensorRT (TRT), and OpenVINO, and converts Torch models to these formats for a wide range of hardware. Designed for the OpenMMLab ecosystem, it works with models from multiple codebases such as mmdet, mmseg, and mmocr. It includes a C/C++ SDK for customization and supports multiple inference backends on Linux, Windows, macOS, and Android.
mmaction2
MMAction2 is an open-source video understanding toolbox built on PyTorch and part of the OpenMMLab project. Its flexible architecture can be customized and supports action recognition, localization, spatio-temporal detection, skeleton-based detection, and video retrieval. The v1.2.0 release adds new models and datasets, including VindLU, MobileOne TSN/TSM, and MSVD video retrieval, along with detailed documentation and unit tests.
mim
MIM simplifies the installation and management of OpenMMLab projects, offering a unified command interface for tasks such as training, testing, and hyper-parameter search. It handles both package management and model management, so new components can be added without manual setup. MIM is customizable and released under the Apache 2.0 license.
mmsegmentation
MMSegmentation is a PyTorch-based semantic segmentation toolbox with unified benchmarking and a modular design, supporting methods such as DeepLabV3 and PSPNet. The v1.x line adds flexibility and new features, including monocular depth estimation and open-vocabulary segmentation. Documentation and tutorials cover migration from earlier versions, and a model zoo and community resources are available.
mmdetection
MMDetection, part of OpenMMLab, is a versatile PyTorch-based toolbox for detection tasks, with an emphasis on GPU efficiency and strong benchmark results. It covers object detection, instance segmentation, and panoptic segmentation, and is designed for customization and integration with other open-source tools. It includes RTMDet for real-time use, which balances parameter count and accuracy across model sizes. It relies on MMEngine for model training and MMCV for computer vision operations, and is suited to both research and practical applications.
mmpretrain
MMPreTrain is an open-source pre-training toolbox from OpenMMLab that requires PyTorch 1.8+ and offers a variety of backbones and models for supervised, self-supervised, and multi-modality learning. It provides efficient, extensible solutions for image classification, visual question answering, and more, and recent updates add LLaVA 1.5 and MiniGPT-4. Its model zoo and tutorials make it a useful resource for academic research.
mmdetection3d
MMDetection3D is an open-source library for 3D object detection that handles diverse indoor and outdoor datasets such as KITTI and nuScenes. It provides a range of detectors with efficient training and evaluation pipelines, along with customizable components that support both 3D detection and semantic segmentation.