EasyCV - Holistic Computer Vision Toolkit Focused on Self-Supervised Learning and Transformer Models

EasyCV: A Comprehensive Computer Vision Toolbox

Introduction

EasyCV is an all-in-one computer vision toolbox built on PyTorch. It focuses on self-supervised learning, transformer-based models, and core computer vision tasks such as image classification, metric learning, object detection, and pose estimation.

Major Features

State-of-the-art Self-Supervised Learning Algorithms

EasyCV provides state-of-the-art self-supervised learning (SSL) algorithms, including SimCLR, MoCo V2, SwAV, and DINO, as well as MAE, which is based on masked image modeling. It also provides benchmarking tools for evaluating SSL models.

Vision Transformers

The toolbox provides state-of-the-art transformer models such as Vision Transformer (ViT), Swin Transformer, and the DETR series, trained with either supervised or self-supervised learning. It also supports all pretrained models from the timm library.

Extensibility and Functionality

Beyond SSL, EasyCV supports image classification, object detection, and metric learning. Its modular design lets developers extend it by adding new components or combining them with existing ones. It offers simple interfaces for inference, and its models can be deployed on PAI-EAS as an online service with automatic scaling and service monitoring.
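
The modular design described above can be illustrated with a generic registry pattern, in which components register themselves by name and are then built from a plain configuration dictionary. This is a minimal sketch in plain Python, not EasyCV's actual API: the names `Registry`, `BACKBONES`, and `TinyBackbone` are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict


class Registry:
    """Minimal name-to-class registry (hypothetical, not EasyCV's API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, type] = {}

    def register(self, cls: type) -> type:
        """Class decorator that stores cls under its own class name."""
        if cls.__name__ in self._modules:
            raise KeyError(f"{cls.__name__} is already registered in {self.name}")
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: Dict[str, Any]) -> Any:
        """Instantiate the class named by cfg['type'] with the remaining keys."""
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop("type")]
        return cls(**args)


BACKBONES = Registry("backbone")


@BACKBONES.register
class TinyBackbone:
    """Placeholder component standing in for a real model backbone."""

    def __init__(self, depth: int = 2) -> None:
        self.depth = depth

    def describe(self) -> str:
        return f"TinyBackbone(depth={self.depth})"


# A new component becomes available to configs as soon as it is registered.
backbone = BACKBONES.build({"type": "TinyBackbone", "depth": 4})
print(backbone.describe())
```

With this kind of design, adding a component means defining a class and registering it; existing configuration-driven code can then refer to it by name without further changes.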

Efficiency

EasyCV is built for efficiency, supporting multi-GPU and multi-worker training. It uses DALI to speed up data I/O and preprocessing, and TorchAccelerator and fp16 to accelerate training. For inference, it optimizes models using JIT script and PAI-Blade.
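
As background on fp16 training: half precision has a much narrower numeric range than fp32, so very small gradient values can round to zero. A common remedy in mixed-precision training generally is loss scaling. The sketch below illustrates the idea with Python's standard `struct` module only. It does not show how EasyCV implements fp16; the gradient value and the scale factor are assumed for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Assumed values, chosen only to make the effect visible.
grad = 1e-8          # a small gradient value
loss_scale = 1024.0  # hypothetical static loss scale

# Stored directly in fp16, the gradient underflows to 0.0.
naive = to_fp16(grad)

# Scaled before the fp16 cast, the value survives; it is then
# unscaled in higher precision before the weight update.
scaled = to_fp16(grad * loss_scale)
recovered = scaled / loss_scale

print(naive)
print(recovered)
```

The recovered value is close to, but not exactly, the original gradient, because the scaled value is still rounded to the nearest representable fp16 number.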

What's New

Version 0.11.0 (released May 9, 2023):
- Integrated EasyCV as a plug-in for ModelScope.

Version 0.10.0 (released March 6, 2023):
- Added the segmentation model STDC and the skeleton-based video recognition model STGCN.
- Added support for ReID and Multi-lens MOT.

Version 0.9.0 (released January 17, 2023):
- Added support for Single-lens MOT.
- Added support for the video recognition models X3D and SWIN-video.

YOLOX-PAI (released August 31, 2022):
- Achieved state-of-the-art results in the 40~50 mAP range, at under 1 ms.
- Provided an export and prediction API for end-to-end object detection.

Technical Resources and Tutorials

EasyCV provides tutorials and resources to help users get started and explore its capabilities:

- Self-supervised learning, image classification, and metric learning: a tutorial for each area.
- Object detection and model compression: tutorials based on YOLOX-PAI.
- Advanced features: tutorials on using mmdetection models and the batch prediction tools within EasyCV.

Model Zoo and Data Hub

EasyCV's model zoo includes models in categories such as self-supervised learning, image classification, object detection, and segmentation. The Data Hub provides dataset information for multiple scenarios, so that models can be fine-tuned or evaluated easily.

License and Contact

The EasyCV project is licensed under the Apache License 2.0. The project is maintained by the PAI-CV team. For enterprise service support or more information, users can reach the team via DingDing or email.

Conclusion

EasyCV combines self-supervised learning algorithms, transformer models, and standard computer vision tasks in a single PyTorch-based toolbox. For both research and production deployment, it provides an extensible framework to build on.