Semantic Segmentation Project Introduction
Semantic segmentation is an exciting field of computer vision that involves classifying each pixel in an image into a particular class. This project focuses on simplifying the use of state-of-the-art semantic segmentation models and providing extensive datasets to help developers build efficient segmentation solutions using PyTorch.
Overview
This project serves as a comprehensive resource for anyone looking to explore and implement semantic segmentation tasks. With a variety of high-accuracy models and diverse datasets, it caters to both traditional segmentation needs and customizable use cases. Future updates are expected to align with the latest PyTorch versions and include new models and documentation for custom datasets. The anticipated completion date for these updates is May 2024.
Planned Updates
The project is set to undergo significant improvements, aiming to enhance the entire training pipeline. Here's what users can look forward to:
- Integration of baseline pre-trained models and updated ideas.
- Seamless compatibility with state-of-the-art backbone models, supplemented by detailed tutorials.
- Guidance for employing the models with custom datasets.
- Distributed training support for better performance and scalability.
Additional planned changes include reducing the number of datasets and models to focus on significant examples while providing tutorials for custom datasets. Current augmentation methods will be replaced with official torchvisionv2 transforms, and the project will continue to support model conversion and inference with other frameworks.
Current Features
The project currently supports various tasks, including:
- Scene Parsing
- Human Parsing
- Face Parsing
- Medical Image Segmentation (upcoming)
In addition, it offers:
- Over 20 datasets
- More than 15 state-of-the-art backbones
- 10+ state-of-the-art semantic segmentation models
- Export and inference capabilities in PyTorch, ONNX, TFLite, and OpenVINO
Model Zoo
The project supports numerous backbones, including well-known models like ResNet, MobileNetV2, MiT, and ConvNeXt, each with its unique features and benefits. The supported segmentation methods include popular approaches like FCN, UPerNet, and SegFormer, alongside cutting-edge methods like CondNet and Lawin. Some standalone models such as BiSeNetv2 and DDRNet are also supported, offering flexibility in model choice.
Supported Datasets
The project can handle a wide range of datasets across various domains:
- Scene Parsing: Includes datasets like ADE20K and CityScapes.
- Human Parsing: Covers datasets such as MHPv2 and CIHP.
- Face Parsing: Features datasets including HELEN and CelebAMaskHQ.
- Other categories are also supported with datasets like SUIM for underwater scenarios.
For a full list of supported datasets, users can refer to the project's documentation.
Using the Project
Installation
Ensure you have Python 3.6 or above, along with PyTorch 1.8.1 and torchvision 0.9.1. Clone the repository and install the project with:
$ git clone https://github.com/sithu31296/semantic-segmentation
$ cd semantic-segmentation
$ pip install -e .
Configuration
Users need to create a configuration file in the configs
directory, which is essential for training, evaluation, and prediction processes.
Training
To train a model on a single GPU, use:
$ python tools/train.py --cfg configs/CONFIG_FILE.yaml
For multi-GPU training, enable the DDP
field in the config file and execute:
$ python -m torch.distributed.launch --nproc_per_node=2 --use_env tools/train.py --cfg configs/<CONFIG_FILE_NAME>.yaml
Evaluation and Inference
Configure the MODEL_PATH
in the configuration file to evaluate your trained model, or adjust the config parameters for inference to specify the desired models and datasets. The results will be saved in the designated SAVE_DIR
.
Export and Inference with Other Frameworks
The project allows conversion to ONNX and CoreML formats using:
$ python tools/export.py --cfg configs/<CONFIG_FILE_NAME>.yaml
It also supports inference in ONNX, OpenVINO, and TFLite formats using corresponding scripts.
Overall, the Semantic Segmentation project provides a robust and flexible framework for developing segmentation models tailored to specific needs while ensuring ease of use and high performance.