semantic-segmentation - Explore Advanced Semantic Segmentation Models with Extensive Customization in PyTorch

Semantic Segmentation Project Introduction

Semantic segmentation is a computer vision task that assigns a class label to every pixel in an image. This project aims to simplify the use of state-of-the-art semantic segmentation models in PyTorch and supports a broad set of datasets to help developers build efficient segmentation solutions.
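
To make "a class label for every pixel" concrete, here is a minimal, framework-free sketch. It assumes the model has already produced one score per class per pixel; the function name predict_mask and the class labels are illustrative and not part of this project. In PyTorch the same step is typically an argmax over the class dimension of the output tensor.

```python
def predict_mask(scores):
    """Pick the highest-scoring class for every pixel.

    scores[c][y][x] is the score of class c at pixel (y, x).
    Returns mask[y][x], the index of the winning class.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy 2x3 image with 3 classes (say 0 = road, 1 = car, 2 = sky; labels are illustrative).
scores = [
    [[0.90, 0.2, 0.1], [0.8, 0.1, 0.3]],  # scores for class 0
    [[0.05, 0.7, 0.2], [0.1, 0.6, 0.3]],  # scores for class 1
    [[0.05, 0.1, 0.7], [0.1, 0.3, 0.4]],  # scores for class 2
]
mask = predict_mask(scores)
print(mask)  # [[0, 1, 2], [0, 1, 2]]
```

The output has the same height and width as the input image, with one integer class index per pixel.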

Overview

This project is a resource for exploring and implementing semantic segmentation tasks. It combines a range of high-accuracy models with diverse datasets and covers both standard segmentation tasks and customizable use cases. Future updates are expected to align with the latest PyTorch versions and to add new models and documentation for custom datasets. The anticipated completion date for these updates is May 2024.

Planned Updates

The project is set to undergo significant improvements, aiming to enhance the entire training pipeline. Here's what users can look forward to:

Integration of baseline pre-trained models and updated ideas.
Seamless compatibility with state-of-the-art backbone models, supplemented by detailed tutorials.
Guidance for employing the models with custom datasets.
Distributed training support for better performance and scalability.

Additional planned changes include reducing the number of datasets and models to focus on significant examples, while providing tutorials for custom datasets. The current augmentation methods will be replaced with the official torchvision v2 transforms (torchvision.transforms.v2), and the project will continue to support model conversion and inference with other frameworks.

Current Features

The project currently supports various tasks, including:

Scene Parsing
Human Parsing
Face Parsing
Medical Image Segmentation (upcoming)

In addition, it offers:

Over 20 datasets
More than 15 state-of-the-art backbones
10+ state-of-the-art semantic segmentation models
Export and inference capabilities in PyTorch, ONNX, TFLite, and OpenVINO

Model Zoo

The project supports numerous backbones, including well-known models like ResNet, MobileNetV2, MiT, and ConvNeXt, each with its unique features and benefits. The supported segmentation methods include popular approaches like FCN, UPerNet, and SegFormer, alongside cutting-edge methods like CondNet and Lawin. Some standalone models such as BiSeNetv2 and DDRNet are also supported, offering flexibility in model choice.

Supported Datasets

The project can handle a wide range of datasets across various domains:

Scene Parsing: Includes datasets like ADE20K and CityScapes.
Human Parsing: Covers datasets such as MHPv2 and CIHP.
Face Parsing: Features datasets including HELEN and CelebAMaskHQ.
Other categories are also supported with datasets like SUIM for underwater scenarios.

For a full list of supported datasets, users can refer to the project's documentation.

Using the Project

Installation

Ensure you have Python 3.6 or above, PyTorch 1.8.1 or above, and torchvision 0.9.1 or above. Clone the repository and install the project with:

$ git clone https://github.com/sithu31296/semantic-segmentation
$ cd semantic-segmentation
$ pip install -e .
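
After installing, it can help to confirm which versions are actually present. The sketch below is not part of the project: it treats the versions named in the installation note as lower bounds (an assumption), the helper names are hypothetical, and it needs Python 3.8 or later for importlib.metadata.

```python
import re
from importlib import metadata

# Versions from the installation note, treated here as minimums (an assumption).
MINIMUMS = {"torch": "1.8.1", "torchvision": "0.9.1"}


def parse_version(text):
    """Turn a version string such as '1.8.1+cu111' into a tuple like (1, 8, 1)."""
    core = text.split("+", 1)[0]  # drop local build tags such as '+cu111'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits of each part
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '2.1' to (2, 1, 0)
        numbers.append(0)
    return tuple(numbers)


def check_requirements(minimums=MINIMUMS):
    """Report the installed version of each package and whether it meets the minimum."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[name] = installed + " (ok)"
        else:
            report[name] = installed + " (older than " + minimum + ")"
    return report


for name, status in check_requirements().items():
    print(name + ": " + status)
```

The comparison looks only at the first three numeric components, so pre-release suffixes are ignored.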

Configuration

Create a configuration file in the configs directory. Training, evaluation, and prediction all depend on this file for their settings.

Training

To train a model on a single GPU, use:

$ python tools/train.py --cfg configs/<CONFIG_FILE_NAME>.yaml

For multi-GPU training, enable the DDP field in the config file and execute:

$ python -m torch.distributed.launch --nproc_per_node=2 --use_env tools/train.py --cfg configs/<CONFIG_FILE_NAME>.yaml
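
With --use_env, the launcher passes each process its local rank through the LOCAL_RANK environment variable rather than a --local_rank command-line argument. It also sets RANK and WORLD_SIZE. The sketch below shows how a script could read these values. It is not the project's train.py, and the function name read_distributed_env is hypothetical.

```python
import os


def read_distributed_env(environ=None):
    """Return rank information set by the launcher, with single-process defaults."""
    env = os.environ if environ is None else environ
    return {
        "rank": int(env.get("RANK", 0)),              # global index of this process
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index of this process on its machine
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of processes
    }


# Simulate the second of two processes started with --nproc_per_node=2.
info = read_distributed_env({"RANK": "1", "LOCAL_RANK": "1", "WORLD_SIZE": "2"})
print(info)  # {'rank': 1, 'local_rank': 1, 'world_size': 2}
```

When none of the variables are set, as in a plain single-GPU run, the defaults describe a single process with rank 0.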

Evaluation and Inference

To evaluate a trained model, set MODEL_PATH in the configuration file to point to it. For inference, adjust the config parameters to specify the desired model and dataset. Results are saved in the directory given by SAVE_DIR.
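
A quick check before a long evaluation run can catch a mistyped path early. The sketch below assumes the configuration has already been loaded into a plain dictionary with MODEL_PATH and SAVE_DIR as top-level keys. That flat layout, the function name prepare_evaluation, and the file name model.pth are assumptions for illustration; the project's actual config structure may differ.

```python
import tempfile
from pathlib import Path


def prepare_evaluation(cfg):
    """Check that MODEL_PATH is a file and make sure SAVE_DIR exists.

    cfg is a plain dict here; the flat key layout is an assumption.
    """
    model_path = Path(cfg["MODEL_PATH"])
    if not model_path.is_file():
        raise FileNotFoundError(f"MODEL_PATH is not a file: {model_path}")

    save_dir = Path(cfg["SAVE_DIR"])
    save_dir.mkdir(parents=True, exist_ok=True)  # create the output directory if missing
    return model_path, save_dir


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "model.pth"  # hypothetical checkpoint name
    weights.write_bytes(b"")           # empty stand-in for real weights
    cfg = {"MODEL_PATH": str(weights), "SAVE_DIR": str(Path(tmp) / "output")}
    model_path, save_dir = prepare_evaluation(cfg)
    save_dir_created = save_dir.is_dir()
    print(model_path.name, save_dir_created)
```

A missing weights file raises FileNotFoundError before any output directory is created.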

Export and Inference with Other Frameworks

The project allows conversion to ONNX and CoreML formats using:

$ python tools/export.py --cfg configs/<CONFIG_FILE_NAME>.yaml

It also supports inference in ONNX, OpenVINO, and TFLite formats using corresponding scripts.

Overall, the Semantic Segmentation project provides a robust and flexible framework for developing segmentation models tailored to specific needs while ensuring ease of use and high performance.