#semantic segmentation

Pytorch-UNet
This PyTorch implementation of U-Net targets high-definition image segmentation and was built for Kaggle's Carvana Image Masking Challenge. It ships with Docker support for straightforward deployment and supports mixed-precision training. The model reports a Dice coefficient of 0.988423 on over 100k test images. The project also suits other segmentation tasks, such as medical and portrait segmentation, and logs training progress live to Weights & Biases. Pretrained models are available for quick use.
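For readers unfamiliar with the metric quoted above, the following is a minimal sketch of the Dice coefficient for two binary masks. It is an illustration in plain Python, not the repository's own code: the function name `dice_coefficient`, the flat-list mask representation, and the smoothing term `eps` are all assumptions made here.

```python
def dice_coefficient(pred, target, eps=1e-6):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flat binary masks.

    `pred` and `target` are equal-length sequences of 0/1 values.
    `eps` (an assumption of this sketch) avoids division by zero
    when both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Total foreground pixels counted across both masks.
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


# Toy example: 3 predicted foreground pixels, 3 true ones, 2 shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
```

A score of 1.0 means the predicted and reference masks coincide exactly, so a value such as 0.988423 indicates very close overlap.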
SAN
This project presents SAN, a framework that employs a pre-trained vision-language model for open-vocabulary semantic segmentation by treating it as a region recognition task. It leverages a side network attached to the CLIP model to handle mask proposals and attention biases, ensuring efficient and accurate segmentation with minimal additional parameters. SAN is validated on standard benchmarks, showing improved performance with fewer parameters and faster inference. The design ensures compatibility with existing CLIP features, supporting end-to-end training for adaptability in semantic tasks without sacrificing precision.
urban_seg
This project performs semantic segmentation with the Unicom model, which is pre-trained on 400 million images, and can be trained with as few as four images. Run 'train_one_gpu.py' for a quick start on a single GPU, or 'train_multi_gpus.py' for multi-GPU training. Setup guides cover configuration. A QQ group (679897018) is available for discussion.
labelme
Labelme is a graphical image annotation tool written in Python. It supports annotation with polygons, rectangles, circles, and lines, handles both images and video, and allows GUI customization. Annotations can be exported to VOC and COCO formats for semantic and instance segmentation. Guides cover use on different operating systems, which makes the tool a practical choice for producing precise image annotations for machine learning.
PaddleSeg
PaddleSeg is an image segmentation toolkit built on PaddlePaddle that covers the workflow from model training to deployment. It works with low-code tools such as PaddleX, featuring 19 integrated segmentation models and simple Python APIs, with support for over 200 models across tasks such as object detection and OCR. The toolkit includes 45+ model algorithms and 140+ pre-trained models. Recent updates improve multi-label segmentation, introduce lightweight vision models, and refine training compression methods. Applications include healthcare, industry, and remote sensing.
pytorch-3dunet
This PyTorch implementation supports 3D and 2D U-Nets, including variants such as the Residual 3D U-Net with Squeeze-and-Excitation blocks. It is designed for semantic segmentation and regression on medical and biological datasets. The tool accepts multi-channel inputs, offers several loss functions for imbalanced data, and uses NVIDIA GPUs to speed up training. Training and prediction are configured through YAML files and run on different platforms, including Windows and OS X. Included examples demonstrate cell segmentation.
Open3D-ML
Open3D-ML extends Open3D for 3D data processing with machine learning capabilities, supporting semantic point cloud segmentation, among other tasks. It is compatible with TensorFlow and PyTorch, offering pretrained models, training frameworks, and visualization functionalities, suitable for diverse 3D application needs.
DAFormer
DAFormer is a network architecture for unsupervised domain adaptation in semantic segmentation that addresses limitations of previous models. It combines a Transformer encoder with a multi-level context-aware feature fusion decoder, supported by training strategies such as Rare Class Sampling and Learning Rate Warmup. Together these yield improvements of 10.8 mIoU for GTA to Cityscapes and 5.4 mIoU for Synthia to Cityscapes. DAFormer also handles domain generalization, where the target domain is not accessible during training, and reaches state-of-the-art performance there with a 6.5 mIoU improvement. The design helps the network learn difficult classes better.
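The gains above are stated in mIoU (mean intersection over union). As a rough illustration of what that number measures, here is a minimal sketch in plain Python. The function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions of this sketch. Benchmark evaluations typically accumulate a confusion matrix over the whole dataset and report the result as a percentage, so "10.8 mIoU" refers to percentage points.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over two flat sequences of class labels.

    IoU for class c = |pred == c AND target == c| / |pred == c OR target == c|.
    Classes that appear in neither sequence are skipped (an assumption
    of this sketch) so they do not distort the mean.
    """
    if len(pred) != len(target):
        raise ValueError("label maps must have the same number of pixels")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Because every class contributes equally to the mean, rare classes weigh as much as common ones, which is one reason a strategy such as Rare Class Sampling can move this metric noticeably.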
PointTransformerV3
PointTransformerV3 offers an efficient approach to 3D point cloud segmentation, providing improved speed and accuracy in semantic segmentation tasks on benchmarks like nuScenes and ScanNet. The project is continually updated in Pointcept v1.5, supplying valuable resources such as model weights and experiment records. Selected for oral presentation at CVPR'24, it utilizes Flash Attention to enhance computational efficiency and support scalable multi-dataset 3D representation learning.
PyTorch-Encoding
PyTorch-Encoding collects neural network encoding methods for deep learning pioneered by Hang Zhang, including ResNeSt and its split-attention mechanism for semantic segmentation. The documentation includes a model zoo with models for image classification and segmentation. The project also incorporates contributions such as Deep TEN and Context Encoding, which makes it a useful resource for research on neural network encoding. Unit tests and detailed build guides help developers reproduce results on datasets such as ADE20K and Pascal Context.
pytorch-auto-drive
PytorchAutoDrive is a Python framework for self-driving perception that provides semantic segmentation and lane detection models built on PyTorch. Its configuration system and straightforward code cover the path from research to application, with support for a range of backbones, faster training, and strong model performance. The framework supports visualization, benchmarking, and deployment with ONNX and TensorRT. It is compatible with several major autonomous driving datasets, which makes it useful to both researchers and developers.
mmsegmentation
MMSegmentation is a semantic segmentation toolbox based on PyTorch with unified benchmarking and a modular design, supporting methods such as DeepLabV3 and PSPNet. The latest version, v1.x, adds flexibility and new features, including monocular depth estimation and open-vocabulary segmentation. Documentation and tutorials cover migration from earlier versions, and a model zoo and community resources are available.