#image segmentation
U-2-Net
U²-Net is a deep learning model for salient object detection that uses a nested U-structure to improve segmentation accuracy. The work received the 2020 Pattern Recognition Best Paper Award, and the model is used in mobile image processing and creative design tools. It is available on platforms such as PlayTorch, which lets it run on Android and iOS devices. It supports tasks such as background removal and portrait creation, making it a flexible tool for developers and artists who need precise object detection.
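Background removal with a salient object detector usually comes down to using the predicted saliency map as an alpha channel. The following is a minimal sketch of that last step only, not U²-Net's own code: no model is run, the saliency map is assumed to be a grid of values in [0, 1], and the function name `apply_saliency_as_alpha` and its `threshold` parameter are hypothetical.

```python
def apply_saliency_as_alpha(rgb, saliency, threshold=None):
    """Combine an RGB image with a saliency map to produce RGBA pixels.

    rgb: 2D list of (r, g, b) tuples with values in 0..255.
    saliency: 2D list of floats in [0, 1], same height and width as rgb.
    threshold: if given, binarize the map instead of using soft alpha.
    """
    rgba = []
    for rgb_row, sal_row in zip(rgb, saliency):
        out_row = []
        for (r, g, b), s in zip(rgb_row, sal_row):
            if threshold is not None:
                s = 1.0 if s >= threshold else 0.0
            # Clamp to [0, 1] before scaling to an 8-bit alpha value.
            alpha = round(max(0.0, min(1.0, s)) * 255)
            out_row.append((r, g, b, alpha))
        rgba.append(out_row)
    return rgba
```

Soft alpha keeps smooth edges around hair or fur, while a threshold gives a hard cut-out; which one suits a given application is a design choice.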
SLiMe
SLiMe is a one-shot image segmentation method built on Stable Diffusion and implemented in PyTorch. It covers training, testing, and validation, requires image and mask file names to match exactly, and is compatible with well-known datasets such as PASCAL-Part and CelebAMask-HQ. It can be run from Colab notebooks, and local setup involves creating a virtual environment and installing dependencies. SLiMe also supports configurable patch-based testing, and it ships with trained text embeddings and guides for optimizing performance on different image datasets.
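The name-matching requirement can be checked before a run starts. The following is a minimal sketch, not SLiMe's own data loader: it assumes an image and its mask are paired by file stem (for example `cat.png` with `cat.npy`), and the helper name `pair_images_and_masks` and the file extensions are hypothetical.

```python
from pathlib import PurePath


def pair_images_and_masks(image_names, mask_names):
    """Pair image and mask files that share the same stem.

    Returns (pairs, unmatched_images, unmatched_masks).
    """
    images = {PurePath(name).stem: name for name in image_names}
    masks = {PurePath(name).stem: name for name in mask_names}

    shared = sorted(images.keys() & masks.keys())
    pairs = [(images[stem], masks[stem]) for stem in shared]
    unmatched_images = sorted(images[s] for s in images.keys() - masks.keys())
    unmatched_masks = sorted(masks[s] for s in masks.keys() - images.keys())
    return pairs, unmatched_images, unmatched_masks
```

Reporting the unmatched files up front is cheaper than discovering a missing mask partway through training.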
ComfyUI-YoloWorld-EfficientSAM
This project is an unofficial implementation of the YOLO-World and EfficientSAM models for ComfyUI, aimed at practical object detection and segmentation. Version 2.0 adds mask separation and extraction and works with both images and videos. YOLO-World and EfficientSAM models can be loaded on either CUDA or CPU. Features include configurable confidence and IoU thresholds and customizable mask outputs. Contributions such as the Yoloworld ESAM Detector Provider extend the project, and it comes with straightforward installation and detailed example workflows for detection tasks.
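To illustrate what confidence and IoU thresholds typically control in a detector, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. It is an assumption that the thresholds are used this way in this project; the function names `box_iou` and `filter_detections` and the default values are hypothetical and are not the project's API.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_threshold=0.5, iou_threshold=0.5):
    """Drop low-confidence boxes, then apply greedy non-maximum suppression.

    detections: list of (box, score) tuples.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(box_iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept
```

Raising the confidence threshold removes weak detections; lowering the IoU threshold suppresses overlapping boxes more aggressively.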
EfficientSAM
EfficientSAM uses masked image pretraining to improve segmentation performance across diverse applications. It is integrated with popular tools such as Labelme, is available through a Hugging Face Space and in ONNX format, and supports instance segmentation from point, box, and saliency prompts. Released checkpoints and model examples make it straightforward to set up segmentation tasks. Recent additions include TorchScript support and grounded demos. Online demos and Jupyter notebooks show its capabilities in more detail.
terratorch
TerraTorch supports geospatial data modeling with pre-trained model backbones such as Prithvi, SatMAE, and ScaleMAE, covering tasks including segmentation, classification, and regression. It can be installed through pip or conda, and its configuration options make it simple to integrate for developers extending geospatial functionality.
GLEE
GLEE offers a robust solution for object detection and segmentation, trained on over ten million images from diverse datasets. It excels in zero-shot transferability and versatility for both images and videos. The model features an integrated system including an image encoder, text encoder, visual prompter, and object decoder, supporting tasks like multi-target tracking and video instance segmentation. Explore its interactive capabilities in open-world scenarios with demos on HuggingFace and YouTube.
PaddleSeg
PaddleSeg is an image segmentation toolkit built on PaddlePaddle that covers the workflow from model training to deployment. It uses low-code tools such as PaddleX, featuring 19 integrated segmentation models and simple Python APIs, and supports over 200 models across tasks such as object detection and OCR. It includes 45+ model algorithms and 140+ pre-trained models for high accuracy and performance. Recent updates enhance multi-label segmentation, introduce lightweight vision models, and improve training compression methods, making it suitable for applications in healthcare, industry, and remote sensing.
EdgeSAM
EdgeSAM accelerates the Segment Anything Model (SAM) for edge devices, achieving a 40-fold speedup with minimal performance trade-offs. It outperforms models such as MobileSAM, with better mIoU on the COCO and LVIS datasets, and can run at more than 30 FPS on an iPhone 14. The model is trained with a distillation process that involves the prompt encoder and mask decoder, improving the interaction between user input and mask generation. Training and evaluation code is available, and EdgeSAM can be deployed through ONNX and CoreML exports for a range of applications.
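For readers unfamiliar with the mIoU metric mentioned above, here is a minimal sketch of how a mean IoU over mask pairs can be computed. It is not EdgeSAM's evaluation code: masks are assumed to be equally sized grids of 0/1 values, and the function names `mask_iou` and `mean_iou` are hypothetical.

```python
def mask_iou(pred, target):
    """IoU of two binary masks given as equally sized 2D lists of 0/1."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treating that as IoU 1.0 is a convention.
    return inter / union if union else 1.0


def mean_iou(pred_masks, target_masks):
    """Average IoU over paired predicted and ground-truth masks."""
    if len(pred_masks) != len(target_masks):
        raise ValueError("need one target mask per prediction")
    if not pred_masks:
        raise ValueError("no masks to evaluate")
    scores = [mask_iou(p, t) for p, t in zip(pred_masks, target_masks)]
    return sum(scores) / len(scores)
```

Benchmarks differ in how they average (per instance, per class, or per image), so published mIoU figures are only comparable when computed under the same protocol.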