#object detection
YOLOX
YOLOX provides an efficient anchor-free object detection model, optimizing both accuracy and speed. Bridging research and practice, it supports PyTorch and MegEngine, includes JIT compile operations, and promises enhancements like YOLOX-P6. Explore the GitHub repository for demos and further insights.
darkflow
Darkflow utilizes the YOLO framework to deliver efficient real-time object detection and classification. Compatible with Python 3, TensorFlow, and OpenCV, it provides GPU and CPU support. Features include easy installation, flexible model configurations, and JSON output, making it ideal for scalable object detection across applications.
frigate
Frigate is a local NVR for Home Assistant that enhances IP camera surveillance using AI object detection. It leverages TensorFlow and OpenCV for efficient real-time object detection. Features such as low-overhead motion detection, multiprocessing for maximum FPS, and video recording ensure resource efficiency. Users benefit from seamless integration through MQTT communication, 24/7 recording, and convenient re-streaming using RTSP. Frigate supports low-latency live views with WebRTC and MSE, simplifying security operations through an advanced interface.
DAMO-YOLO
The DAMO-YOLO project, developed by the TinyML Team from Alibaba's DAMO Lab, enhances object detection beyond the YOLO series through innovative technologies like NAS backbones and efficient RepGFPN. It delivers powerful models and high-efficiency training tools, applicable to diverse scenarios including human and safety equipment detection. Updated models exhibit significant gains in real-time performance on both CPUs and GPUs. This project is resourceful for applications requiring precise and rapid object detection capabilities.
viseron
Viseron offers localized NVR and AI computer vision technologies with features such as object, motion, and face detection. Tailored for privacy-sensitive applications, it ensures secure monitoring of any space without external server dependency. Setup is simplified via Docker and a web interface, supported by comprehensive documentation and a component explorer. The project is open for community contributions in both development and enhancement.
mmyolo
MMYOLO is an open-source toolbox in the OpenMMLab ecosystem designed for implementing YOLO series algorithms. It supports features like YOLOv5 instance segmentation and the real-time object detector RTMDet. The platform is optimized for speed and accuracy in tasks such as object detection, rotated object detection, and instance segmentation. Developed on PyTorch and MMDetection, it boasts extensive documentation and modular design. Training is significantly accelerated, achieving speeds 2.6 times faster than earlier versions. This toolkit is ideal for developers focusing on high-performance computer vision solutions.
ares
ARES 2.0 is a Python library designed for evaluating adversarial robustness in image classification and object detection models. Utilizing PyTorch, it supports diverse attacks, offers robust training strategies, and includes trained checkpoints. The library facilitates distributed training and testing, making it suitable for adversarial machine learning research. Comprehensive setup instructions and detailed documentation support ease of use and implementation.
sports
This repository collects sports-analytics tools built on object detection, image segmentation, and keypoint detection. Contributions are encouraged on open challenges such as ball tracking and player re-identification. The code runs in Python environments and is open source, with the aim of advancing the analysis of player dynamics. Sports datasets are available on Roboflow Universe for collaborative work on sports analytics.
yolov10
YOLOv10 advances real-time object detection by improving architectural efficiency and eliminating NMS, offering a balanced design for speed and accuracy. This PyTorch implementation notably outperforms previous models, enhancing performance while reducing computational demands, ideal for applications demanding swift, efficient detection.
FCOS
FCOS streamlines object detection by removing anchor boxes, improving both speed and accuracy over earlier detectors such as Faster R-CNN. With ResNeXt backbones and deformable convolutions, FCOS reaches up to 49% AP on COCO with multi-scale testing. Its efficient design allows it to run on less powerful hardware, and it integrates with Detectron2 and mmdetection for broader use.
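Anchor-free here means that each feature-map location predicts its distances to the four sides of a box instead of offsets from preset anchors. The sketch below shows the per-location regression targets and the centerness score as formulated in the FCOS paper. The function name `fcos_targets` and the strict "inside the box" rule are illustrative assumptions, not code from the repository.

```python
import math


def fcos_targets(point, box):
    """Return ((l, t, r, b), centerness) for a location inside a box.

    point: (x, y) location in image coordinates.
    box:   (x0, y0, x1, y1) ground-truth box.
    Returns None when the point is not strictly inside the box
    (a simplification chosen for this sketch).
    """
    x, y = point
    x0, y0, x1, y1 = box
    # Distances from the location to the left, top, right, bottom sides.
    l, t, r, b = x - x0, y - y0, x1 - x, y1 - y
    if min(l, t, r, b) <= 0:
        return None
    # Centerness is 1.0 at the box center and decays toward the edges.
    centerness = math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
    return (l, t, r, b), centerness


print(fcos_targets((5, 5), (0, 0, 10, 10)))
print(fcos_targets((2, 5), (0, 0, 10, 10)))
```

Locations near a box edge get a low centerness, which is what lets low-quality off-center predictions be down-weighted at inference time.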
sahi
SAHI provides an efficient vision library for large-scale object detection and segmentation. It tackles the challenges of small object detection and large image inference, enhancing accuracy in practical applications. Supporting frameworks like YOLOv5, MMDetection, and Detectron2, it seamlessly integrates into current workflows. Features include sliced and standard prediction, COCO dataset functionalities, error analysis, and interactive visualization. Widely adopted with over 200 citations and competitive successes, SAHI advances computer vision technology.
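The idea behind sliced prediction can be sketched without any library: cut the image into overlapping tiles, run the detector on each tile, then shift each tile's boxes back into full-image coordinates. The helpers below are a minimal illustration under assumed names (`slice_windows`, `to_full_image`) and an assumed 20% overlap; they are not SAHI's API, and the detector call and the merging of duplicate boxes across tiles are left out.

```python
def slice_windows(img_w, img_h, slice_w, slice_h, overlap=0.2):
    """Return (x0, y0, x1, y1) tiles that cover the image with overlap."""
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))
    windows = []
    y = 0
    while True:
        # Clamp the last row to the image border instead of padding.
        y1 = min(y + slice_h, img_h)
        y0 = max(0, y1 - slice_h)
        x = 0
        while True:
            x1 = min(x + slice_w, img_w)
            x0 = max(0, x1 - slice_w)
            windows.append((x0, y0, x1, y1))
            if x1 >= img_w:
                break
            x += step_x
        if y1 >= img_h:
            break
        y += step_y
    return windows


def to_full_image(box, window):
    """Shift a tile-local (x0, y0, x1, y1) box into full-image coordinates."""
    wx0, wy0, _, _ = window
    bx0, by0, bx1, by1 = box
    return (bx0 + wx0, by0 + wy0, bx1 + wx0, by1 + wy0)


tiles = slice_windows(1000, 600, 512, 512)
print(len(tiles), tiles[0], tiles[-1])
```

Because small objects occupy more pixels relative to each tile than to the full image, a detector run per tile has a better chance of finding them.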
tensorflow-yolov3
This project delivers an implementation of YOLOv3 using TensorFlow 2.0, ensuring compatibility and improvements over older versions. It facilitates rapid deployment with pre-trained models, supports training with custom datasets, and offers starting options with COCO weights. The well-documented scripts and guides make it accessible for both hobbyists and professionals interested in exploring YOLOv3's potential in TensorFlow.
nanodet
NanoDet-Plus is a lightweight object detection model known for its speed and accuracy, especially on mobile platforms. It supports backends such as ncnn, MNN, and OpenVINO and offers up to 34.3 mAP and 97 FPS on mobile ARM CPUs. With its Assign Guidance Module and Dynamic Soft Label Assigner, the model significantly improves accuracy without extensively using GPU resources. These attributes make it a suitable choice for real-time object detection applications.
ComfyUI-YoloWorld-EfficientSAM
This project presents an unofficial implementation of YOLO-World and EfficientSAM models tailored for ComfyUI, emphasizing practical object detection and segmentation. Version 2.0 enhances functionality with mask separation and extraction, compatible with images and videos. The project facilitates model loading for YOLO-World and EfficientSAM on both CUDA and CPU platforms. It offers features like confidence and IoU threshold setting, and customizable mask outputs. Contributions such as the Yoloworld ESAM Detector Provider add value, with user-friendly installation and thorough workflows, supporting detailed detection tasks.
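To show what the confidence and IoU thresholds typically control, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. It is a generic, class-agnostic illustration and not this project's implementation; the function names and the default values 0.25 and 0.5 are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thr=0.25, iou_thr=0.5):
    """Keep confident detections and suppress overlapping duplicates.

    dets: list of (box, score) pairs.
    """
    # Drop low-confidence detections, then visit the rest best-first.
    candidates = sorted(
        (d for d in dets if d[1] >= conf_thr), key=lambda d: d[1], reverse=True
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, k[0]) <= iou_thr for k in kept):
            kept.append((box, score))
    return kept


dets = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # overlaps the first box heavily
    ((20, 20, 30, 30), 0.7),
    ((0, 0, 5, 5), 0.1),      # below the confidence threshold
]
print(filter_detections(dets))
```

Raising the confidence threshold removes weak detections; lowering the IoU threshold suppresses overlapping boxes more aggressively.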
mmdetection
Explore a versatile PyTorch-based toolbox for detection tasks, with an emphasis on GPU-optimized efficiency and strong benchmark results. MMDetection, part of OpenMMLab, covers object detection, instance segmentation, and panoptic segmentation, and is designed for customization and integration with other open-source tools. It includes recent detectors such as RTMDet for real-time use, which balances parameter count against accuracy across model sizes. It builds on MMEngine for model training and MMCV for computer vision operations, making it a useful resource for both advanced research and practical applications.
CoDet
Explore co-occurrence guided region-word alignment in open-vocabulary object detection using large-scale image-text data. CoDet achieves leading performance on datasets such as LVIS and COCO and integrates with tools like Roboflow for efficient automated labeling. Access details on installation, model setup, and training to enhance detection accuracy and efficiency.
EasyCV
EasyCV is a PyTorch-based computer vision toolkit specializing in self-supervised learning and transformer architectures. It covers various tasks such as image classification, object detection, and pose estimation. The toolkit features state-of-the-art self-supervised learning algorithms, Vision Transformers, and supports extensive functionality and scalability. EasyCV facilitates multi-GPU and multi-worker training, with accelerated data processing using DALI and training optimizations through TorchAccelerator and fp16. Recent updates introduce models like YOLOX-PAI and STDC, enhancing its capabilities in segmentation and video recognition. It integrates with PAI-EAS for seamless online deployment and monitoring.
GLEE
GLEE offers a robust solution for object detection and segmentation, trained on over ten million images from diverse datasets. It excels in zero-shot transferability and versatility for both images and videos. The model features an integrated system including an image encoder, text encoder, visual prompter, and object decoder, supporting tasks like multi-target tracking and video instance segmentation. Explore its interactive capabilities in open-world scenarios with demos on HuggingFace and YouTube.
yolort
This project combines training and inference for object detection using a dynamic shape strategy, based on the YOLOv5 model framework. It incorporates pre-processing and post-processing directly into the model graph, thereby facilitating deployment on platforms such as LibTorch, ONNX Runtime, TVM, and TensorRT. The design takes cues from Ultralytics's YOLOv5, ensuring familiarity for those used to torchvision's models. Recent enhancements include TensorRT C++ interface integration and expanded ONNX Runtime support. The project offers simple installation via PyPI or source with minimal dependencies, enhancing the efficiency of both Python and C++ deployment.
edgeyolo
EdgeYOLO advances object detection on edge devices such as the NVIDIA Jetson AGX Xavier, achieving 34 FPS and 50.6% AP on the COCO2017 dataset. The project improves detection of smaller objects with innovative loss functions and data augmentation. Updates include conversion support for ONNX to OM for Huawei Ascend, Docker-based environments for model training, and deployment across various edge platforms. The project also incorporates TensorRT integration and cross-platform demo capabilities, with future enhancements planned for segmentation tasks and model variations. Refer to the arXiv publication for comprehensive insights.
deep_learning_object_detection
Explore a large collection of papers on deep learning methods for object detection. This continually updated resource covers research from major conferences such as CVPR, ICCV, and NeurIPS and traces how deep learning object detection has developed. It includes detailed performance tables and highlights key papers for researchers and enthusiasts, from milestones such as R-CNN, Faster R-CNN, and YOLO through to more recent work.
yolov3
YOLOv3, an open-source AI by Ultralytics, excels in object detection, segmentation, and classification. It focuses on speed, accuracy, and ease of use, integrating strategies from extensive R&D. New users can explore detailed guides, join a strong community, and leverage enhanced AI platform integrations, benefiting diverse global developers.
yolov5
Discover cutting-edge vision AI techniques driving visual intelligence forward. Leveraging extensive research and refined practices, this project excels in object detection, image segmentation, and classification. Access detailed resources and guides while a vibrant user community aids in optimizing AI potential across various fields. Connect via GitHub for issues or join community discussions on Discord to utilize top-tier AI tools effectively.
detrex
detrex is an open-source toolbox offering cutting-edge Transformer-based detection algorithms. It is built on Detectron2 and features a modular design for custom models and robust baselines. The project is user-friendly and lightweight, incorporating a LazyConfig System and training engine. detrex supports models like Focus-DETR and SQR-DETR and uses PyTorch 1.10+ for integration. Regular updates and comprehensive tutorials enhance usability. Explore detrex's project page for detailed features, documentation, and training techniques.
Exclusively-Dark-Image-Dataset
The ExDark dataset includes 7,363 images captured in various low-light conditions, providing a valuable resource for research in object detection and image enhancement. It features 12 annotated object classes similar to those in PASCAL VOC, which makes it well suited to low-light image analysis. The open-source project, backed by CVIU publications, offers code for image enhancement and is released under a BSD 3-Clause license.
DINO
DINO (DETR with Improved DeNoising Anchor Boxes) enhances Detection Transformers for stronger object detection. It performs well in both universal and open-set detection and segmentation tasks, with strong results on COCO benchmarks from a compact model. Using ResNet and Swin Transformer backbones, DINO offers quick convergence and precision. Variants such as Mask DINO and Stable-DINO offer straightforward training and adaptability across diverse detection scenarios. The model zoo provides access to the latest checkpoints, supporting extensive multi-scale training and inference.
TF-ID
Discover TF-ID's suite of models designed to identify tables and figures in academic papers, featuring MIT-licensed resources. Choose between models optimized for text captions or streamlined extraction, leveraging Microsoft's Florence-2 for seamless integration. Achieve accurate results with up to 98% success rate, supported by comprehensive training and implementation guides for scholarly contexts.
inference
The Inference platform facilitates deploying computer vision models, providing tools for object detection, classification, and segmentation. It includes support for foundational models such as CLIP, Segment Anything, and YOLO-World. Available as a Python-native package, a self-hosted server, or through an API, it is compatible with Python 3.8 to 3.11 and supports CUDA for GPU acceleration. With minimal dependencies and model-specific installations, it enhances performance. The Inference SDK supports local model execution with minimal code, handling various image input formats. Explore advanced features, an inference server via Docker, and comprehensive documentation for optimal utility.
Android-TensorFlow-Lite-Example
The Android TensorFlow Lite Example illustrates how to integrate TensorFlow Lite into Android apps, utilizing the device camera for object detection tasks. This project serves as an essential guide for developers seeking to implement machine learning capabilities effortlessly on mobile platforms. By embedding TensorFlow Lite models, developers can enhance their applications with advanced AI features, ensuring efficient performance on mobile devices. This example is an invaluable tool for both novice and experienced developers interested in AI-driven solutions in Android development.
yolor
YOLOR (You Only Learn One Representation) is a unified network for real-time object detection across different tasks. Its models, including YOLOR-CSP, YOLOR-CSP-X, and YOLOR-P6, show significant improvements in Average Precision on COCO. The combination of processing speed and accuracy makes it a valuable tool for researchers and developers.
boxmot
BoxMOT offers state-of-the-art, flexible multi-object trackers for segmentation, detection, and pose estimation, adaptable to various hardware from CPUs to GPUs. Compatible with YOLO models, it integrates advanced ReID systems and reduces experimentation overhead with efficient data handling.
SSD-Tensorflow
This open-source project re-implements Single Shot MultiBox Detector in TensorFlow with a focus on modular VGG-based SSD networks. It supports popular datasets like Pascal VOC, offers tools for easy training and evaluation, and provides options for extending to networks like ResNet. Ideal for computer vision research.
ultralytics
Explore up-to-date tools for object detection and image processing with YOLO11, a cutting-edge model known for its rapid speed, accuracy, and versatility. This model supports a variety of tasks such as object detection, tracking, instance segmentation, image classification, and pose estimation. Building on the success of prior YOLO models, it delivers enhanced performance and intuitive features. Access detailed documentation, community forums, and installation options for seamless project integration. Engage with the developer community for further insights and practical applications of YOLO11.
autodistill
Autodistill speeds up AI model development by taking unlabeled images to an inference-ready model suitable for edge deployment without manual labeling. It uses large foundation models to label datasets automatically and targets vision tasks such as object detection and instance segmentation. Autodistill provides a modular interface for model integration and deployment, making it a practical tool for developers focused on efficiency and performance.
Rectlabel-support
Discover a versatile tool designed for precise image annotation, featuring polygon and pixel labeling with Segment Anything models and automated Core ML-based labeling. The tool also includes advanced text recognition capabilities, customizable settings for quick labeling, and a gallery view for easy image management. Export projects in multiple formats like COCO, Labelme, and YOLO, making this solution ideal for enhancing image labeling efficiency.
deepstream_python_apps
Discover NVIDIA DeepStream SDK 7.1 Python applications for building advanced object detection and analytics pipelines on Ubuntu 22.04. Utilize Gst Python for accessing metadata and explore varied applications from RTSP streaming to cloud analytics, complete with detailed guides.
JSON2YOLO
Explore a toolkit designed to streamline the conversion of COCO datasets to the YOLO format for real-time object detection workflows. Suitable for users of the Darknet framework, it supports Linux, macOS, and Windows platforms. Start with a Python environment and the required dependencies, and engage with the community on GitHub. Available licensing options cater to both academic and commercial needs.
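The core of a COCO-to-YOLO conversion is a change of box convention: COCO stores `[x_min, y_min, width, height]` in pixels, while YOLO label files store a class index followed by `x_center y_center width height` normalized to the image size. The sketch below illustrates that arithmetic only. The function names are hypothetical and this is not the toolkit's own code; remapping COCO category IDs to contiguous class indices is assumed to happen elsewhere.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO pixel box [x_min, y_min, w, h] to normalized
    YOLO form [x_center, y_center, w, h]."""
    x_min, y_min, w, h = bbox
    return [
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    ]


def yolo_label_line(class_index, bbox, img_w, img_h):
    """Format one object as a line of a YOLO label text file."""
    xc, yc, w, h = coco_to_yolo(bbox, img_w, img_h)
    return f"{class_index} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


print(yolo_label_line(0, [100, 50, 200, 100], 640, 480))
# prints: 0 0.312500 0.208333 0.312500 0.208333
```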
YOLOMagic
YOLO Magic extends the YOLOv5 framework with advanced network modules and a user-friendly web interface for enhanced visual task performance. It includes spatial pyramid modules, feature fusion structures, and new backbone networks to improve efficiency. Suitable for both beginners and experts, it streamlines image inference and model processes. The active community offers extensive resources for customization and learning. Explore YOLO Magic for top-tier object detection and analysis.
MaskDINO
This open-source project presents a unified transformer-based architecture that enhances object detection and segmentation. It supports panoptic, instance, and semantic segmentation, and demonstrates data cooperation across tasks. The project is compatible with COCO, ADE20K, and Cityscapes datasets, offering a range of pre-trained models and tools for various AI applications.
YOLOv8-TensorRT-CPP
This C++ implementation of YOLOv8 via TensorRT supports object detection, semantic segmentation, and body pose estimation. Optimized for GPU inference, the project uses the TensorRT C++ API and works with ONNX models converted from PyTorch. The project runs on Ubuntu and requires CUDA, cuDNN, and OpenCV built with CUDA support. Users will find comprehensive setup instructions, model conversion guidance, and INT8 inference optimization tips. This project is well suited to developing high-performance vision applications on NVIDIA GPUs.