T-Rex Project Overview
Introduction
T-Rex2 is an object detection model for computer vision. Traditional detectors are limited to the categories they were trained on, and training them is labor-intensive, requiring vast datasets and expert tuning. Moreover, introducing a new category means starting the training process from scratch. T-Rex2 overcomes these constraints by combining text and visual prompts within a single model, which gives it strong zero-shot capabilities. This makes T-Rex2 well suited to identifying and locating objects in dynamic environments.
What Can T-Rex Do
T-Rex2 is designed for versatile applications across various industries including agriculture, medicine, transportation, and retail. It supports three key workflows: interactive visual prompts, generic visual prompts, and text prompts. These workflows enable T-Rex2 to efficiently cover a wide range of object detection requirements in different scenarios. Its applications extend to areas like OCR in retail, monitoring wildlife, and beyond.
Try Demo
Interested users can try out T-Rex2 through an online demo.
API Usage Examples
T-Rex2 offers free API access to developers and researchers for educational and research purposes. An API token is available on request. The following APIs are available:
- Interactive Visual Prompt API: Users specify the objects to be detected by drawing visual prompts (boxes or points) on the image.
- Generic Visual Prompt API: Visual prompts from one reference image are applied to detect objects in another image.
- Customize Visual Prompt Embedding API: Users can create custom visual embeddings for unique object categories using multiple images.
- Embedding Inference API: Uses the visual prompt embeddings created in the previous step to detect objects in other images.
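The difference between the interactive and generic workflows is easiest to see in the shape of a request. The following is a minimal sketch of how such requests might be assembled. The class name, function name, field names, and JSON layout are all hypothetical and are not taken from the T-Rex2 API; the actual schema should be checked against the API documentation.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple

# NOTE: every name and field here is a hypothetical illustration,
# not the actual T-Rex2 API schema.


@dataclass
class VisualPrompt:
    """One visual prompt: a box (x1, y1, x2, y2) or a point (x, y)."""
    kind: str
    coords: Tuple[float, ...]

    def __post_init__(self) -> None:
        expected_len = {"box": 4, "point": 2}
        if self.kind not in expected_len:
            raise ValueError(f"unknown prompt kind: {self.kind!r}")
        if len(self.coords) != expected_len[self.kind]:
            raise ValueError(
                f"{self.kind} needs {expected_len[self.kind]} coordinates"
            )
        if self.kind == "box":
            x1, y1, x2, y2 = self.coords
            if x2 <= x1 or y2 <= y1:
                raise ValueError("box must satisfy x1 < x2 and y1 < y2")


def build_request(prompt_image: str, target_image: str,
                  prompts: List[VisualPrompt]) -> str:
    """Serialize a detection request (hypothetical layout) as JSON.

    Interactive workflow: prompts are drawn on the image being searched.
    Generic workflow: prompts come from a separate reference image.
    """
    if not prompts:
        raise ValueError("at least one visual prompt is required")
    workflow = "interactive" if prompt_image == target_image else "generic"
    payload = {
        "workflow": workflow,
        "prompt_image": prompt_image,
        "target_image": target_image,
        "prompts": [
            {"type": p.kind, "coords": list(p.coords)} for p in prompts
        ],
    }
    return json.dumps(payload)
```

For example, `build_request("shelf.jpg", "shelf.jpg", [VisualPrompt("point", (40, 60))])` would describe an interactive request, while passing two different image names would describe a generic one.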
Local Gradio Demo with API
A local demo built with Gradio, a Python library for building web interfaces to models, is available for testing T-Rex2 through the API.
Setup and Operations:
- Setup: Install Gradio and its dependencies.
- Run Gradio Demo: Launch the demo with a command-line script, passing your API token.
- Basic Operations: Draw boxes and points to specify the objects to detect, and use the visual prompt workflows.
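A launch script of this kind has to obtain the API token before it starts the interface. The sketch below shows one way a script might do that, using only the standard library. The flag name `--api-token`, the environment variable `TREX2_API_TOKEN`, and the function `resolve_token` are assumptions made for illustration, not the options of the actual demo script.

```python
import argparse
import os
from typing import List, Mapping, Optional

# NOTE: the flag and environment variable names are hypothetical;
# check the demo script itself for the options it really accepts.


def resolve_token(argv: Optional[List[str]] = None,
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API token from the command line or the environment.

    The command-line flag takes precedence over the environment variable.
    Exits with an error message if neither source provides a token.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(
        description="Launch a local demo (sketch)"
    )
    parser.add_argument(
        "--api-token",
        default=None,
        help="API token; falls back to the TREX2_API_TOKEN variable",
    )
    args = parser.parse_args(argv)
    token = args.api_token or env.get("TREX2_API_TOKEN")
    if not token:
        raise SystemExit(
            "No API token given: pass --api-token or set TREX2_API_TOKEN"
        )
    return token
```

Accepting the token from the environment as well as from a flag keeps it out of shell history, which is a common choice for credentials.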
Conclusion
T-Rex2 combines text and visual prompts in a single object detection model, which makes it adaptable across a wide range of applications. From agriculture to industry, it paves the way for more intelligent data handling and object recognition in varied environments.