Segment and Track Anything (SAM-Track)
Introduction
Segment and Track Anything, or SAM-Track, is an open-source project for object segmentation and tracking in videos. It combines automatic methods with interactive user interfaces to identify and track objects efficiently. The core technologies behind SAM-Track are the Segment Anything Model (SAM), which handles automatic key-frame segmentation, and Decoupling Features in Associating Objects with Transformers (DeAOT), which performs efficient multi-object tracking. A rough sketch of how these two pieces fit together follows.
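To make the division of labor concrete, here is a minimal, self-contained sketch of such a pipeline. The `KeyframeSegmenter` and `MaskTracker` classes are hypothetical stand-ins for SAM and DeAOT, not SAM-Track's actual API; the stubs only illustrate the control flow.

```python
# Illustrative sketch of the pipeline described above: a segmenter produces
# masks on key frames, and a tracker propagates them in between.
# KeyframeSegmenter and MaskTracker are hypothetical stand-ins for SAM and
# DeAOT, not SAM-Track's actual classes.
from typing import List, Optional

import numpy as np


class KeyframeSegmenter:
    """Stand-in for SAM: segments all objects in a single frame."""

    def segment(self, frame: np.ndarray) -> np.ndarray:
        # A real implementation would run SAM's automatic mask generator;
        # this stub returns an empty label map of the same spatial size.
        return np.zeros(frame.shape[:2], dtype=np.int32)


class MaskTracker:
    """Stand-in for DeAOT: propagates object masks from frame to frame."""

    def __init__(self) -> None:
        self.reference_mask: Optional[np.ndarray] = None

    def add_reference(self, frame: np.ndarray, mask: np.ndarray) -> None:
        # A real tracker would store frame features in a memory bank.
        self.reference_mask = mask

    def track(self, frame: np.ndarray) -> np.ndarray:
        # A real tracker matches current-frame features against its memory;
        # this stub simply carries the last reference mask forward.
        assert self.reference_mask is not None, "no reference mask yet"
        return self.reference_mask


def run_pipeline(frames: List[np.ndarray], keyframe_gap: int = 5) -> List[np.ndarray]:
    """Segment on key frames, track on the frames in between."""
    segmenter, tracker = KeyframeSegmenter(), MaskTracker()
    masks = []
    for i, frame in enumerate(frames):
        if i % keyframe_gap == 0:
            mask = segmenter.segment(frame)  # re-detect objects on key frames
            tracker.add_reference(frame, mask)
        else:
            mask = tracker.track(frame)      # propagate between key frames
        masks.append(mask)
    return masks


if __name__ == "__main__":
    video = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(12)]
    print(len(run_pipeline(video)))  # prints 12: one mask per frame
```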
Features
SAM-Track offers several exciting features:
- Audio-Grounding: SAM-Track can track the objects in a video that are producing sound, adding a layer of interaction between the video's audio and visual components.
- Demo Integration: New demonstrations have been added, which incorporate tools like Grounding-DINO for identifying new objects on key frames in various applications, such as smart cities and autonomous vehicle environments.
- Interactive Enhancements: The project includes several versions of an Interactive Web User Interface (WebUI), allowing users to add objects interactively using text prompts or image sequences.
- Memory Management: New parameters, `long_term_memory_gap` and `max_len_long_term`, allow fine-tuning of memory use, balancing tracking quality against hardware limits on long video footage (see the configuration sketch after this list).
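As an illustration of how such knobs are typically set, here is a hypothetical configuration dictionary. The layout and the example values are assumptions for the sketch, not SAM-Track's actual config file; only the two parameter names come from the project.

```python
# Illustrative only: memory behaviour of this kind is usually tuned through
# a small argument dictionary. Values below are placeholders, not defaults.
tracker_args = {
    "long_term_memory_gap": 250,  # frames between writes to long-term memory;
                                  # a larger gap trades recall for lower memory use
    "max_len_long_term": 512,     # cap on stored long-term memory entries;
                                  # bounds GPU memory growth on long videos
}
```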
Demonstrations
SAM-Track showcases its capabilities through various demonstrations:
- Versatile Use Cases: The project has been tested in diverse settings, including street views, animations, and aerial shots.
- Interactive Features: Users can modify object masks, add new objects interactively, and use text prompts for object specification.
- Track Many Objects: The platform supports simultaneous tracking of numerous objects and dynamically detects new entities as they appear in a video (one common approach is sketched after this list).
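A standard way to detect new entities on a key frame is to compare the freshly segmented masks against everything already being tracked and admit only the masks with low overlap. The sketch below illustrates that idea; the `iou` helper, the `find_new_objects` function, and the `0.1` threshold are illustrative assumptions, not SAM-Track's exact code.

```python
# Sketch of key-frame new-object detection: a freshly segmented mask is
# admitted as a new object only if it barely overlaps anything already
# tracked. The IoU helper and the 0.1 threshold are illustrative choices.
from typing import List

import numpy as np


def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union else 0.0


def find_new_objects(
    keyframe_masks: List[np.ndarray],
    tracked_masks: List[np.ndarray],
    max_overlap: float = 0.1,
) -> List[np.ndarray]:
    """Return key-frame masks that do not match any tracked object."""
    new_objects = []
    for mask in keyframe_masks:
        if all(iou(mask, t) <= max_overlap for t in tracked_masks):
            new_objects.append(mask)
    return new_objects
```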
Getting Started
To begin using SAM-Track, set up a computational environment with Python 3.9 and the supporting libraries. Installation scripts simplify setup, and a straightforward demo notebook guides users through an initial trial.
For those wanting a user-friendly interface, the project includes a WebUI app. It allows easy upload of video files and configuration of parameters before starting a tracking run; a stripped-down sketch of that kind of interface follows.
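As a rough illustration, a minimal app of this kind could look like the following. This is a sketch under the assumption of a Gradio-based interface; `process_video` is a placeholder callback, not the project's actual entry point.

```python
# Stripped-down sketch of a video-tracking web UI: upload a video, tune a
# parameter, run tracking. Assumes Gradio; process_video is a placeholder.
import gradio as gr


def process_video(video_path: str, keyframe_gap: int) -> str:
    # A real callback would run the segment-and-track pipeline and return
    # the path of the rendered result; this stub echoes the input video.
    return video_path


with gr.Blocks() as demo:
    video_in = gr.Video(label="Input video")
    gap = gr.Slider(minimum=1, maximum=100, value=5, step=1,
                    label="Key-frame gap (frames between re-segmentation)")
    run = gr.Button("Start tracking")
    video_out = gr.Video(label="Tracked result")
    run.click(fn=process_video, inputs=[video_in, gap], outputs=video_out)

if __name__ == "__main__":
    demo.launch()
```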
About the Team
SAM-Track is spearheaded by the ReLER Lab at Zhejiang University's College of Computer Science and Technology. The team, led by Professor Yi Yang, comprises several contributors specializing in cutting-edge computer vision technologies.
Credits and Licensing
The project builds on work from other open-source initiatives such as DeAOT/AOT, SAM, and Grounding-DINO. Each borrowed codebase retains its respective license, and SAM-Track itself is licensed under AGPL-3.0, which keeps the project open for development while requiring that derivative works, including those offered as network services, remain open source.
Conclusion
SAM-Track has established itself as a versatile and robust tool for video object segmentation and tracking, and it continues to evolve with new features and user-centric enhancements. It stands as a testament to collaborative effort in the AI and machine learning community, inviting users worldwide to contribute, critique, and extend its capabilities. Whether in academic research or practical applications, SAM-Track advances the potential of visual computing tasks.