GLEE
GLEE is a general-purpose model for object detection and segmentation in both images and videos, trained on over ten million images drawn from diverse datasets. It transfers zero-shot to new data and tasks without task-specific fine-tuning. The architecture combines an image encoder, a text encoder, a visual prompter, and an object decoder, and supports tasks such as multi-target tracking and video instance segmentation. Interactive demos of its open-world capabilities are available on Hugging Face and YouTube.
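To give a rough idea of how a text encoder and an object decoder can work together in open-world settings, here is a minimal sketch, not GLEE's actual code or API. It assumes that each detected object and each category name has already been turned into an embedding vector, and it labels every object with the category whose text embedding is most similar. The function names (`cosine`, `label_objects`) and the three-dimensional toy vectors are hypothetical stand-ins chosen for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def label_objects(object_embeddings, text_embeddings):
    """Give each object the category name with the most similar text embedding.

    object_embeddings: list of vectors (hypothetical object-decoder outputs).
    text_embeddings: dict of category name -> vector (hypothetical text-encoder outputs).
    """
    labels = []
    for obj in object_embeddings:
        best = max(text_embeddings, key=lambda name: cosine(obj, text_embeddings[name]))
        labels.append(best)
    return labels


# Toy vectors standing in for real encoder outputs (assumed, not from GLEE).
text_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}
object_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]

labels = label_objects(object_embeddings, text_embeddings)
print(labels)
```

Because the category list is just a dictionary of text embeddings, adding a new category in this sketch means adding one more entry rather than retraining a fixed classifier, which is the general idea behind zero-shot labeling.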