VLM_survey

In-depth Analysis of Vision-Language Models in Visual Recognition Tasks

Product Description

Examine the evolution of Vision-Language Models (VLMs) across visual recognition tasks such as image classification and object detection. This survey reviews the models' architectural frameworks, pre-training techniques, and datasets, and highlights their role in zero-shot learning. It groups VLM methods into three categories (pre-training, transfer, and distillation), analyzes each, and outlines directions for future research. Featured in TPAMI's Top 50, this publication is a key reference for understanding vision-language AI.
Project Details