SEED-Bench

Thorough Assessment of Multimodal Large Language Models Across Diverse Dimensions

Product Description

SEED-Bench offers a structured evaluation framework for multimodal large language models, with 28K expertly annotated multiple-choice questions spanning 34 dimensions. It covers both text and image generation evaluation and includes later iterations, SEED-Bench-2 and SEED-Bench-2-Plus. Designed to assess model comprehension in complex, text-rich scenarios, SEED-Bench is a valuable resource for researchers and developers looking to compare and improve model performance. Explore the datasets and engage with the leaderboard.
Project Details