
evalscope

Holistic Evaluation and Benchmarking Framework for AI Models

Product Description

EvalScope is a comprehensive framework for evaluating and benchmarking diverse AI models, including large language models and multimodal variants. It provides end-to-end evaluation capabilities, supports custom datasets through user-friendly interfaces, and integrates with the ms-swift framework. Evaluation backends such as OpenCompass and VLMEvalKit are available for in-depth analysis, along with performance stress testing, enabling precise model assessments with detailed reports and visualization support.
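
As an illustration of the end-to-end evaluation flow described above, the sketch below shows what a minimal run might look like. The `TaskConfig`/`run_task` entry points, the model identifier, and the dataset names are assumptions based on typical EvalScope usage and may differ from the current API; consult the project documentation for exact signatures.

```python
# Minimal EvalScope-style evaluation sketch (entry points and names are
# assumptions, not a verified snapshot of the current API).
from evalscope.run import run_task
from evalscope.config import TaskConfig

# Configure a small benchmark run: which model to evaluate, on which datasets,
# and how many samples per dataset (kept tiny here for a quick smoke test).
task_cfg = TaskConfig(
    model='Qwen/Qwen2.5-0.5B-Instruct',   # model ID or local checkpoint path
    datasets=['gsm8k', 'arc'],            # built-in benchmark names
    limit=5,                              # samples per dataset
)

# Run the evaluation; results and reports are written to the output directory.
run_task(task_cfg=task_cfg)
```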
Project Details