TrustLLM
The TrustLLM project offers a comprehensive evaluation of the trustworthiness of large language models, benchmarking 16 mainstream models on more than 30 datasets across six dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. The datasets span scenarios such as misinformation and ethical dilemmas. The accompanying Python toolkit streamlines response generation and scoring, supports UniGen for dynamically generated evaluation data, works with hosted APIs such as Replicate and Azure OpenAI in addition to local models, and is regularly updated with new models and bug fixes. Detailed documentation and leaderboard data are available on the project website.
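As a concrete illustration, a minimal sketch of a two-step run with the pip-installable `trustllm` package is shown below: generating model responses for one trust dimension, then scoring them with the corresponding evaluation pipeline. The entry points (`LLMGeneration`, `run_truthfulness`) and parameter names follow the project's documentation but may differ between toolkit versions; the model identifier and all file paths are placeholders.

```python
# Sketch of a TrustLLM generation + evaluation run, assuming the
# `trustllm` package (pip install trustllm). Entry points and parameter
# names follow the project docs but may vary between releases; the model
# name and all paths below are illustrative placeholders.
from trustllm.generation.generation import LLMGeneration
from trustllm.task.pipeline import run_truthfulness

# Step 1: generate responses for one test section (here, truthfulness),
# calling a hosted model through the Replicate API instead of a local one.
llm_gen = LLMGeneration(
    model_path="meta/llama-2-70b-chat",  # hosted model identifier (example)
    test_type="truthfulness",            # which trust dimension to run
    data_path="dataset/truthfulness",    # downloaded TrustLLM dataset files
    online_model=True,
    use_replicate=True,
    max_new_tokens=512,
)
llm_gen.generation_results()

# Step 2: score the saved generations with the truthfulness pipeline;
# each argument points at the JSON produced for one sub-task.
results = run_truthfulness(
    internal_path="generation_results/internal.json",
    external_path="generation_results/external.json",
    hallucination_path="generation_results/hallucination.json",
    sycophancy_path="generation_results/sycophancy.json",
)
print(results)
```

Each trust dimension has an analogous `run_*` pipeline, so swapping `test_type` and the pipeline import is, under these assumptions, enough to cover the other dimensions.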