Awesome-LLMs-Evaluation-Papers
This survey provides an overview of evaluation methods for large language models, focusing on assessing their knowledge and capabilities, alignment, and safety. It presents an extensive collection of papers and methods curated by a team at Tianjin University and underscores the need for rigorous evaluation to mitigate risks such as data contamination. Through this structured approach, the survey aims to foster responsible LLM development and maximize societal benefit.