SPPO

Enhance Model Efficiency with Self-Play Preference Optimization

Product Description

SPPO (Self-Play Preference Optimization) is a framework for fine-tuning language models efficiently without relying on external signals. Its performance has been validated across multiple datasets, and its latest models report results on AlpacaEval 2.0. This open-source project provides extensive training scripts, evaluation methods, and troubleshooting support to advance AI alignment research.

Project Details