trl
TRL is a library for post-training foundation models with methods such as Supervised Fine-Tuning (SFT), Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO). Built on top of Hugging Face Transformers, it supports many model architectures and scales from a single GPU to multi-node clusters. Its command-line interface lets you fine-tune models without writing code, and dedicated trainer classes for each method make efficient use of available hardware.