textbook_quality
The textbook_quality project generates high-quality, textbook-style data for pretraining. Generation can start from seed material or from scratch, and it can run against either OpenAI or a user-supplied API. Retrieval is available through backends such as Serply and SerpAPI, and it can be customized or disabled entirely. The framework is extensible: it supports environment-specific configuration and custom API adaptors, so the generation pipeline can be adapted to different setups.
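
To make the idea of a retrieval backend that can be swapped or disabled through configuration more concrete, here is a minimal sketch of one way such a mechanism could look. It is not the project's actual code: every name in it (`SEARCH_BACKEND`, `register_backend`, `get_backend`, `build_prompt`, and the `stub` backend) is a hypothetical placeholder, and the stub stands in for a real Serply or SerpAPI client so that the example runs without network access or API keys.

```python
import os
from typing import Callable, Dict, List, Mapping, Optional

# All names here are hypothetical and chosen for illustration only;
# they are not taken from the textbook_quality codebase.
SearchFn = Callable[[str], List[str]]


def no_search(query: str) -> List[str]:
    """Retrieval disabled: always return no context."""
    return []


def stub_search(query: str) -> List[str]:
    """Stand-in for a real backend such as a Serply or SerpAPI client."""
    return [f"stub result for: {query}"]


# Registry mapping a backend name to the function that performs the search.
BACKENDS: Dict[str, SearchFn] = {"none": no_search, "stub": stub_search}


def register_backend(name: str, fn: SearchFn) -> None:
    """Add a custom adaptor under a case-insensitive name."""
    BACKENDS[name.lower()] = fn


def get_backend(env: Optional[Mapping[str, str]] = None) -> SearchFn:
    """Pick a backend from the (assumed) SEARCH_BACKEND setting."""
    env = os.environ if env is None else env
    name = env.get("SEARCH_BACKEND", "none").lower()
    if name not in BACKENDS:
        raise ValueError(f"unknown search backend: {name!r}")
    return BACKENDS[name]


def build_prompt(topic: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Build a generation prompt, adding retrieved context when available."""
    context = get_backend(env)(topic)
    lines = [f"Write a textbook section on: {topic}"]
    if context:
        lines.append("Context:")
        lines.extend(f"- {item}" for item in context)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("sorting algorithms", {"SEARCH_BACKEND": "stub"}))
```

In this sketch, leaving `SEARCH_BACKEND` unset (or setting it to `none`) disables retrieval, and a custom adaptor is added by calling `register_backend` with any function that maps a query string to a list of context strings.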