
Firefly-LLaMA2-Chinese

Efficient Bilingual Model Training with Extended Vocabulary

Product Description

The project expands the Chinese vocabulary of models such as LLaMA2 and Baichuan2 using low-resource training techniques. It achieves competitive results on evaluations such as the Open LLM Leaderboard and CMMLU. Trained on 22GB of bilingual data with only 4 V100 GPUs, it outperforms models such as Linly and Yayi, and it releases open-source model weights along with the complete training data.
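Expanding a model's vocabulary typically means adding new (here, Chinese) tokens to the tokenizer and growing the embedding matrix to match. A minimal NumPy sketch of the embedding-resize idea, where new rows are initialised to the mean of the existing embeddings plus small noise — a common heuristic, not necessarily this project's exact method (the function name and parameters are illustrative):

```python
import numpy as np

def extend_embeddings(embed: np.ndarray, num_new_tokens: int, seed: int = 0) -> np.ndarray:
    """Append rows for new vocabulary tokens.

    Each new embedding is initialised to the mean of the existing rows
    plus small Gaussian noise, so new tokens start near the centre of
    the learned embedding space rather than at random positions.
    """
    rng = np.random.default_rng(seed)
    mean = embed.mean(axis=0, keepdims=True)
    noise = rng.normal(scale=0.01, size=(num_new_tokens, embed.shape[1]))
    return np.concatenate([embed, mean + noise], axis=0)

# Toy example: a 5-token vocabulary with 4-dim embeddings, extended by 3 tokens.
old = np.ones((5, 4))
new = extend_embeddings(old, 3)
print(new.shape)  # (8, 4)
```

After resizing, the extended embeddings are usually trained further on bilingual data so the new tokens acquire useful representations.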
Project Details