Introduction to the Chinese-LLaMA-Alpaca-2 Project
Overview
The Chinese-LLaMA-Alpaca-2 project is the second phase of the Chinese LLaMA and Alpaca model effort, built on the commercially usable, open-source Llama-2 released by Meta. The project open-sources the Chinese LLaMA-2 base models and the instruction-following Chinese Alpaca-2 models, both of which expand and optimize the Chinese vocabulary on top of the original Llama-2. Through incremental pre-training on large-scale Chinese data, these models deliver substantially better Chinese semantic understanding and instruction comprehension than their first-generation counterparts. The models support FlashAttention-2 training; the standard versions handle a 4K context length, while the long-context versions support 16K and 64K contexts. The RLHF-series models add human-preference alignment fine-tuning on top of the standard chat models, improving how well they reflect human values.
Key Highlights of the Project
- Enhanced Chinese Vocabulary: The project introduces an updated Chinese vocabulary list, enhancing Chinese LLaMA-2 and Alpaca-2 models.
- Pre-training and Instruction Tuning Scripts: Open-sourced scripts are provided for further training the models, allowing customization based on specific needs.
- Local Quantization and Deployment: Users can experience fast quantization and deployment on local CPUs/GPUs.
- Support for Various Ecosystems: The models work with platforms and tools such as 🤗Transformers, llama.cpp, text-generation-webui, LangChain, privateGPT, and vLLM (a minimal 🤗Transformers loading sketch follows this list).
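As an illustration of local quantized deployment through one of the supported ecosystems, the following is a minimal sketch of loading a chat model in 4-bit with 🤗Transformers and bitsandbytes. The Hub id "hfl/chinese-alpaca-2-7b" is an assumption; substitute the id or local path of the checkpoint you actually downloaded (llama.cpp, vLLM, and the other listed tools offer alternative deployment paths).

```python
# Minimal sketch: 4-bit quantized loading on a local GPU with 🤗Transformers + bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "hfl/chinese-alpaca-2-7b"  # assumed Hub id; replace with your checkpoint

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights so the model fits consumer GPUs
    bnb_4bit_compute_dtype=torch.float16,  # run the matmuls in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                     # place layers automatically on available devices
)
```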
Released Models
- Base Models (4K context): Chinese-LLaMA-2 in sizes 1.3B, 7B, 13B.
- Chat Models (4K context): Chinese-Alpaca-2 in sizes 1.3B, 7B, 13B.
- Extended Context Models (16K/64K):
  - Chinese-LLaMA-2-16K and Chinese-Alpaca-2-16K (7B, 13B)
  - Chinese-LLaMA-2-64K and Chinese-Alpaca-2-64K (7B)
- Preference Aligned Models: Chinese-Alpaca-2-RLHF (1.3B, 7B)
Model Features
Optimized Chinese Vocabulary
The project redesigns the Chinese vocabulary to improve coverage of Chinese words and tokens. The LLaMA-2 and Alpaca-2 models now share a single vocabulary, avoiding the problems caused by mixing different vocabularies in the first generation and improving the efficiency of encoding and decoding Chinese text.
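A quick way to see the effect is to compare how many tokens each tokenizer needs for the same Chinese sentence. The Hub ids below are assumptions; point them at the original Llama-2 tokenizer and the expanded Chinese-LLaMA-2 tokenizer you are using.

```python
# Minimal sketch: compare tokenization efficiency on Chinese text.
from transformers import AutoTokenizer

original = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed id
expanded = AutoTokenizer.from_pretrained("hfl/chinese-llama-2-7b")    # assumed id

text = "人工智能是计算机科学的一个分支。"
print(len(original.tokenize(text)))  # original vocabulary: many byte-level fragments
print(len(expanded.tokenize(text)))  # expanded vocabulary: fewer, whole Chinese tokens
```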
Efficient Attention Mechanism
The models support FlashAttention-2, an efficient attention implementation that speeds up computation and reduces memory usage. This is especially important for the long-context models, where standard attention would otherwise exhaust GPU memory.
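A minimal sketch of enabling FlashAttention-2 when loading the model with 🤗Transformers is shown below. It assumes the flash-attn package is installed and a supported GPU is available; the Hub id is an assumption.

```python
# Minimal sketch: load the model with FlashAttention-2 kernels enabled.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "hfl/chinese-alpaca-2-7b",                # assumed Hub id
    torch_dtype=torch.bfloat16,               # FlashAttention-2 requires fp16/bf16
    attn_implementation="flash_attention_2",  # select the FlashAttention-2 implementation
    device_map="auto",
)
```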
Extended Context through PI and YaRN
The project extends context length with techniques based on Position Interpolation (PI) and YaRN. The 16K context models are trained with PI and, combined with NTK scaling at inference time, can be stretched further to roughly 24K-32K tokens. The 64K context models are built with the YaRN method.
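To make the idea behind PI concrete, the sketch below rescales rotary position-embedding angles so that an extended window (e.g. 16K) is mapped back into the position range the model was pre-trained on (e.g. 4K). The function names are illustrative, not the project's actual implementation.

```python
# Minimal sketch: Position Interpolation (PI) applied to rotary position embeddings (RoPE).
import numpy as np

def rope_angles(position_ids, dim=128, base=10000.0, scaling_factor=1.0):
    """Return RoPE rotation angles; scaling_factor > 1 compresses positions (PI)."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    scaled_positions = position_ids / scaling_factor   # PI: interpolate positions linearly
    return np.outer(scaled_positions, inv_freq)        # shape [seq_len, dim/2]

# Extending a 4K-trained model to 16K context uses a scaling factor of 16384 / 4096 = 4.
angles = rope_angles(np.arange(16384), scaling_factor=16384 / 4096)
```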
Simplified Bilingual System Prompts
The Alpaca-2 models follow the Llama-2-Chat prompt format but use a much shorter bilingual system prompt than their first-generation predecessors.
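The sketch below shows how such a prompt can be assembled in the Llama-2-Chat [INST]/<<SYS>> layout. The short bilingual system prompt is illustrative; consult the project's inference scripts for the exact default template it ships with.

```python
# Minimal sketch: build a Llama-2-Chat style prompt for the Alpaca-2 models.
SYSTEM_PROMPT = "You are a helpful assistant. 你是一个乐于助人的助手。"  # assumed default

def build_prompt(instruction: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap a user instruction in the Llama-2-Chat [INST]/<<SYS>> format."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{instruction} [/INST]"

print(build_prompt("请用一句话介绍北京。"))  # "Introduce Beijing in one sentence."
```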
Human Preference Alignment
Unlike earlier releases, this project applies reinforcement learning from human feedback (RLHF) to further align the chat models with human preferences, producing the Chinese-Alpaca-2-RLHF series with markedly better value alignment.
Model Usage and Applications
The models cover a range of applications, such as text generation, instruction following, and chat, depending on whether the base or instruction-tuned models are used. For chat and other interactive use, the Alpaca-2 models are recommended, since they are instruction-tuned and, in the RLHF variants, preference-aligned.
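Putting the earlier pieces together, here is a minimal chat-style inference sketch for a Chinese-Alpaca-2 model with 🤗Transformers. The Hub id and the default system prompt are assumptions; the [INST]/<<SYS>> layout follows the Llama-2-Chat format described above.

```python
# Minimal sketch: chat-style generation with a Chinese-Alpaca-2 model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/chinese-alpaca-2-7b"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

system = "You are a helpful assistant. 你是一个乐于助人的助手。"  # assumed default prompt
instruction = "用三句话介绍一下长城。"  # "Describe the Great Wall in three sentences."
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```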
Conclusion
The Chinese-LLaMA-Alpaca-2 project marks a significant step forward for Chinese-language models. It integrates user feedback into development and, through comprehensive open-source resources and documentation, keeps the models accessible and adaptable to a wide range of applications. With this release, users can expect stronger Chinese-language capability and more efficient processing of Chinese tasks.