Project Introduction to Chinese-LLaMA-Alpaca-3
Chinese-LLaMA-Alpaca-3 is a noteworthy project built on Meta's newly released next-generation open-source model, Llama-3. It is the third iteration of the Chinese-LLaMA-Alpaca open-source model series, following the first- and second-generation projects. The project releases both the Chinese Llama-3 base model and the Chinese Llama-3-Instruct fine-tuned model. These models extend the original Llama-3 through incremental pre-training on large-scale Chinese data and fine-tuning on carefully selected instruction data, yielding significant improvements over the second-generation models in understanding Chinese semantics and following instructions.
Key Features
- Open-source Models: The project offers the Llama-3-Chinese base model and the Llama-3-Chinese-Instruct instruction model (the latter available in v1, v2, and v3).
- Pre-training and Fine-tuning: The pre-training scripts and instruction fine-tuning scripts are publicly available, allowing users to further train or fine-tune the models as needed.
- Instruction Data: It provides instruction fine-tuning data such as alpaca_zh_51k, stem_zh_instruction, and ruozhiba_gpt4 (4o/4T).
- Deployment Guides: Step-by-step guides show users how to quickly quantize and deploy the models locally on their own computer's CPU/GPU (a minimal deployment sketch follows this list).
- Llama-3 Ecosystem Support: The project supports tools like Hugging Face's transformers, llama.cpp, text-generation-webui, vLLM, and Ollama.
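As a hedged illustration of local quantized deployment through the llama.cpp ecosystem, the sketch below uses the llama-cpp-python bindings; the GGUF file name and parameter values are assumptions for illustration, not the project's official instructions.

```python
# A minimal sketch of running a quantized GGUF model locally with
# llama-cpp-python; the file name below is a hypothetical local path.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3-chinese-8b-instruct-q4_0.gguf",  # assumed local file
    n_ctx=8192,        # Llama-3's native context length
    n_gpu_layers=-1,   # offload all layers to the GPU; use 0 for CPU-only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "用中文介绍一下你自己。"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

The same GGUF format is what Ollama and text-generation-webui's llama.cpp loader consume, so one quantized file can back several of the supported tools.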
News Updates
- May 30, 2024: Release of the Llama-3-Chinese-8B-Instruct-v3 model, showing significant improvements over previous versions in downstream tasks.
- May 8, 2024: Release of the Llama-3-Chinese-8B-Instruct-v2 model, fine-tuned on 5 million instruction samples.
- April 30, 2024: Launch of the Llama-3-Chinese-8B base model and the Llama-3-Chinese-8B-Instruct model.
- April 19, 2024: Official launch of the Chinese-LLaMA-Alpaca-3 project.
Model Overview
The project introduces the Chinese open-source models, Llama-3-Chinese and Llama-3-Chinese-Instruct, based on Meta's Llama-3, with the following features:
Vocabulary
Llama-3 expands its vocabulary from 32K to 128K tokens and adopts a BPE (Byte Pair Encoding) tokenizer. Preliminary experiments on Wikipedia data show that the Llama-3 vocabulary's Chinese encoding efficiency is comparable to the expanded vocabulary of Chinese-LLaMA-2, reaching roughly 95% of it. Following the findings of the Chinese Mixtral project, no further vocabulary expansion was conducted.
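Encoding efficiency here amounts to how many tokens a tokenizer needs per Chinese character; the sketch below is a rough, unofficial way to reproduce such a comparison with Hugging Face transformers (the Hub model IDs are assumptions and may require access agreements).

```python
# A minimal sketch for comparing tokenizer encoding efficiency on Chinese
# text; the model IDs below are assumed, not taken from the project.
from transformers import AutoTokenizer

sample = "人工智能正在改变我们的生活方式。"  # any Chinese test sentence

for name in ["meta-llama/Meta-Llama-3-8B", "hfl/chinese-llama-2-7b"]:
    tok = AutoTokenizer.from_pretrained(name)
    ids = tok.encode(sample, add_special_tokens=False)
    # Fewer tokens per character means higher encoding efficiency.
    print(f"{name}: {len(ids)} tokens for {len(sample)} characters")
```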
Contextual Capability
- The native context length has been increased from 4K (Llama-2) to 8K, enabling the model to handle longer inputs directly.
- Context-extension methods such as PI, NTK, and YaRN allow users to extend the context window further for processing even longer texts (a hedged example follows this list).
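As one possible way to apply such an extension, the sketch below overrides the RoPE scaling configuration when loading the model with transformers; the Hub ID, scaling type, and factor are illustrative assumptions, not the project's recommended settings.

```python
# A minimal sketch, assuming a recent transformers release; the model ID
# and scaling parameters are placeholders, not official project settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/llama-3-chinese-8b-instruct"  # assumed Hub ID

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # Dynamic NTK scaling stretches RoPE at inference time, roughly
    # doubling the usable context (8K -> ~16K) without retraining;
    # "linear" (PI) and, in newer releases, "yarn" are alternatives.
    rope_scaling={"type": "dynamic", "factor": 2.0},
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```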
Grouped Query Attention Mechanism
Llama-3 adopts the grouped query attention (GQA) mechanism, previously applied only in the larger-parameter versions of Llama-2, which improves the model's inference efficiency.
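The core idea of GQA is that several query heads share a single key/value head, which shrinks the KV cache; the toy sketch below (not project code) shows the mechanism with PyTorch.

```python
# A toy illustration of grouped query attention: 8 query heads share
# 2 key/value heads, so each KV head serves a group of 4 query heads.
import torch
import torch.nn.functional as F

batch, seq, head_dim = 1, 16, 64
n_q_heads, n_kv_heads = 8, 2
group = n_q_heads // n_kv_heads

q = torch.randn(batch, n_q_heads, seq, head_dim)
k = torch.randn(batch, n_kv_heads, seq, head_dim)
v = torch.randn(batch, n_kv_heads, seq, head_dim)

# Repeat each KV head so it is shared by its group of query heads.
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 8, 16, 64) -- same shape as standard multi-head output
```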
Instruction Template
The Llama-3-Instruct uses a new instruction template incompatible with Llama-2-chat, and users should follow the official instruction template for effective use.
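In practice, the safest way to honor the official template is to let the tokenizer render it; the sketch below uses transformers' apply_chat_template, with an assumed Hub model ID.

```python
# A minimal sketch, assuming the tokenizer ships the official Llama-3
# chat template; the model ID below is a placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hfl/llama-3-chinese-8b-instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "请用中文介绍一下你自己。"},
]

# apply_chat_template renders the messages with the template stored in the
# tokenizer config, so the special tokens never need to be hand-written.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```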
Model Selection Guide
- Llama-3-Chinese-8B: A base model suitable for text extension tasks where the model generates subsequent text based on given content.
- Llama-3-Chinese-8B-Instruct: An instruction/Chat model ideal for tasks involving instruction understanding, such as Q&A, writing, chatting, and interaction.
For chatting and interaction, the Instruct version is the recommended choice.
Model Downloads
Different versions and types of the models are available for download, each suited to different uses. Instruct models are specially tuned for interaction tasks, with newer versions offering improvements in language and long-text capabilities.
In conclusion, the Chinese-LLaMA-Alpaca-3 project represents a significant step forward in enhancing the capabilities of Chinese language processing and interaction models, providing advanced tools and guides for users looking to leverage these technologies in various applications.