Phi2-mini-Chinese

Develop a Mini Chinese Language Model Utilizing Key Pre-processing and Optimization Strategies

Product Description

The project provides a framework for training a mini Chinese language model from scratch, covering pivotal pre-processing steps such as data cleaning and tokenizer training. It also demonstrates flash attention acceleration and offers guidance on unsupervised causal language model (CLM) pre-training with the BELLE dataset. It further details fine-tuning through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to improve model performance. Although experimental and subject to major updates, it offers substantial insight into building capable language models, along with practical examples of conversational ability and retrieval-augmented generation (RAG).
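As a rough illustration of the CLM pre-training step described above, the sketch below builds a small Phi-style model with flash attention enabled and runs standard next-token training on a BELLE corpus, assuming the Hugging Face Transformers and Datasets stack. This is not the project's own code: the tokenizer path, model sizes, dataset identifier (BelleGroup/train_1M_CN), and its column names (instruction, output) are assumptions made for illustration only.

```python
# Minimal sketch of unsupervised CLM pre-training with flash attention
# (illustrative only; paths, sizes, and dataset fields are assumptions).
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    PhiConfig,
    Trainer,
    TrainingArguments,
)

# Custom-trained Chinese tokenizer (hypothetical local path).
tokenizer = AutoTokenizer.from_pretrained("./chinese_tokenizer")

# Small Phi-style model built from scratch; sizes are illustrative.
config = PhiConfig(
    vocab_size=len(tokenizer),
    hidden_size=768,
    intermediate_size=3072,
    num_hidden_layers=12,
    num_attention_heads=12,
)
model = AutoModelForCausalLM.from_config(
    config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # needs flash-attn + supported GPU
)

# BELLE corpus from the Hugging Face Hub; column names are an assumption.
raw = load_dataset("BelleGroup/train_1M_CN", split="train")

def tokenize(batch):
    # Concatenate prompt and response into one plain training string.
    text = [i + o for i, o in zip(batch["instruction"], batch["output"])]
    return tokenizer(text, truncation=True, max_length=512)

train_ds = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="clm-pretrain",
        per_device_train_batch_size=8,
        bf16=True,
        num_train_epochs=1,
    ),
    train_dataset=train_ds,
    # mlm=False yields standard causal-LM (next-token) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The SFT and RLHF stages mentioned above would follow the same pattern, swapping in instruction-formatted data and a preference-optimization trainer respectively.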
Project Details