Introduction to the Awesome Compression Project
The Awesome Compression project explains methods for compressing large language models. As tools like ChatGPT grow in popularity, ever larger language models are being developed. These models show remarkable capabilities across many tasks, but they typically demand substantial compute and memory, which limits their use in resource-constrained scenarios and can discourage researchers from adopting them. This project aims to explain compression techniques such as pruning, quantization, and knowledge distillation in an accessible way, so that even beginners interested in model compression can follow along.
Purpose of the Project
Resources on model compression are currently scattered across the web, making it hard for beginners to find a straightforward, high-quality introduction. Inspired by MIT's 6.5940 TinyML and Efficient Deep Learning Computing course, this project provides an introduction to model compression that lowers the barrier to entry. Through the tutorials, learners can familiarize themselves with different compression methods and learn how to apply them to compress deep learning models for real-world deployment.
Target Audience
This project is suitable for the following learners:
- Researchers in deep learning.
- Developers involved in embedded systems and mobile application development.
- Developers interested in AI hardware acceleration and deployment.
- Students curious about model compression techniques.
Highlights of the Project
- Provides clear and easy-to-understand theoretical content to demystify model compression technology.
- Offers practical code examples to help learners better understand theoretical concepts through real-world scenarios.
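To give a flavor of the hands-on material, here is a minimal sketch of symmetric per-tensor int8 quantization, one of the techniques the tutorials cover. This is an illustrative example written for this introduction, not code taken from the project itself:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-128, 127]."""
    scale = np.abs(w).max() / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the int8 codes."""
    return q.astype(np.float32) * scale

# Round-trip a small weight tensor; the error is bounded by the scale.
w = np.array([0.5, -1.0, 0.25], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

Storing `q` (1 byte per weight) plus a single scale instead of 4-byte floats is the basic memory saving that more sophisticated quantization schemes build on.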
Installation for Practical Environment
The project's practical code is based on Python 3.10. For detailed installation instructions, see INSTALL.md.
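As a rough quick-start sketch, setting up an isolated environment might look like the following; INSTALL.md remains the authoritative guide, and the dependency file name below is an assumption:

```shell
# Hypothetical setup sketch — see INSTALL.md for the authoritative steps.
python3 -m venv .venv        # assumes a Python 3.10 interpreter is available as python3
. .venv/bin/activate
python --version             # confirm the interpreter version
# pip install -r requirements.txt   # dependency file name is an assumption; see INSTALL.md
```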
Local Online Reading Environment Setup
Required Node.js Version
Node v16 is required.
Installing docsify
```shell
npm i docsify-cli -g
```
Starting docsify
```shell
docsify serve ./docs
```
Project Chapters
- Chapter 1: Introduction
- Chapter 2: Basics of CNN
- Chapter 3: Model Pruning
- Chapter 4: Model Quantization
- Chapter 5: Neural Architecture Search
- Chapter 6: Knowledge Distillation
- Chapter 7: Project Practice
If you are interested in large model compression, you are welcome to check out Datawhale's open-source project llm-deploy.
Contribution
- If you wish to participate in the project, check the project's Issues list for tasks that have not yet been assigned.
- If you discover any problems, please report them via an Issue. 🐛
- If you are interested in the project and want to get involved, join the conversation in Discussions. 💬
If you are interested in Datawhale and wish to start a new project, see the Datawhale Contribution Guide.
List of Contributors
| Name | Introduction |
| --- | --- |
| Chen Yuli | Datawhale member; graduate student at Beijing University of Posts and Telecommunications |
| Jiang Weiwei | Assistant Professor at Beijing University of Posts and Telecommunications |
| Sun Hanyu | Model Deployment Engineer |
| Zhang Yijie | Graduate student at Jinan University |
| Wei Yukang | Graduate student at Hebei University of Science and Technology |
| Ning Zhiyuan | Undergraduate student at Shanghai Jiao Tong University |
Acknowledgments
- Special thanks to @Sm1les, @LSGOMYP, and @Truth-14 for their assistance and support for this project.
- If you have any ideas, feel free to open an Issue; contributions of all kinds are welcome.
- Special thanks to the students who contributed to the tutorial!
Made with contrib.rocks.
Follow Us
Scan the QR code below to follow our public account: Datawhale
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.