Introduction to the Awesome Compression Project
The Awesome Compression project explains methods for compressing large language models. As tools like ChatGPT grow in popularity, ever larger language models are being developed. These models show remarkable capabilities across many tasks, but they typically demand substantial compute and memory, which limits their use in resource-constrained scenarios and can discourage researchers from adopting them. This project aims to explain compression techniques such as pruning, quantization, and knowledge distillation in an accessible way, so that even beginners interested in model compression can follow along.
Purpose of the Project
Resources on model compression are currently scattered across the web, making it hard for beginners to find a straightforward, high-quality introduction. Inspired by MIT's 6.5940 TinyML and Efficient Deep Learning Computing course, this project provides an introduction to model compression that lowers the barrier to entry. Through the tutorials, learners can familiarize themselves with different compression methods and learn how to apply them to compress deep learning models for real-world deployment.
Target Audience
This project is suitable for the following learners:
- Researchers in deep learning.
- Developers involved in embedded systems and mobile application development.
- Developers interested in AI hardware acceleration and deployment.
- Students curious about model compression techniques.
Highlights of the Project
- Provides clear and easy-to-understand theoretical content to demystify model compression technology.
- Offers practical code examples to help learners better understand theoretical concepts through real-world scenarios.
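To give a flavor of the hands-on material, here is a minimal sketch of symmetric per-tensor int8 quantization, one of the techniques the tutorials cover. This is an illustrative example written for this introduction, not code taken from the project itself:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-128, 127]."""
    scale = np.abs(w).max() / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the int8 codes."""
    return q.astype(np.float32) * scale

# Round-trip a small weight tensor; the error is bounded by the scale.
w = np.array([0.5, -1.0, 0.25], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

Storing `q` (1 byte per weight) plus a single scale instead of 4-byte floats is the basic memory saving that more sophisticated quantization schemes build on.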
Installation for Practical Environment
The project's practical code is based on Python 3.10. For detailed installation instructions, see INSTALL.md.
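As a rough quick-start sketch, setting up an isolated environment might look like the following; INSTALL.md remains the authoritative guide, and the dependency file name below is an assumption:

```shell
# Hypothetical setup sketch — see INSTALL.md for the authoritative steps.
python3 -m venv .venv        # assumes a Python 3.10 interpreter is available as python3
. .venv/bin/activate
python --version             # confirm the interpreter version
# pip install -r requirements.txt   # dependency file name is an assumption; see INSTALL.md
```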
Local Online Reading Environment Setup
Required Node.js Version
Node v16 is required.
Installing docsify
```shell
npm i docsify-cli -g
```
Starting docsify
```shell
docsify serve ./docs
```
Project Chapters
- Chapter 1: Introduction
- Chapter 2: Basics of CNN
- Chapter 3: Model Pruning
- Chapter 4: Model Quantization
- Chapter 5: Neural Architecture Search
- Chapter 6: Knowledge Distillation
- Chapter 7: Project Practice
If you are interested in large model compression, you are welcome to check out Datawhale's open-source project llm-deploy.
Contribution
- If you wish to participate in the project, check the project's Issues list for tasks that have not yet been assigned.
- If you discover any problems, please report them via an Issue. 🐛
- If you are interested in the project and want to get involved, join the conversation in Discussions. 💬
If you are interested in Datawhale and wish to start a new project, see the Datawhale Contribution Guide.
List of Contributors
| Name | Introduction |
| --- | --- |
| Chen Yuli | Datawhale member; graduate student at Beijing University of Posts and Telecommunications |
| Jiang Weiwei | Assistant Professor at Beijing University of Posts and Telecommunications |
| Sun Hanyu | Model Deployment Engineer |
| Zhang Yijie | Graduate student at Jinan University |
| Wei Yukang | Graduate student at Hebei University of Science and Technology |
| Ning Zhiyuan | Undergraduate student at Shanghai Jiao Tong University |
Acknowledgments
- Special thanks to @Sm1les, @LSGOMYP, and @Truth-14 for their assistance and support for this project.
- If you have any ideas, feel free to open an Issue; contributions of all kinds are welcome.
- Special thanks to the students who contributed to the tutorial!
Made with contrib.rocks.
Follow Us
Scan the QR code below to follow our public account: Datawhale
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.