LLM101n: Let's Build a Storyteller
LLM101n is a project in development by Eureka Labs that offers a deep dive into artificial intelligence (AI), focusing on building a Storyteller AI with Large Language Models (LLMs). Although the course is not yet available, it is designed to guide participants through creating an AI capable of crafting, refining, and illustrating stories.
Project Overview
The goal of this course is to equip participants with the knowledge and skills to build a fully functioning web application similar to ChatGPT, using Python, C, and CUDA, with minimal computer science prerequisites. It emphasizes hands-on learning by constructing everything from scratch while demystifying AI, LLMs, and deep learning concepts. By the end of the course, participants are expected to have a robust understanding of these technologies.
Course Content
The course is structured as a series of chapters, each tackling a different aspect of AI and LLM development; illustrative code sketches for several of the chapter topics follow the list:
- Chapter 01: Bigram Language Model - Explores basic language modeling techniques.
- Chapter 02: Micrograd - Introduces machine learning and backpropagation concepts.
- Chapter 03: N-gram Model - Extends language modeling with a multi-layer perceptron, covering matrix multiplication and activation functions.
- Chapter 04: Attention - Covers attention mechanisms, the softmax function, and positional encoding.
- Chapter 05: Transformer - A detailed study of transformers, residual connections, and normalization layers, with insights into GPT-2 architecture.
- Chapter 06: Tokenization - Focuses on byte pair encoding and tokenization techniques.
- Chapter 07: Optimization - Discusses initialization methods and optimization algorithms like AdamW.
- Chapters 08-10: Need for Speed - A trilogy covering device compatibility (CPU, GPU), precision during training, and distributed optimization techniques.
- Chapter 11: Datasets - Examines data handling strategies, including loading and generating synthetic data.
- Chapters 12-13: Inference - Looks at inference optimizations such as the kv-cache and quantization.
- Chapters 14-15: Finetuning - Covers supervised fine-tuning, reinforcement learning, and related techniques.
- Chapter 16: Deployment - Guides participants through deploying their AI as a web app or API.
- Chapter 17: Multimodal - Explores multimodal AI, integrating images, audio, and video using advanced models like VQVAE.
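To give a concrete flavor of Chapter 01, here is a minimal character-level bigram model. This is an illustrative sketch, not the course's (unpublished) code: it counts how often each character follows another in a tiny corpus and samples continuations from those counts.

```python
import random
from collections import defaultdict

# Count how often each character follows another in a tiny corpus.
corpus = "once upon a time there was a tiny story model"
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def sample_next(ch):
    """Sample the next character in proportion to observed bigram counts."""
    followers = counts[ch]
    if not followers:              # dead end: character never followed by anything
        return random.choice(corpus)
    chars, weights = zip(*followers.items())
    return random.choices(chars, weights=weights)[0]

# Generate a short continuation starting from 'o'.
ch, out = "o", ["o"]
for _ in range(20):
    ch = sample_next(ch)
    out.append(ch)
print("".join(out))
```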
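Chapter 02 centers on micrograd-style machine learning and backpropagation. The stripped-down sketch below illustrates the core idea, scalar values that remember how they were produced and backpropagate gradients by the chain rule; it is an illustration of the concept, not the actual micrograd library.

```python
class Value:
    """A scalar that tracks its gradient, micrograd-style."""
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None       # propagates self.grad into the children

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad          # d(a+b)/da = 1
            other.grad += out.grad         # d(a+b)/db = 1
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    visit(c)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()

a, b = Value(2.0), Value(3.0)
loss = a * b + a           # d(loss)/da = b + 1 = 4, d(loss)/db = a = 2
loss.backward()
print(a.grad, b.grad)      # 4.0 2.0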
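```

Chapter 04's core computation, scaled dot-product attention, fits in a few lines of NumPy. The sketch below assumes single-head, unmasked attention over randomly generated embeddings.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)     # (T, T): similarity of each query to each key
    weights = softmax(scores, axis=-1)
    return weights @ V                # each output is a weighted average of values

T, d = 4, 8                           # 4 tokens, 8-dim embeddings
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
print(attention(Q, K, V).shape)       # (4, 8)
```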
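Chapter 06's byte pair encoding can be illustrated by a single merge step: find the most frequent adjacent pair of tokens and fuse it into a new, longer token. A minimal sketch of that step:

```python
from collections import Counter

def merge_step(tokens):
    """One BPE step: fuse the most frequent adjacent pair into a new token."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens, None
    (a, b), _ = pairs.most_common(1)[0]
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)      # the new, longer token
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged, (a, b)

tokens = list("aaabdaaabac")
for _ in range(3):                    # apply three merges
    tokens, pair = merge_step(tokens)
    print(pair, tokens)
```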
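Chapter 07's AdamW optimizer decouples weight decay from the adaptive gradient update. A minimal NumPy sketch of one parameter update follows; the hyperparameter values are common defaults, chosen here for illustration.

```python
import numpy as np

def adamw_step(p, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update. m, v are running moment estimates; t is the step count."""
    m = beta1 * m + (1 - beta1) * grad            # first moment (mean of grads)
    v = beta2 * v + (1 - beta2) * grad ** 2       # second moment (mean of squares)
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: shrink p directly instead of modifying the gradient.
    p = p - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * p)
    return p, m, v

p = np.array([1.0, -2.0])
m = v = np.zeros_like(p)
for t in range(1, 4):
    grad = 2 * p                                  # gradient of f(p) = |p|^2
    p, m, v = adamw_step(p, grad, m, v, t)
print(p)
```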
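The key idea behind the kv-cache of Chapters 12-13 is that during autoregressive generation, each new token only needs its own query against the keys and values already computed for earlier tokens, so those are appended to a cache instead of recomputed. A schematic sketch, with random projections standing in for a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Cache of keys/values for all tokens generated so far.
K_cache, V_cache = [], []

def decode_step(x):
    """Attend one new token embedding x against all cached keys/values."""
    q = x @ Wq
    K_cache.append(x @ Wk)            # compute k, v once and keep them
    V_cache.append(x @ Wv)
    K, V = np.stack(K_cache), np.stack(V_cache)
    weights = softmax(q @ K.T / np.sqrt(d))
    return weights @ V

for _ in range(5):                    # five decode steps, each O(current length)
    x = rng.standard_normal(d)        # stand-in for the next token's embedding
    out = decode_step(x)
print(out.shape, len(K_cache))        # (8,) 5
```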
Additional Topics
The appendix includes supplementary topics to enhance understanding:
- Various programming languages such as Assembly, C, and Python.
- Different data types including integers, floats, and strings in various encodings.
- Tensor manipulations involving shapes, strides, and views.
- Deep learning frameworks like PyTorch and JAX.
- Insights into advanced neural network architectures like GPT, Llama, and multimodal models.
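As a small taste of the tensor topics above, the snippet below (using PyTorch, one of the frameworks listed) shows how a view reinterprets the same underlying storage through shapes and strides:

```python
import torch

x = torch.arange(12)          # one contiguous buffer of 12 elements
m = x.view(3, 4)              # a view: same storage, new shape and strides
print(m.shape, m.stride())    # torch.Size([3, 4]) (4, 1)

t = m.t()                     # transpose is also just a stride trick
print(t.stride())             # (1, 4): step 1 to move down a column of the original
x[0] = 99                     # mutating the buffer is visible through every view
print(m[0, 0].item())         # 99
```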
In summary, LLM101n is a comprehensive project aimed at demystifying the inner workings of AI storytellers through hands-on learning. By building each component from scratch, participants gain both a conceptual understanding of LLMs and the practical skills to create one end to end.