makeMoE

Exploring Hackable Sparse Mixture of Experts: A Language Model Journey

makeMoE is a build-from-scratch sparse mixture of experts language model, inspired by Andrej Karpathy's makemore. It is implemented in PyTorch and features top-k gating and Kaiming He initialization, with a focus on Shakespeare-like text generation. The project is aimed at anyone exploring scalable, customizable language models, and resources are provided on HuggingFace for in-depth understanding and efficient training.
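To make the top-k gating idea concrete, here is a minimal PyTorch sketch of a router that sends each token to its k highest-scoring experts. This is an illustrative example, not the project's actual code: the class name `TopKRouter` and all parameter names are hypothetical, and the Kaiming initialization call is included only to echo the technique the project mentions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Illustrative top-k gating: each token is routed to k experts."""
    def __init__(self, n_embd, num_experts, top_k):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(n_embd, num_experts)
        # Kaiming He initialization, as the project highlights (applied
        # here to the router purely for illustration).
        nn.init.kaiming_normal_(self.gate.weight)

    def forward(self, x):
        logits = self.gate(x)  # (batch, seq, num_experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        # Fill non-selected experts with -inf so softmax zeroes them out,
        # producing sparse routing weights.
        sparse_logits = torch.full_like(logits, float('-inf'))
        sparse_logits = sparse_logits.scatter(-1, top_idx, top_vals)
        weights = F.softmax(sparse_logits, dim=-1)
        return weights, top_idx

router = TopKRouter(n_embd=32, num_experts=4, top_k=2)
x = torch.randn(2, 8, 32)
weights, indices = router(x)
print(weights.shape)  # (2, 8, 4); exactly 2 nonzero weights per token
```

Each token's expert outputs would then be combined using these sparse weights, so only k experts run per token, which is what makes the mixture "sparse".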
Project Details