mup

Stable Hyperparameter Tuning for Scalable Neural Networks Using Maximal Update Parametrization

Product Description

Maximal Update Parametrization (μP) keeps hyperparameters stable across neural network sizes, so values tuned on a small model can be transferred to much larger ones, including large transformer models. This open-source package integrates with PyTorch, reduces the fragility of scaling up, and makes performance more predictable, so massive neural networks can be optimized without extensive re-tuning.
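
To make the idea of hyperparameter transfer concrete, here is a minimal pure-Python sketch of one ingredient of μP-style scaling: with an Adam-type optimizer, the learning rate of matrix-like hidden weights is divided by the width multiplier (target width over base width), while other parameters keep the base learning rate. This is a simplified illustration, not the package's API; the function names `width_multiplier` and `scaled_adam_lr` and the example widths are hypothetical.

```python
def width_multiplier(width: int, base_width: int) -> float:
    """Ratio of the target model's width to the small base model's width."""
    if width <= 0 or base_width <= 0:
        raise ValueError("widths must be positive")
    return width / base_width


def scaled_adam_lr(base_lr: float, width: int, base_width: int,
                   matrix_like: bool) -> float:
    """Learning rate for one parameter group under the simplified rule.

    Assumption (illustrative): matrix-like hidden weights have their
    learning rate divided by the width multiplier; all other parameters
    (biases, input-layer weights) keep the base learning rate.
    """
    if matrix_like:
        return base_lr / width_multiplier(width, base_width)
    return base_lr


if __name__ == "__main__":
    base_lr = 1e-3      # hypothetical value tuned on the small base model
    base_width = 256    # hypothetical width of the base model
    for width in (256, 1024, 4096):
        hidden_lr = scaled_adam_lr(base_lr, width, base_width, matrix_like=True)
        bias_lr = scaled_adam_lr(base_lr, width, base_width, matrix_like=False)
        print(f"width={width}: hidden lr={hidden_lr:.2e}, bias lr={bias_lr:.2e}")
```

The point of the sketch is that only the base learning rate is tuned, once, on the small model; the per-group values for wider models follow from the width ratio rather than from a new search.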
Project Details