Adan
Adan is an adaptive Nesterov momentum algorithm that improves the optimization speed of deep learning models. It is supported by projects like NVIDIA's NeMo, Meta AI's D-Adaptation, and OpenMMLab's MMClassification. As the default optimizer for initiatives like DreamFusion's text-to-3D generation, Adan is recognized for its robustness and compatibility with higher learning rates. It excels in language and vision tasks while maintaining efficient memory usage. Explore significant advancements in large language models like MoE and GPT2 with Adan.