low-bit-optimizers
Explore memory-efficient neural network training with 4-bit optimizers, which reduce optimizer state bitwidth from 32 bits to 4 bits without sacrificing accuracy on tasks such as natural language processing and image classification. The library provides 4-bit variants of widely used optimizers, including AdamW and SGD, designed as drop-in replacements with customizable quantization settings.
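
Below is a minimal usage sketch of the drop-in pattern described above. The import path `lpmm` and the constructor signature are assumptions made for illustration; consult the repository for the exact module and class names.

```python
# Hedged sketch of drop-in usage; `lpmm.optim.AdamW` is assumed to mirror
# torch.optim.AdamW's constructor -- verify names against the repository.
import torch
import lpmm  # assumed package name for the 4-bit optimizers

model = torch.nn.Linear(784, 10)

# Swap torch.optim.AdamW for its 4-bit-state counterpart; hyperparameters
# keep their usual meaning, only the optimizer state is quantized to 4 bits.
optimizer = lpmm.optim.AdamW(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

# Standard training step -- nothing else in the loop changes.
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because only the optimizer state is quantized, the model weights and gradients stay in their original precision, which is what makes the swap transparent to the rest of the training loop.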