
TransformerEngine

FP8 Precision Acceleration for Transformer Performance on NVIDIA GPUs

Product Description

Transformer Engine uses FP8 precision to accelerate Transformer models on NVIDIA Hopper GPUs, reducing memory use during both training and inference. It provides optimized Transformer building blocks and a mixed-precision API that integrate with popular deep learning frameworks, supporting architectures such as BERT, GPT, and T5. Through its Python and C++ APIs, Transformer Engine enables mixed-precision training that delivers speedups with minimal loss of accuracy. Compatible with major LLM libraries and a range of GPU architectures, it is a versatile tool for NLP projects.
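As an illustration of the mixed-precision API, the sketch below follows the PyTorch quickstart pattern from the Transformer Engine documentation: an FP8-capable GPU is assumed, and the layer sizes are arbitrary placeholders.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Replace a standard linear layer with Transformer Engine's FP8-aware module.
model = te.Linear(768, 3072, bias=True)
inp = torch.randn(2048, 768, device="cuda")

# FP8 scaling recipe (all arguments are optional; defaults are reasonable).
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

# Run the forward pass with FP8 autocasting enabled.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)

# Backward pass runs outside the autocast context, as usual.
loss = out.sum()
loss.backward()
```

The same drop-in pattern applies to larger modules such as full Transformer layers, with the fp8_autocast context controlling where FP8 computation is used.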
Project Details