TensorRT
Torch-TensorRT brings NVIDIA TensorRT's inference optimizations into PyTorch, accelerating inference by up to 5x with minimal code changes. Stable releases are installable from PyPI, and nightly builds are available from the PyTorch package index. It supports two workflows: `torch.compile` for just-in-time optimization, and an export path that produces serialized programs deployable in C++ environments. It runs on NVIDIA GPUs under Linux and Windows and supports native compilation for aarch64 via JetPack. Additional resources cover accelerating Stable Diffusion and executing FP8 models with improved graph performance.