# Latte

Latte is a PyTorch project for video generation with Latent Diffusion Transformers. It extracts spatio-temporal tokens from input videos and models the video distribution in latent space with a stack of Transformer blocks, improving video quality on datasets such as FaceForensics and Taichi-HD. The project includes efficient model variants and an extension to text-to-video generation, and it reports strong results on these benchmarks. Its integration into diffusers also lowers GPU requirements, making efficient video generation more accessible.
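
To make the token-extraction idea concrete, here is a minimal sketch in PyTorch. It is an illustration, not the project's actual code: the class names `SpatioTemporalPatchEmbed` and `FactorizedBlock`, the tensor sizes, and the choice to alternate spatial and temporal attention are all assumptions made for this example.

```python
import torch
import torch.nn as nn


class SpatioTemporalPatchEmbed(nn.Module):
    """Hypothetical helper: turns latent video frames into patch tokens."""

    def __init__(self, in_channels=4, patch_size=2, embed_dim=64):
        super().__init__()
        # A strided convolution cuts each frame into non-overlapping patches
        # and projects every patch to an embed_dim-dimensional token.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, latents):
        # latents: (batch, frames, channels, height, width)
        b, f, c, h, w = latents.shape
        x = self.proj(latents.reshape(b * f, c, h, w))  # (b*f, dim, h/p, w/p)
        x = x.flatten(2).transpose(1, 2)                # (b*f, patches, dim)
        return x.reshape(b, f, x.shape[1], x.shape[2])  # (b, f, patches, dim)


class FactorizedBlock(nn.Module):
    """Assumed layout: attention within each frame, then across frames."""

    def __init__(self, embed_dim=64, num_heads=4):
        super().__init__()
        self.spatial = nn.TransformerEncoderLayer(
            embed_dim, num_heads, dim_feedforward=4 * embed_dim,
            dropout=0.0, batch_first=True)
        self.temporal = nn.TransformerEncoderLayer(
            embed_dim, num_heads, dim_feedforward=4 * embed_dim,
            dropout=0.0, batch_first=True)

    def forward(self, tokens):
        b, f, n, d = tokens.shape
        # Spatial pass: each frame's patches attend to one another.
        x = self.spatial(tokens.reshape(b * f, n, d)).reshape(b, f, n, d)
        # Temporal pass: each patch position attends across frames.
        x = x.transpose(1, 2).reshape(b * n, f, d)
        x = self.temporal(x).reshape(b, n, f, d).transpose(1, 2)
        return x  # (b, f, patches, dim)


# Usage: 2 clips, 8 latent frames each, 4 channels, 16x16 latent resolution.
latents = torch.randn(2, 8, 4, 16, 16)
tokens = SpatioTemporalPatchEmbed()(latents)
output = FactorizedBlock()(tokens)
print(tokens.shape, output.shape)
```

Splitting attention into a spatial pass and a temporal pass keeps each attention call small compared with attending over every token of every frame at once. This is one plausible arrangement under the assumptions above, not a statement of how the project wires its blocks.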
