MDT

Accelerate Image Synthesis with Masked Diffusion Transformer's Advanced Contextual Abilities

Product Description
MDTv2 is an image synthesis model that achieves an FID score of 1.58 on ImageNet. Its masked latent modeling scheme strengthens contextual learning, which lets it learn more than 10 times faster than DiT. By learning to reconstruct complete images from masked inputs, MDTv2 improves both training efficiency and output quality, making it well suited to demanding image generation tasks.
Project Details