gta
Learn how the Geometry-Aware Attention mechanism (GTA) improves multi-view transformers for applications such as image generation. Presented at ICLR 2024, the method offers a straightforward way to improve multi-view transformers and is also effective on 2D image tasks. Review our experimental results and code examples on datasets like CLEVR-TR, MSN-Hard, and ImageNet with Diffusion Transformers (DiT), which cover both multi-view transformers and image Vision Transformers (ViT).