TATS
TATS is a method for generating long videos with a Time-Agnostic VQGAN and a Transformer. Trained only on short clips, it can produce sequences with far more frames than its training sequences contain. It also supports video generation conditioned on text or audio, and it can produce varied outputs for the same input. The work reports cases where the FVD metric disagrees with human evaluation, which raises questions about relying on FVD alone to judge video quality. Setup and usage instructions for several datasets are included for practitioners who want to run the system.