
megatts2

Utilize Zero-Shot Text-to-Speech with Cutting-Edge Synthesis Methods for Enhanced Audio Output

Product Description

megatts2 is an unofficial implementation of Mega-TTS 2, a zero-shot text-to-speech system. The project targets mixed Chinese and English speech, with a planned dataset of approximately 1,000 hours, and uses BigVGAN to improve audio quality. It builds on three components, VQ-GAN, ADM, and PLM, to advance zero-shot TTS. The repository provides detailed guidance for dataset preparation, model training with PyTorch Lightning, and inference testing. Released under the MIT license and backed by Simon from ZideAI, the project supports adaptation to a wide range of languages.
Project Details