glow-tts
Glow-TTS is a flow-based generative model for fast, parallel text-to-speech synthesis that does not require an external aligner. Instead, it uses monotonic alignment search to find the alignment between the text and the speech representation on its own. The result is fast, diverse, high-quality speech, with synthesis considerably faster than autoregressive models such as Tacotron 2. The model extends to multi-speaker settings and handles long utterances, and later modifications such as HiFi-GAN integration and blank tokens between input tokens further improve quality. A demo and pretrained models are available.
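
To make the idea of monotonic alignment search more concrete, here is a minimal NumPy sketch of a dynamic program that finds the most likely monotonic alignment between text tokens and speech frames. It is an illustration only and is not taken from the Glow-TTS code: the function name `monotonic_alignment_search`, the input `log_p` (a matrix of per-token, per-frame log-likelihoods), and the toy values are all assumptions. The sketch assumes every token must cover at least one frame and that the alignment never moves backwards or skips a token.

```python
import numpy as np


def monotonic_alignment_search(log_p):
    """Return, for each frame, the index of the token it is aligned to.

    log_p is a hypothetical (num_tokens, num_frames) array where
    log_p[i, j] scores frame j under token i. The returned path is
    non-decreasing, starts at token 0, ends at the last token, and
    advances by at most one token per frame.
    """
    num_tokens, num_frames = log_p.shape
    if num_tokens > num_frames:
        raise ValueError("need at least one frame per token")

    # q[i, j]: best total score of any valid alignment of frames 0..j
    # that ends with frame j assigned to token i.
    q = np.full((num_tokens, num_frames), -np.inf)
    q[0, 0] = log_p[0, 0]
    for j in range(1, num_frames):
        for i in range(min(j + 1, num_tokens)):
            stay = q[i, j - 1]  # previous frame used the same token
            advance = q[i - 1, j - 1] if i > 0 else -np.inf  # previous token
            q[i, j] = max(stay, advance) + log_p[i, j]

    # Backtrack from the last token at the last frame.
    path = np.zeros(num_frames, dtype=int)
    i = num_tokens - 1
    for j in range(num_frames - 1, -1, -1):
        path[j] = i
        if j > 0 and i > 0 and q[i - 1, j - 1] > q[i, j - 1]:
            i -= 1
    return path


# Toy example: 3 tokens, 5 frames, with one clearly preferred alignment.
toy_log_p = np.full((3, 5), -5.0)
toy_log_p[0, 0:2] = 0.0  # token 0 fits frames 0-1
toy_log_p[1, 2] = 0.0    # token 1 fits frame 2
toy_log_p[2, 3:5] = 0.0  # token 2 fits frames 3-4
toy_path = monotonic_alignment_search(toy_log_p)
print(toy_path.tolist())  # [0, 0, 1, 2, 2]
```

The double loop runs in time proportional to the number of tokens times the number of frames. Counting the number of frames assigned to each token in the returned path gives per-token durations.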