
vits

Enhanced End-to-End TTS Model with Variational and Adversarial Learning

Product Description

VITS is an end-to-end text-to-speech (TTS) method that improves on traditional two-stage systems by combining variational inference with adversarial learning. The combination strengthens the generative model and yields more natural-sounding speech. A stochastic duration predictor lets the model render the same text with varied rhythms and tones. In human evaluations on the LJ Speech dataset, listeners rated it above the systems it was compared with, and its mean opinion score (MOS) came close to that of real human speech. Access the interactive demo for audio examples or explore the available pretrained models.
Project Details