StableTTS

Multilingual Flow-Matching TTS Model Integrating Cutting-Edge Technologies

Product Description

StableTTS is a flow-matching TTS model built on DiT (Diffusion Transformer) blocks that generates speech in Chinese, English, and Japanese. The 31M-parameter model offers improved audio quality, supports classifier-free guidance (CFG) and the FireflyGAN vocoder, and includes an improved Chinese text frontend. The newly released version 1.1 adds U-Net-inspired skip connections and a cosine timestep scheduler, and covers all three languages with a single multilingual checkpoint. Training is designed to be user-friendly, with simplified data preparation and finetuning, so the model can be adapted to a range of audio generation applications.
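To make the sampling-related terms above more concrete, here is a minimal sketch of how a cosine timestep schedule and classifier-free guidance are commonly combined in an Euler sampler for a flow-matching model. It is not taken from the StableTTS code base: the function names, the `1 - cos(pi * t / 2)` warp, the guidance formula, and the toy velocity function are all assumptions chosen for illustration, and the project's actual implementation may differ.

```python
import math

def cosine_timesteps(n_steps):
    """Warp a uniform grid on [0, 1] with t -> 1 - cos(pi * t / 2).

    Assumed form of a cosine schedule: it places smaller steps near t = 0.
    """
    return [1.0 - math.cos(0.5 * math.pi * i / n_steps) for i in range(n_steps + 1)]

def cfg_velocity(v_cond, v_uncond, scale):
    """One common CFG form: move from the unconditional estimate
    toward the conditional one, scaled by `scale`."""
    return [u + scale * (c - u) for c, u in zip(v_cond, v_uncond)]

def euler_sample(velocity_fn, x0, n_steps, cfg_scale):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps.

    `velocity_fn(x, t, conditioned)` is a hypothetical stand-in for the
    network; it is called twice per step (with and without conditioning).
    """
    ts = cosine_timesteps(n_steps)
    x = list(x0)
    for t, t_next in zip(ts[:-1], ts[1:]):
        v_c = velocity_fn(x, t, True)
        v_u = velocity_fn(x, t, False)
        v = cfg_velocity(v_c, v_u, cfg_scale)
        dt = t_next - t
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def toy_velocity(x, t, conditioned):
    """Toy stand-in for a trained model: constant velocity field."""
    return [1.0 if conditioned else 0.0 for _ in x]

if __name__ == "__main__":
    print(cosine_timesteps(4))
    print(euler_sample(toy_velocity, [0.0, 2.0], n_steps=10, cfg_scale=1.0))
```

In a real model, `velocity_fn` would be the DiT-based network predicting a velocity for the mel-spectrogram, and the unconditional call would drop or mask the conditioning input; both details are outside what this description specifies.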
Project Details