Chinese-FastSpeech2

Integrating Prosody for Improved Chinese Speech Synthesis with FastSpeech2

Product Description

The project uses an improved FastSpeech2 model for Chinese speech synthesis, with a focus on expressive, rhythmic pronunciation. It adds prosody representation and prosody prediction to the base model. Recent updates add training code for the prosody model and data preprocessing for Biaobei data. The architecture combines FastSpeech2 and HifiGAN and uses a prosody vector, giving three models: fastspeech_model, hifigan_model, and prosody_model. Text-to-speech prediction is available both from the command line and through an API. Community input and feedback are welcome.
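
As a rough illustration of how the three models might fit together, the sketch below chains stand-in functions named after fastspeech_model, hifigan_model, and prosody_model. It is not the project's code: the function signatures, the `synthesize` helper, the assumption that the prosody vector is fed into the acoustic model, and the values `n_mels=80` and `hop_length=256` are all hypothetical, and every stub returns placeholder data rather than real model output.

```python
from typing import List

Vector = List[float]
Mel = List[List[float]]


def prosody_model(text: str) -> Vector:
    """Hypothetical stub: one placeholder prosody value per character."""
    return [0.0 for _ in text]


def fastspeech_model(text: str, prosody: Vector, n_mels: int = 80) -> Mel:
    """Hypothetical stub: one mel frame per character, built from the prosody value."""
    if len(prosody) != len(text):
        raise ValueError("prosody vector must align with the input text")
    return [[p] * n_mels for p in prosody]


def hifigan_model(mel: Mel, hop_length: int = 256) -> List[float]:
    """Hypothetical stub vocoder: hop_length silent samples per mel frame."""
    return [0.0] * (len(mel) * hop_length)


def synthesize(text: str) -> List[float]:
    """Assumed data flow: text -> prosody vector -> mel frames -> waveform."""
    prosody = prosody_model(text)
    mel = fastspeech_model(text, prosody)
    return hifigan_model(mel)


if __name__ == "__main__":
    waveform = synthesize("你好世界")
    print(f"generated {len(waveform)} samples")
```

In a real setup each stub would be replaced by a trained model loaded from a checkpoint; the sketch only shows the assumed order in which the three models are called.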
Project Details