#Text-to-speech

tts-generation-webui
This platform provides text-to-speech and audio generation with support for models such as Bark, Tortoise, and MusicGen. Recent updates improve extension management, adjust the UI, and upgrade underlying components. It is aimed at both developers and general users, and is designed to make integrating and updating audio generation models straightforward.
deepvoice3_pytorch
A PyTorch implementation of convolutional network-based models for text-to-speech synthesis, supporting both single-speaker and multi-speaker setups. The models use attention mechanisms, and the project works with datasets such as LJSpeech, JSUT, and VCTK. It also includes frontend text processing for English and Japanese. Audio samples, downloadable demos, a range of model presets, and detailed documentation are available to help users adapt the models to their own TTS needs.
mandarin-tts
This modular Mandarin TTS framework is intended for fast research and product development. It is configured via YAML and includes speaker and prosody embeddings, multiple vocoder options, and variance predictors. Pre-trained checkpoints are provided to speed up training and synthesis. Audio samples are available, and the project is open to contributions.
glados-tts
The glados-tts project provides a neural network TTS engine that can run locally or be accessed remotely. Its models are trained on datasets such as LJSpeech and an enhanced Ellen McLain dataset, and it supports multiple speakers. Installation involves downloading the model files and installing the Python dependencies. The engine can be used for simple local tests as well as more elaborate setups.