fish-speech
A text-to-speech system with zero-shot and few-shot voice synthesis across multiple languages, including English, Japanese, and Chinese. It reports a real-time factor of 1:5 on an Nvidia RTX 4060 and low character error rate (CER) and word error rate (WER). It includes a Gradio-based web UI and a PyQt6 desktop interface, deploys on Windows, Linux, and macOS, and advertises fish-tech acceleration.
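The metrics named above can be made concrete with a small sketch. This is not code from fish-speech: every function name here is hypothetical, and it assumes the "1:5" figure is read as processing time to generated audio duration (one second of compute for five seconds of speech). The error rates are computed the usual way, as edit distance divided by reference length, over words for WER and over characters for CER.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by the duration of the generated audio.

    Assumption: a ratio quoted as 1:5 corresponds to a value of 0.2 here.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i items of ref
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # 2 s of compute for 10 s of audio matches the assumed reading of 1:5.
    print(real_time_factor(2.0, 10.0))
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
    # One substituted character out of five.
    print(char_error_rate("hello", "hallo"))
```

In practice the hypothesis text would come from running a speech recognizer on the synthesized audio and comparing its transcript to the input text; that step is left out here because it depends on the recognizer used.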