Introduction to the ChatTTS-ui Project
ChatTTS-ui is a web interface and API for converting text into speech. It supports English and Chinese, including text that mixes letters and numbers.
Origins and Essentials
ChatTTS-ui originated from the ChatTTS project. Since version 0.96, source deployment requires ffmpeg to be installed first. Timbre files from earlier versions (csv and pt) are obsolete, so new timbre values must be generated.
Sponsorship and Functionality
The project is supported by 302.AI, a marketplace offering diverse AI solutions globally. The 302.AI platform is described as feature-rich and easy to navigate: users pay as they go without monthly fees, and management is separated from use, which makes it accessible to individuals and businesses alike.
Interface and Features
In the web interface, users enter text, which ChatTTS then synthesizes into speech.
The tool handles text that mixes characters, digits, and control symbols.
Deployment Options
Windows Pre-packaged Version
- Download the package from the Releases page.
- Extract the package and run app.exe to get started.
- Note: Security software may falsely flag it as a virus, in which case source deployment is recommended.
- Systems with an NVIDIA GPU (more than 4 GB of VRAM) and CUDA 11.8+ can use GPU acceleration.
Manual Model Download
By default, models are downloaded from huggingface.co or GitHub into the asset directory. If the network is unstable, users can download and extract the models manually, then place the pt files in the asset directory.
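After a manual download, it can help to confirm that the pt files actually landed in the asset directory. The following is a minimal sketch, not part of the project: the helper name find_model_files is hypothetical, and the path you pass in is an assumption about where your copy of the asset directory lives.

```python
from pathlib import Path


def find_model_files(asset_dir):
    """Return the sorted names of .pt files directly inside asset_dir.

    Hypothetical helper for illustration; it only lists files and does
    not validate that they are usable models.
    """
    root = Path(asset_dir)
    if not root.is_dir():
        # A missing directory simply means no model files were found.
        return []
    return sorted(p.name for p in root.glob("*.pt") if p.is_file())
```

Calling find_model_files("asset") from the project root returns an empty list when nothing has been placed there yet, which is a hint that the manual download step is still pending.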
Linux Container Deployment
ChatTTS-ui can be deployed in a Docker container on Linux, in either a CPU or a GPU configuration. Follow these steps:
- Clone the project repository: git clone https://github.com/jianchang512/ChatTTS-ui.git chat-tts-ui
- Start the container using Docker Compose for either GPU or CPU.
- Access the interface via IP:9966.
Updating requires pulling the latest code from the main branch and rebuilding the Docker image.
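Once the container is started, a quick way to see whether the interface is reachable is to probe the port. The sketch below uses only the Python standard library; the port 9966 comes from the text above, while the function names and the example host are hypothetical.

```python
import socket

DEFAULT_PORT = 9966  # port named in the deployment steps above


def interface_url(host, port=DEFAULT_PORT):
    """Build the address to open in a browser for the given host."""
    return f"http://{host}:{port}"


def is_port_open(host, port=DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, is_port_open("127.0.0.1") on the Docker host indicates whether something is listening on port 9966. It does not confirm that the listener is ChatTTS-ui itself.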
Source Code Deployment
The project can be deployed from source on Linux, macOS, and Windows:
- Linux: Requires Python 3.9-3.11, ffmpeg, and, for GPU acceleration, the corresponding CUDA or ROCm drivers.
- macOS: Setup is similar to Linux, with additional dependencies such as libsndfile.
- Windows: Involves setting up a virtual environment and installing the components needed for GPU acceleration.
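The prerequisites listed above can be checked before installing anything else. This is a minimal sketch under the assumption that the supported range is Python 3.9 through 3.11 inclusive, as stated in the list; the function names are hypothetical and the script is not part of the project.

```python
import shutil
import sys

MIN_VERSION = (3, 9)   # lowest supported version per the list above
MAX_VERSION = (3, 11)  # highest supported version per the list above


def python_version_supported(version=None):
    """Check whether a (major, minor, ...) version falls in the range.

    Defaults to the interpreter running this script.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


def ffmpeg_available():
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None
```

Running both checks inside the virtual environment you intend to use gives an early warning if the interpreter is too new or ffmpeg is missing. GPU driver checks are left out because they depend on the vendor tooling.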
Using ChatTTS-ui API
The API allows programmatic access to text-to-speech conversions with configurable parameters like voice, prompt, temperature, and more.
- Request Method: POST
- Endpoint: http://127.0.0.1:9966/tts
- Parameters include text, voice, prompt, and settings for customizing the output.
Successful API calls return JSON data containing paths to the generated audio files.
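The request described above might be issued from Python roughly as follows. This is a sketch under stated assumptions, not the project's official client: the endpoint and the parameter names text, voice, prompt, and temperature come from this section, but sending them as form-encoded fields is an assumption, the voice value is a placeholder, and the layout of the returned JSON is not specified here, so it is returned undecoded beyond JSON parsing.

```python
import json
import urllib.parse
import urllib.request

TTS_ENDPOINT = "http://127.0.0.1:9966/tts"  # endpoint given in the text


def build_tts_request(text, voice=None, prompt=None, **extra):
    """Build a POST request for the /tts endpoint.

    Assumption: parameters are sent as form-encoded fields. Extra
    keyword arguments (for example temperature) are passed through
    unchanged.
    """
    fields = {"text": text}
    if voice is not None:
        fields["voice"] = voice
    if prompt is not None:
        fields["prompt"] = prompt
    fields.update(extra)
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(TTS_ENDPOINT, data=body, method="POST")


def synthesize(text, timeout=120, **options):
    """Send the request and return the decoded JSON response as-is."""
    request = build_tts_request(text, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running locally, synthesize("Hello 123", voice="2222", temperature=0.3) would return the parsed JSON. Inspect that result to locate the audio file paths rather than relying on particular field names.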
Integration with pyVideoTrans
ChatTTS-ui also integrates with pyVideoTrans, starting from pyVideoTrans version 1.82: users select ChatTTS in the settings menu to convert subtitles into speech.
Conclusion
ChatTTS-ui is an open-source project that provides text-to-speech synthesis through a web interface and an API. Its range of deployment options and its integration potential make it a practical choice both for individuals who want to turn text into speech and for developers who want to integrate it with other tools.