Introduction to the FunAudioLLM-APP Project
Welcome to the FunAudioLLM-APP project! It combines audio comprehension and speech generation technologies in two applications: Voice Chat and Voice Translation.
Key Applications
Voice Chat: An interactive application for natural spoken conversation. It uses AI models to understand the user's speech and to generate spoken replies across a variety of scenarios.
Voice Translation: A real-time tool that translates spoken language, so that people who speak different languages can communicate with each other.
Related Resources
For those interested in the underlying technologies, here are links to specific repositories:
- CosyVoice: Explore the CosyVoice repo and its corresponding CosyVoice space.
- SenseVoice: Check out the SenseVoice repo and the related SenseVoice space.
Installation Guide
To get started with the FunAudioLLM-APP project, follow these steps:
Clone and Install
- Clone the repository and its submodules:
  git clone --recursive URL
- If network issues interrupt the submodule clone, change into the repository directory and rerun the submodule update until it succeeds:
  cd funaudiollm-app
  git submodule update --init --recursive
- Prepare the environments needed by the submodules, following the instructions in the CosyVoice and SenseVoice repositories. Alternatively, if you already have these environments set up, modify the resource path configuration in the app.py file (lines 15-18) accordingly.
- Finally, install the required packages:
  pip install -r requirements.txt
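The "rerun until successful" step can be automated. The following is a minimal sketch, not part of the project: the helper name run_until_success, the attempt limit, and the delay are all assumptions chosen for illustration. It reruns a command until it exits with status 0.

```python
import subprocess
import time


def run_until_success(cmd, cwd=None, max_attempts=5, delay_seconds=2.0):
    """Run cmd repeatedly until it exits with status 0.

    Returns the number of attempts used. Raises RuntimeError if every
    attempt fails. (Hypothetical helper, not part of FunAudioLLM-APP.)
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd, cwd=cwd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Give a flaky network a moment before the next try.
            time.sleep(delay_seconds)
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")


# Example call, assuming the repository was cloned into ./funaudiollm-app:
# run_until_success(
#     ["git", "submodule", "update", "--init", "--recursive"],
#     cwd="funaudiollm-app",
# )
```

A bounded attempt count is used here instead of an endless loop so that a permanent failure, such as a wrong submodule URL, surfaces as an error.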
Basic Usage
Preparation
- Obtain a Dashscope API token.
- Acquire the necessary pem file.
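Before launching either application, it may help to confirm that both prerequisites are in place. The sketch below is an illustration only: the function name and the pem file path are hypothetical, and the environment variable name DS_API_TOKEN is taken from the run commands in this guide.

```python
from pathlib import Path


def missing_prerequisites(env, pem_path):
    """Return a list of problems; an empty list means both checks passed.

    env is any mapping of environment variables (for example os.environ).
    pem_path is wherever you stored the pem file (hypothetical location).
    """
    problems = []
    if not env.get("DS_API_TOKEN"):
        problems.append("DS_API_TOKEN is not set")
    if not Path(pem_path).is_file():
        problems.append(f"pem file not found: {pem_path}")
    return problems


# Example call, with an assumed file name:
# import os
# print(missing_prerequisites(os.environ, "cert.pem"))
```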
Voice Chat
To run the Voice Chat application:
cd voice_chat
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt
Access the application via: https://YOUR-IP-ADDRESS:60001/
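The launch command above sets two environment variables and appends standard output to a log file. As a rough Python equivalent, the sketch below does the same with subprocess. The function names build_env and launch are hypothetical, and unlike the command above this sketch does not use sudo.

```python
import os
import subprocess
import sys


def build_env(token, gpu_ids="0", base_env=None):
    """Copy the environment and add the two variables the app expects."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    env["DS_API_TOKEN"] = token
    return env


def launch(app_dir, token, gpu_ids="0", log_name="log.txt"):
    """Start app.py inside app_dir, appending stdout to log_name (like `>>`)."""
    # The child process keeps its own handle to the log file, so the
    # parent can close its copy as soon as the process has started.
    with open(os.path.join(app_dir, log_name), "a") as log_file:
        return subprocess.Popen(
            [sys.executable, "app.py"],
            cwd=app_dir,
            env=build_env(token, gpu_ids),
            stdout=log_file,
        )


# Example call:
# process = launch("voice_chat", "YOUR-DS-API-TOKEN")
```

As with the shell redirection `>>`, only standard output goes to the log file; standard error still reaches the terminal.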
Voice Translation
To execute the Voice Translation application:
cd voice_translation
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt
Access it through: https://YOUR-IP-ADDRESS:60002/
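If a page does not load, a quick first check is whether anything is listening on the expected port (60001 for Voice Chat, 60002 for Voice Translation). The helper below is a generic TCP check written for illustration, not part of the project. It only confirms that the port accepts connections and does not validate the HTTPS setup.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Example calls, substituting your server's address:
# is_port_open("YOUR-IP-ADDRESS", 60001)
# is_port_open("YOUR-IP-ADDRESS", 60002)
```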
Enjoy exploring the dynamic capabilities of FunAudioLLM-APP, where technology meets communication in novel ways!