Overview of GPT Voice Conversation Chatbot (GPT-VCC)
The GPT Voice Conversation Chatbot, also known as GPT-VCC, is a dynamic tool designed to facilitate engaging and emotive conversations with ChatGPT or GPT-4. What sets this project apart is its ability to allow users to communicate using voice input via a microphone, offering an experience akin to speaking with a friendly and knowledgeable assistant. Additionally, users can interact using a terminal if they prefer typing over speaking.
The project employs a modified GPT chat preset and makes use of the ChatGPT API by default. This enables it to maintain a conversation flow, remembering the context of users’ earlier interactions during a session. Users also have the option to allow the bot to develop a memory over time, enhancing the personalization of interactions.
Installation
Before delving into the installation steps, it's crucial for users to obtain an OpenAI API key, available by creating an account at the OpenAI website.
For Windows Users:
- Python Installation: First, download and install Python from the official website.
- Downloading the Repository: Obtain the project repository either by cloning it using Git, downloading it as a ZIP file, or accessing it through the release sections.
- Setting Up: Extract the files and navigate into the folder containing them.
- Terminal Access: Open the terminal by right-clicking within the folder space and selecting 'Open in Terminal'. If Windows Terminal is not available, use PowerShell instead. Then, install the required dependencies using the command:
pip install -r requirements.txt --upgrade
- Proceed with Usage Steps: Once preparations are complete, follow the instructions in the 'Using GPT-VCC' section.
For Linux Users (Debian/Ubuntu-based):
- Install Pip3:
sudo apt install python3-pip
- Download and Extract Files: As with Windows, obtain and extract the files from the repository.
- Handling Pyaudio: Remove
pyaudio==0.2.13
from therequirements.txt
file. - Install Pyaudio and Espeak: Use apt to install Pyaudio and Espeak for voice functionality.
sudo apt install python3-pyaudio sudo apt install espeak
- Install Other Requirements: Use pip to install other dependencies.
pip3 install -r requirements.txt --upgrade
Usage Instructions
To begin a conversation using GPT-VCC, navigate to the bot’s folder and execute the following command, replacing <key>
with your API key:
python main.py <key>
Alternatively, input the key in the keys.txt
file for ease of access. Upon starting the script, the bot will load the key from this file if it is present.
When the application runs, a GUI powered by Pygame appears. The GUI’s color indicates the bot’s state: red means it’s not listening, yellow indicates it’s loading, and green signifies the bot is ready and listening. Speak when the green light is on, and your words will be converted to text and processed by GPT.
To utilize the bot via terminal instead, execute:
python gptcli.py <key>
Special Features and Voice Commands
The bot incorporates several customizable features and voice commands:
- Voice Commands and Customizations: Users can adjust parameters like tokens, creativity, and use of specific TTS (Text-to-Speech) options.
- Personalization: Set personal or task-related presets, and save memory of interactions.
- Enhanced TTS Options: Opt between Google's TTS, ElevenLabs' more lifelike TTS, or a simple robotic voice.
Example Use Cases
GPT-VCC serves multiple practical purposes:
- It can act as a language tutor, aiding in pronunciation and grammar.
- Provide programming assistance and feedback.
- Extend a user-friendly platform for completing tasks like drafting cover letters.
The flexibility and customization offered make it not only a tool for casual conversation but also a powerful assistant for educational and professional contexts.