WhisperClip: Simplified Audio Transcription
Introduction
WhisperClip is a user-friendly tool designed to effortlessly transform spoken words into written text. By harnessing the capabilities of OpenAI's Whisper, this application simplifies audio transcription with just a single click. Once transcribed, the text is conveniently saved to your clipboard, so it's ready to be pasted wherever you need it. This makes WhisperClip an accessible solution for anyone wanting to convert audio to text quickly and easily.
Features
WhisperClip boasts a variety of features to enhance user experience:
- One-Click Recording: Start transcribing with a simple click.
- Automatic Transcription: Uses OpenAI Whisper for seamless transcription, free of charge.
- Clipboard Integration: Transcriptions can be instantly saved to your clipboard for easy access.
Installation Steps
Prerequisites
Before installing WhisperClip, ensure your system meets the following requirements:
- Python version 3.8 or higher.
- CUDA is optional but highly recommended for better performance. Without it, WhisperClip runs on the CPU.
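The prerequisites above can be checked with a short script before installing. This is a minimal sketch, not part of WhisperClip: the function names python_ok and cuda_available are hypothetical, and the CUDA check assumes PyTorch is the library that will use the GPU. It reports None if PyTorch is not installed yet.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= minimum


def cuda_available():
    """Return True/False from PyTorch, or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check also works without PyTorch

    return torch.cuda.is_available()


if __name__ == "__main__":
    print("Python >= 3.8:", python_ok())
    print("CUDA available:", cuda_available())
```

A result of False or None for CUDA does not block installation; it only means transcription will run on the CPU.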
Setting Up the Environment
- Begin by cloning the repository from GitHub and entering its directory:
git clone https://github.com/gustavostz/whisper-clip.git
cd whisper-clip
- Next, install PyTorch if it’s not already installed. Instructions for this can be found on PyTorch's website.
- Install the other necessary dependencies with:
pip install -r requirements.txt
Choosing the Right Model
Pick the Whisper model that fits your system’s GPU VRAM: smaller models need less memory and run faster. The available models are:
Size | Required VRAM | Relative speed |
---|---|---|
tiny | ~1 GB | ~32x |
base | ~1 GB | ~16x |
small | ~2 GB | ~6x |
medium | ~5 GB | ~2x |
large | ~10 GB | 1x |
For applications involving only English, the .en models (such as tiny.en or base.en) generally yield better performance. You can change the model by editing the model_name variable in the config.json file.
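The table above can be turned into a small helper that suggests a model for a given amount of VRAM. This is only a sketch under stated assumptions: pick_model is a hypothetical name, the VRAM figures are the approximate values from the table, and "largest model that fits" is one possible heuristic (a smaller model may be preferable when speed matters more). It also relies on the fact that Whisper ships .en variants for every size except large.

```python
# Approximate VRAM needs in GB, copied from the table above (smallest first).
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]

# Whisper publishes English-only (.en) variants for every size except large.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def pick_model(vram_gb, english_only=False):
    """Return the largest model whose approximate VRAM need fits vram_gb."""
    fitting = [name for name, need in MODEL_VRAM_GB if need <= vram_gb]
    # Fall back to the smallest model, e.g. on CPU-only machines.
    name = fitting[-1] if fitting else MODEL_VRAM_GB[0][0]
    if english_only and name in ENGLISH_ONLY_SIZES:
        name += ".en"
    return name


print(pick_model(6))                     # medium
print(pick_model(2, english_only=True))  # small.en
```

The returned string is the value to put in model_name.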
Usage
To use the application:
- Launch the application with the following command:
python main.py
- Click the microphone button to start and stop recording.
- If the "Save to Clipboard" option is enabled, the transcription will be automatically copied to your clipboard upon completion.
Configuration
- The default shortcut to toggle recording is Alt+Shift+R, but this can be changed in the config.json file.
- You can also switch the Whisper model used for transcription in the same config.json file.
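Switching the model can also be scripted instead of editing the file by hand. The sketch below assumes config.json is a flat JSON object with model_name as a top-level key; set_model_name is a hypothetical helper, not something WhisperClip provides. The name of the shortcut setting is not given here, so the helper leaves every other key untouched.

```python
import json
from pathlib import Path


def set_model_name(config_path, model_name):
    """Set model_name in a JSON config file, leaving other keys untouched."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["model_name"] = model_name
    # Write the whole object back with readable indentation.
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

For example, calling set_model_name("config.json", "base.en") from the repository root would select the English-only base model the next time the application starts.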
Feedback and Suggestions
If you would like to see a more user-friendly, executable version of WhisperClip, please say so. Suggestions and feedback are welcome through the GitHub issues page.
Acknowledgments
WhisperClip leverages the power of OpenAI's Whisper for superior audio transcription capabilities.