whisper-clip - Simplify Audio to Text Conversion with Direct Clipboard Access

WhisperClip: Simplified Audio Transcription

Introduction

WhisperClip is a simple tool for turning speech into text. It uses OpenAI's Whisper to transcribe a recording that you start and stop with a single click, and it can copy the finished transcription to your clipboard, ready to paste wherever you need it.

Features

WhisperClip's main features:

One-Click Recording: Start transcribing with a simple click.
Automatic Transcription: Uses OpenAI's Whisper model, which runs locally on your machine, so transcription is free of charge.
Clipboard Integration: Transcriptions can be instantly saved to your clipboard for easy access.

Installation Steps

Prerequisites

Before installing WhisperClip, ensure your system meets the following requirements:

Python version 3.8 or higher.
CUDA (optional): strongly recommended for faster transcription, although WhisperClip also runs on a CPU.
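
As a quick sanity check, the two prerequisites above can be verified from Python. This is a minimal sketch and not part of WhisperClip itself: the function name check_environment is hypothetical, and the CUDA check assumes PyTorch is the library that will use the GPU, so it reports None when PyTorch is not installed yet.

```python
import sys


def check_environment(min_version=(3, 8)):
    """Report whether Python is new enough and whether CUDA is usable.

    Hypothetical helper for illustration; WhisperClip does not ship it.
    """
    python_ok = sys.version_info >= min_version
    try:
        import torch  # installed in a later step, so it may be missing here
        cuda_available = torch.cuda.is_available()
    except ImportError:
        cuda_available = None  # unknown until PyTorch is installed
    return {"python_ok": python_ok, "cuda_available": cuda_available}
```

Calling check_environment() returns a small dictionary. A cuda_available value of False means transcription will fall back to the CPU, which works but is slower.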

Setting Up the Environment

Begin by cloning the repository from GitHub:

```
git clone https://github.com/gustavostz/whisper-clip.git
cd whisper-clip
```

Next, install PyTorch if it’s not already installed. Instructions for this can be found on PyTorch's website.
Install other necessary dependencies with:
```
pip install -r requirements.txt
```

Choosing the Right Model

Pick a Whisper model that fits within your GPU's available VRAM. Larger models need more memory and run more slowly. The available models are:

| Size | Required VRAM | Relative speed |
|--------|---------------|----------------|
| tiny | ~1 GB | ~32x |
| base | ~1 GB | ~16x |
| small | ~2 GB | ~6x |
| medium | ~5 GB | ~2x |
| large | ~10 GB | 1x |

For English-only use, the .en variants, such as tiny.en or base.en, generally perform better. To change the model, edit the model_name variable in the config.json file.
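
The model table and the .en guidance above can be turned into a small selection helper. The sketch below is illustrative only: pick_model and its parameters are hypothetical names, the VRAM figures are copied from the table, and falling back to tiny when nothing fits is an assumption. It also relies on the fact that Whisper publishes no English-only variant of the large model.

```python
# (name, approximate VRAM in GB), ordered smallest to largest, from the model table.
MODELS = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def pick_model(vram_gb, english_only=False):
    """Return the largest model whose VRAM requirement fits the given budget.

    Hypothetical helper: falls back to "tiny" when no model fits (assumption).
    """
    fitting = [name for name, required in MODELS if required <= vram_gb]
    name = fitting[-1] if fitting else "tiny"
    # Whisper has no "large.en", so only the smaller models get the suffix.
    if english_only and name != "large":
        name += ".en"
    return name
```

For example, pick_model(6) returns "medium" and pick_model(2, english_only=True) returns "small.en". The returned string is the value to put in model_name.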

Usage

To use the application:

Launch the application with the following command:
```
python main.py
```
Click the microphone button to start and stop recording.
If the "Save to Clipboard" option is enabled, the transcription will be automatically copied to your clipboard upon completion.
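
The transcribe-then-copy step described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not WhisperClip's actual code: it assumes the openai-whisper package for transcription and the pyperclip package for clipboard access (this README names neither package), it takes an existing audio file instead of recording from the microphone, and the function and parameter names are hypothetical.

```python
def transcribe_to_clipboard(audio_path, model_name="base", save_to_clipboard=True,
                            transcribe=None, copy=None):
    """Transcribe an audio file and optionally copy the text to the clipboard.

    transcribe and copy can be injected, so the flow can be tried without
    Whisper or a clipboard backend installed.
    """
    if transcribe is None:
        import whisper  # openai-whisper package (assumed dependency)
        model = whisper.load_model(model_name)

        def transcribe(path):
            # Whisper returns a dict; the full transcript is under "text".
            return model.transcribe(path)["text"]

    text = transcribe(audio_path).strip()
    if save_to_clipboard:
        if copy is None:
            import pyperclip  # assumed clipboard library
            copy = pyperclip.copy
        copy(text)
    return text
```

With both packages installed, transcribe_to_clipboard("note.wav") would load the base model, transcribe the file, and place the text on the clipboard. Passing save_to_clipboard=False mirrors leaving the "Save to Clipboard" option off.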

Configuration

The default shortcut to toggle recording is Alt+Shift+R, but this can be altered in the config.json file.
You can also switch the Whisper model used for transcription in the same config.json file.
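
To make these two settings concrete, here is a hedged sketch of merging the contents of config.json with fallback values. Only model_name is named in this README. The shortcut key name, the lowercase alt+shift+r spelling, and the base default are assumptions for illustration, so check the config.json shipped with the repository for the real keys.

```python
import json

# "model_name" comes from the README; "shortcut" and both default values are assumed.
DEFAULTS = {"model_name": "base", "shortcut": "alt+shift+r"}


def merge_config(raw_json):
    """Parse the text of a config file and fill in any missing settings."""
    user_settings = json.loads(raw_json)
    return {**DEFAULTS, **user_settings}
```

For example, merge_config('{"model_name": "small.en"}') keeps the assumed default shortcut and switches the model to small.en. In practice the argument would be the text read from config.json.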

Feedback and Suggestions

If you would like to see a more user-friendly, standalone executable version of WhisperClip, let the author know. Suggestions and feedback are welcome on the GitHub issues page.

Acknowledgments

WhisperClip is built on OpenAI's Whisper, which provides its audio transcription.