Decipher: AI-Driven Transcription Subtitles
Decipher uses artificial intelligence to automatically generate transcription subtitles for videos. This removes the tedious work of manual transcription and makes video content accessible to a broader audience. Decipher is built on Whisper, a speech recognition model developed by OpenAI.
What is Whisper?
Whisper is a speech recognition system created by OpenAI. It was trained on 680,000 hours of multilingual and multitask supervised data collected from the internet. The size and diversity of this dataset make Whisper more robust to accents, background noise, and technical jargon.
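To give a sense of what Whisper produces, here is a minimal sketch that calls the openai-whisper Python package directly. It is an illustration only: whether Decipher calls Whisper through this package or another backend is not stated here, and the function name transcribe_with_whisper is hypothetical. It assumes openai-whisper is installed (pip install openai-whisper) and that ffmpeg is available.

```python
def transcribe_with_whisper(media_path, model_name="small"):
    """Return Whisper's timed segments for an audio or video file.

    Hypothetical helper for illustration; not part of Decipher.
    """
    # Imported here so the module loads even when openai-whisper is absent.
    import whisper

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(media_path)   # dict with "text", "segments", "language"
    # Each segment carries "start" and "end" times in seconds plus its "text".
    return result["segments"]
```

Calling transcribe_with_whisper("video.mp4") would return a list of segments, each with the start time, end time, and recognized text needed to build a subtitle cue.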
Getting Started with Decipher
There are two ways to run Decipher:
Using Google Colab
Google Colab is a cloud-based platform for running machine learning and data science projects without a powerful local GPU. With a free Google account, users can run notebooks on GPUs such as the Tesla K80, T4, P4, or P100 for up to 12 hours per session, subject to availability. For longer sessions, Colab Pro and Pro+ are available. To use Decipher on Google Colab, follow the instructions in the linked notebook.
Manual Installation
For those who prefer a manual setup, certain dependencies and tools are required:
- Python: Ensure Python is installed on your system.
- ffmpeg: Required for reading and writing video files.
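Before installing, it can help to confirm both prerequisites are in place. The sketch below is one way to check; the function name missing_prerequisites and the minimum Python version of 3.8 are assumptions made for illustration, not requirements stated by the project.

```python
import shutil
import sys


def missing_prerequisites(min_python=(3, 8)):
    """Return a list of problems; an empty list means both checks passed.

    min_python is an assumed threshold, not a documented requirement.
    """
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is required")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

If the script prints nothing, both Python and ffmpeg were found.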
Install Decipher with pip, either directly from GitHub or from a local clone of the repository (both require Git):
pip install git+https://github.com/dsymbol/decipher
or
git clone https://github.com/dsymbol/decipher
cd decipher && pip install .
Important Note: Do not use pip install decipher, as it refers to a different package.
Usage Options
GUI (Gradio) Interface
To use the graphical user interface powered by Gradio:
decipher gui
# or
python -m decipher gui
Command-line Interface
Decipher provides two main command-line operations:
- Transcription: Converts video audio into a SubRip Subtitle (SRT) file and can directly apply these subtitles to the video.
- Subtitling: Applies an existing SRT file to a video without transcribing, so previously generated subtitles can be reviewed before they are applied.
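An SRT file is plain text: a sequence of numbered cues, each with a start and end timestamp in the form HH:MM:SS,mmm and one or more lines of text, separated by blank lines. The sketch below shows how timed segments can be turned into that format. It is an illustration of the file format, not Decipher's own code; the segment shape (a dict with "start", "end", and "text") is an assumption modeled on Whisper's output.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts with "start", "end" (seconds) and "text"."""
    cues = []
    for index, seg in enumerate(segments, start=1):  # SRT cues are numbered from 1
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    print(segments_to_srt([
        {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
        {"start": 2.5, "end": 5.0, "text": " Let's get started."},
    ]))
```

Because the format is this simple, a generated SRT file can be opened in any text editor and corrected by hand before it is applied to the video.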
To list the available commands and options, run:
decipher --help
or, if this doesn’t work:
python -m decipher --help
Example Commands
- To generate SRT subtitles for a video:
decipher transcribe -i video.mp4 --model small
- To burn generated subtitles into a video file:
decipher subtitle -i video.mp4 --subtitle_file video.srt --subtitle_action burn
- To generate subtitles and burn them in immediately, without reviewing them first:
decipher transcribe -i video.mp4 --model small --subtitle_action burn
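The commands above handle one video at a time. To process a whole folder, they can be wrapped in a short script. The following is a sketch under a few assumptions: it uses only the transcribe flags shown above (-i and --model), it invokes Decipher as python -m decipher, and the helper names build_transcribe_command and transcribe_folder are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_transcribe_command(video, model="small"):
    """Return the argument list for one `decipher transcribe` call."""
    return [sys.executable, "-m", "decipher", "transcribe",
            "-i", str(video), "--model", model]


def transcribe_folder(folder, model="small"):
    """Run the transcribe command on every .mp4 file in a folder."""
    for video in sorted(Path(folder).glob("*.mp4")):
        # check=True raises CalledProcessError if a run fails.
        subprocess.run(build_transcribe_command(video, model), check=True)


if __name__ == "__main__":
    # Usage: python batch_transcribe.py path/to/videos
    transcribe_folder(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.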
Decipher simplifies the creation and integration of subtitles, empowering creators to make their content accessible to audiences worldwide.