decipher - Use AI for effortless video transcription and automatic subtitling

Decipher: AI-Driven Transcription Subtitles

Decipher uses AI to transcribe the audio track of a video and generate subtitles for it automatically. This removes the need for manual transcription and makes video content accessible to a wider audience. Transcription is handled by Whisper, a speech recognition model developed by OpenAI.

What is Whisper?

Whisper is a speech recognition system created by OpenAI. It was trained on 680,000 hours of multilingual and multitask supervised data collected from the internet. The size and diversity of this dataset make Whisper more robust to accents, background noise, and technical jargon.

Getting Started with Decipher

There are two ways to run Decipher:

Using Google Colab

Google Colab is a cloud-based notebook platform that lets users run machine learning and data science projects without owning a powerful GPU. With a free Google account, users can run sessions of up to 12 hours on high-performance GPUs such as the Tesla K80, T4, P4, or P100. Colab Pro and Pro+ are available for extended usage. To run Decipher on Google Colab, follow the instructions in the linked notebook.

Manual Installation

For those who prefer a manual setup, certain dependencies and tools are required:

Python: Ensure Python is installed on your system.
ffmpeg: This tool is critical for handling video files.
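
Both prerequisites can be checked before installing. The following is a minimal sketch and not part of Decipher: the helper name missing_tools is hypothetical, and it only tests whether an executable named ffmpeg can be found on PATH, using Python's standard library.

```python
import shutil
import sys


def missing_tools(tools=("ffmpeg",)):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Running this script at all shows that Python is installed.
    print("Python", sys.version.split()[0])
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH")
```

If ffmpeg is reported as not found, install it with your system's package manager before continuing.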

Install Decipher directly from GitHub with pip, or clone the repository with Git and install it from the local copy:

pip install git+https://github.com/dsymbol/decipher
# or
git clone https://github.com/dsymbol/decipher
cd decipher && pip install .

Important Note: Do not run pip install decipher, as that name refers to a different package.

Usage Options

GUI (Gradio) Interface

To launch the graphical user interface, which is built with Gradio:

decipher gui
# or
python -m decipher gui

Command-line Interface

Decipher provides two main command-line operations:

Transcription: Transcribes a video's audio into a SubRip Subtitle (SRT) file and can optionally apply the subtitles to the video in the same run.
Subtitling: Applies an existing SRT file to a video without transcribing it again, so previously generated subtitles can be checked before they are applied.
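
For reference, an SRT file is plain text: each entry has a sequence number, a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm, and the subtitle text, with a blank line between entries. The sketch below illustrates that layout only. It is not Decipher's implementation, and the function names and the (start, end, text) segment tuples are assumptions made for the example.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    entries = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        entries.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # A blank line separates consecutive entries.
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical segments, not real transcription output.
    print(to_srt([(0.0, 2.5, "Hello"), (2.5, 5.0, "World")]))
```

Because the format is plain text, a generated SRT file can be opened and corrected in any text editor before it is applied to the video.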

To list the available commands and options, run:

decipher --help

or, if this doesn’t work:

python -m decipher --help

Example Commands

To generate SRT subtitles for a video:

decipher transcribe -i video.mp4 --model small

To burn generated subtitles into a video file:

decipher subtitle -i video.mp4 --subtitle_file video.srt --subtitle_action burn

To generate subtitles and burn them into the video in one step, without reviewing the SRT file first:

decipher transcribe -i video.mp4 --model small --subtitle_action burn
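
The commands above can also be driven from a script, for example to transcribe every video in a folder. This is a sketch under assumptions: the helper names and the *.mp4 pattern are hypothetical, and only the subcommands and flags shown above are used. It invokes Decipher as python -m decipher, so Decipher must be installed in the same Python environment.

```python
import subprocess
import sys
from pathlib import Path


def transcribe_command(video, model="small", burn=False):
    """Build the argument list for `python -m decipher transcribe`."""
    cmd = [sys.executable, "-m", "decipher", "transcribe",
           "-i", str(video), "--model", model]
    if burn:
        # Same flag as in the one-step example above.
        cmd += ["--subtitle_action", "burn"]
    return cmd


def subtitle_command(video, subtitle_file):
    """Build the argument list for burning an existing SRT file into a video."""
    return [sys.executable, "-m", "decipher", "subtitle",
            "-i", str(video), "--subtitle_file", str(subtitle_file),
            "--subtitle_action", "burn"]


def transcribe_folder(folder, model="small", burn=False):
    """Run the transcribe command on every .mp4 file in `folder`."""
    processed = []
    for video in sorted(Path(folder).glob("*.mp4")):
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(transcribe_command(video, model, burn), check=True)
        processed.append(video)
    return processed
```

Calling transcribe_folder("videos") would run the transcribe command once per file and stop at the first failure. Leaving burn=False keeps the two-step workflow, so each SRT file can be reviewed before subtitle_command is used to burn it in.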

Decipher simplifies the creation and integration of subtitles, empowering creators to make their content accessible to audiences worldwide.