Introduction to Video-Subtitle-Extractor
Video-Subtitle-Extractor (VSE) is a user-friendly tool that extracts hard-coded (burned-in) subtitles from videos and saves them as separate subtitle files in the .srt format. It is particularly useful for anyone who needs to recover subtitles from a video without transcribing them by hand. The sections below cover the project's key features and how to use it.
Key Features
- Extracting Key Frames: VSE efficiently identifies and extracts key frames from a video, which are essential for subtitle identification.
- Text Detection: It locates the position of text within these video frames, focusing primarily on subtitle regions.
- OCR Text Recognition: The software uses Optical Character Recognition (OCR) to accurately recognize and convert the detected text into editable text.
- Text Filtering: It filters out non-subtitle text to ensure that only relevant subtitle text is processed and extracted.
- Watermark and Logo Removal: Through integration with the video-subtitle-remover tool, VSE can remove visual elements like watermarks, logos, and even the hardcoded subtitles from original videos.
- Subtitle De-duplication: The software removes duplicate subtitle lines to produce clean .srt files. Users who also want plain-text output can enable it by changing the configuration in backend/config.py.
- Batch Processing: VSE can process multiple videos in a single run, saving time when dealing with large numbers of videos.
- Multilingual Support: The tool can handle subtitles in 87 languages, including Simplified and Traditional Chinese, English, Japanese, Korean, and many more.
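To make the de-duplication and .srt steps concrete, here is a minimal sketch in Python. It is not VSE's actual implementation: the `Cue` class, the function names, the 0.8 similarity threshold, and the fixed sampling interval are all assumptions chosen for illustration. It takes per-frame OCR results, merges consecutive frames that show nearly the same text, and formats the result as SRT.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two OCR results as the same subtitle if they are nearly identical."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def merge_frames(frames: list[tuple[float, str]], frame_gap: float) -> list[Cue]:
    """Collapse consecutive frames with (nearly) the same text into one cue.

    `frames` is a time-ordered list of (timestamp_seconds, recognized_text).
    `frame_gap` is the sampling interval between frames (hypothetical parameter).
    """
    cues: list[Cue] = []
    for ts, text in frames:
        text = text.strip()
        if not text:
            continue  # no subtitle text recognized in this frame
        last = cues[-1] if cues else None
        if last and similar(last.text, text) and ts - last.end <= frame_gap * 1.5:
            last.end = ts + frame_gap  # same subtitle still on screen: extend it
        else:
            cues.append(Cue(ts, ts + frame_gap, text))
    return cues


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = [
        f"{i}\n{srt_time(c.start)} --> {srt_time(c.end)}\n{c.text}\n"
        for i, c in enumerate(cues, start=1)
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    frames = [(0.0, "Hello"), (0.5, "Hello"), (1.0, ""), (1.5, "World")]
    print(to_srt(merge_frames(frames, frame_gap=0.5)))
```

A fuzzy comparison is used instead of strict equality because OCR output for the same subtitle can differ slightly from frame to frame; the threshold would need tuning on real data.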
Operation Modes
VSE offers three operational modes tailored to different user needs:
- Quick Mode: Uses lightweight models for fast extraction. Some subtitles might be missed or contain minor errors, but processing is fast.
- Auto Mode: Automatically selects the model based on the hardware. It uses lightweight models on CPUs for speed and precise models on GPUs for accuracy, though at a slower pace.
- Accurate Mode: Uses precise models intended to avoid missed subtitles and keep recognition errors to a minimum, but it requires substantial processing time and is not recommended for most users unless necessary.
Prefer Quick or Auto mode, and fall back to Accurate mode only if the first two modes lose a significant number of subtitles.
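The mode descriptions above amount to a small decision rule, sketched below in Python. The `Mode` enum, the `pick_model` function, and the "lightweight"/"precise" labels are hypothetical names used only to illustrate the logic; they are not taken from VSE's code or configuration.

```python
from enum import Enum


class Mode(Enum):
    QUICK = "quick"
    AUTO = "auto"
    ACCURATE = "accurate"


def pick_model(mode: Mode, has_gpu: bool) -> str:
    """Return which kind of model a mode would use, per the descriptions above."""
    if mode is Mode.QUICK:
        return "lightweight"  # always favors speed
    if mode is Mode.ACCURATE:
        return "precise"      # always favors accuracy
    # AUTO: decide from the available hardware
    return "precise" if has_gpu else "lightweight"


if __name__ == "__main__":
    for mode in Mode:
        for has_gpu in (False, True):
            print(f"{mode.value:8} gpu={has_gpu!s:5} -> {pick_model(mode, has_gpu)}")
```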
Unique Characteristics
- Local OCR Processing: VSE performs OCR processing locally, eliminating the need for external API calls and subscriptions to third-party OCR services such as those from Baidu or Alibaba.
- GPU Acceleration: For users equipped with Nvidia GPUs, VSE offers accelerated subtitle extraction with higher accuracy and speed.
User Guide
- Discussion Groups: If users encounter any issues, they can join the discussion groups on QQ: 210150985, 816881808.
- Usage Tips:
  - To start, click "Open" to select the video file, adjust the subtitle area as needed, and click "Run".
  - For batch extraction, ensure all videos selected have consistent resolution and subtitle regions.
- Watermark and Text Customization: Edit the backend/configs/typoMap.json file to remove or replace specific text, such as watermark text, in the extracted subtitles. Also make sure video paths contain no non-English characters or spaces, which can cause errors.
- Recommended Configurations:
  - Use the Windows portable ("green") version for a faster startup.
  - For GPU users, the Windows GPU version significantly increases extraction speed.
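The two tips about typoMap.json and video paths can be illustrated with a short Python sketch. The exact schema of typoMap.json is not described here, so the example assumes a flat JSON object that maps text to find onto its replacement, with an empty string meaning "remove"; `apply_typo_map` and `path_is_safe` are hypothetical helper names, not part of VSE.

```python
import json


def apply_typo_map(text: str, typo_map: dict[str, str]) -> str:
    """Replace every key found in `text` with its value (empty value = remove)."""
    for wrong, right in typo_map.items():
        text = text.replace(wrong, right)
    return text


def path_is_safe(path: str) -> bool:
    """True if the path is plain ASCII and contains no spaces."""
    return path.isascii() and " " not in path


# Assumed structure for illustration; the real typoMap.json may differ.
example_json = """
{
  "l'm": "I'm",
  "ExampleTV": ""
}
"""

if __name__ == "__main__":
    typo_map = json.loads(example_json)
    print(apply_typo_map("l'm here ExampleTV", typo_map).strip())  # prints: I'm here
    print(path_is_safe("C:/videos/clip01.mp4"))   # True
    print(path_is_safe("C:/my videos/clip.mp4"))  # False (contains a space)
```

Checking paths before starting a batch is a cheap way to catch the most common cause of failures mentioned above.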
How to Get Started
- Downloads:
  - Windows CPU version: vse_windows_cpu_v2.0.0.zip (Password: vse2)
  - Windows GPU version: vse_windows_gpu_v2.0.0.7z
  - macOS CPU version: vse_macOS_CPU.dmg
Community and Support
- Development Tools: The project is developed with the support of JetBrains tools.
- Improvement Suggestions: Users can contribute by proposing improvements and discussing issues on the project's GitHub page.
In summary, Video-Subtitle-Extractor simplifies the process of extracting subtitles from videos. Whether for personal enjoyment, language studies, or media projects, VSE provides a robust solution to meet various user needs. Its multilingual support and user-friendly interface make it a versatile tool suitable for diverse applications.