Whisper Project Overview
Whisper is an open-source library, created by AZKADEV, for transcribing audio into text. It builds on Whisper.cpp, an open-source port of OpenAI's Whisper speech recognition model, and aims to simplify turning various audio or video formats into written text without manual conversion.
Important Notice
The Whisper library was developed independently with original code and does not plagiarize existing work. Users who want new features are encouraged to request them rather than modifying the existing code and publishing it elsewhere without crediting the original creator, AZKADEV.
Current Status and Features
Whisper is still in development; the current version is 1.6.2. Its features are broken down below:
Cross-Platform Compatibility
- Fully supported on Linux and Android.
- Partially supported on CLI (Command Line Interface).
- Planned support for Windows, macOS, and other platforms such as the web and iOS.
Realtime Transcription
- Planned for all platforms including Android, Linux, macOS, CLI, Windows, web, and iOS.
Transcribing Capabilities
- Intended to transcribe any type of audio or video file without first converting it to the WAV format manually. This feature is still under development.
Testing and Device Compatibility
The Whisper library has undergone testing on various devices, including:
- REALME 5 with Lineage OS Android.
- MSI Modern 14 with Ubuntu 24.04.
- XIAOMI Redmi 4a with MIUI.
- An Acer device with Ubuntu Server 24.04.
Support and Development Needs
The development team encourages donations to support the enhancement of the Whisper project, especially for expanding support across additional platforms like Windows and macOS. Currently, development resources are limited to Linux and Android platforms.
Motivation for Redevelopment
The decision to rewrite Whisper was driven by the need to improve the original version, which was considered cluttered and inefficient. The creator aimed to ensure cross-platform functionality while incorporating the latest features of Whisper.cpp, which serves as the primary reference point.
Technical Functionality
Whisper relies on three foundational frameworks that ensure its operability:
- Compatibility with DART: By starting from a command-line model, the library has been made flexible through JSON integration, simplifying its application across programs.
- Integration with FLUTTER: Given FLUTTER's use of the DART programming language, compatibility extends naturally to DART implementations.
- Web and WebAssembly (WASM) Integration: With adaptations to the basic code structure (SCHEMA), Whisper can potentially run on web platforms, although some features might be limited.
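To illustrate the JSON-based approach mentioned above, here is a minimal DART sketch of how a program might build a JSON request and read a JSON reply. It is not the library's actual API: the function names (buildRequest, parseTranscript) and the field names ("action", "audio", "language", "text") are assumptions chosen only for illustration.

```dart
import 'dart:convert';

// Hypothetical request builder. The field names are illustrative
// assumptions, not the library's documented schema.
String buildRequest(String audioPath, {String language = 'auto'}) {
  final request = <String, Object>{
    'action': 'transcribe',
    'audio': audioPath,
    'language': language,
  };
  return jsonEncode(request);
}

// Hypothetical response parser. Expects a JSON object with a "text"
// field and returns an empty string when that field is missing.
String parseTranscript(String responseJson) {
  final decoded = jsonDecode(responseJson);
  if (decoded is Map<String, dynamic> && decoded['text'] is String) {
    return decoded['text'] as String;
  }
  return '';
}

void main() {
  // Build a request as a plain JSON string that any program could send.
  print(buildRequest('sample.wav', language: 'en'));

  // A stand-in for whatever reply the transcription engine would return.
  const fakeResponse = '{"text": "hello world"}';
  print(parseTranscript(fakeResponse));
}
```

Because both sides exchange plain JSON strings, the same request could in principle come from a CLI, a FLUTTER app, or a bot without changing the calling convention.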
Invitation to Support
Community support is crucial for Whisper's evolution, notably through:
- Donations and Sponsorships: These contributions expedite the release of new capabilities and support on different platforms.
- Following Social Media: This benefits the creator financially and helps share insightful content related to development.
Installation and Getting Started
The installation and getting-started guides are still being written. Whisper is fully integrated with FLUTTER, so adapting it across platforms should only require adjusting import statements.
Credits
Whisper builds on the original Whisper.cpp and GGML projects and gratefully acknowledges their role in inspiring this project.
Frequently Asked Questions
- Full Dart Support: With ample funding, full support for DART without native libraries is envisaged, aiming for high efficiency.
- Commercial Use Licensing: No license fees are required, though crediting the creator is recommended.
- Bot Integration: Whisper can potentially be integrated into bots like TELEGRAM, DISCORD, or WA, with CLI testing primarily conducted on Linux platforms.
Licensing
Whisper is licensed under the Apache License, Version 2.0, encouraging use and modification, provided that appropriate credits are maintained. For detailed terms, refer to the license agreement.