Introduction to edge-tts
The edge-tts project is a Python module that uses Microsoft Edge's online text-to-speech service. It can be called from Python code or through the command line tools edge-tts and edge-playback. It suits developers and users who want to add text-to-speech capabilities in a straightforward way using Microsoft's technology.
Installation
For developers interested in using edge-tts, installation is simple. The module can be installed with pip, the package manager for Python, using the following command:
$ pip install edge-tts
For those who prefer using the command line utilities without directly involving Python code, installing via pipx is recommended:
$ pipx install edge-tts
Usage
Basic Usage
Once installed, edge-tts is straightforward to use. To convert text to speech and save the result to a media file, with subtitles in a separate file, you can use the command:
$ edge-tts --text "Hello, world!" --write-media hello.mp3 --write-subtitles hello.vtt
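The same call can be scripted. The following is a minimal sketch, assuming the edge-tts executable is on the PATH; the helper names build_edge_tts_command and run_edge_tts are hypothetical and exist only for illustration. It uses only the flags shown in the command above.

```python
import shlex
import subprocess


def build_edge_tts_command(text, media_path, subtitles_path=None):
    """Return the argument list for an edge-tts call that writes audio and, optionally, subtitles."""
    argv = ["edge-tts", "--text", text, "--write-media", media_path]
    if subtitles_path is not None:
        argv += ["--write-subtitles", subtitles_path]
    return argv


def run_edge_tts(text, media_path, subtitles_path=None):
    """Run edge-tts as a subprocess; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_edge_tts_command(text, media_path, subtitles_path), check=True)


# Print the equivalent shell command without running it.
print(shlex.join(build_edge_tts_command("Hello, world!", "hello.mp3", "hello.vtt")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the text contains spaces or punctuation.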
For immediate playback with subtitles, use the edge-playback command. It requires the command-line player mpv to be installed:
$ edge-playback --text "Hello, world!"
Note that all options accepted by edge-tts are also accepted by edge-playback.
Changing the Voice
To change the speech's language or the voice itself, first list the available voices:
$ edge-tts --list-voices
This command will display various voices, along with details like locale and gender. To specify a different voice, you might use:
$ edge-tts --voice ar-EG-SalmaNeural --text "مرحبا كيف حالك؟" --write-media hello_in_arabic.mp3 --write-subtitles hello_in_arabic.vtt
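Picking a voice can also be done from Python. The sketch below assumes that edge_tts.list_voices() returns a list of dictionaries with "ShortName", "Locale", and "Gender" keys, matching the details the listing command prints; the helper names filter_voices and find_voices are hypothetical, and the sample entries are hand-written for illustration rather than captured service output.

```python
def filter_voices(voices, locale_prefix=None, gender=None):
    """Return the sorted ShortName of every voice matching the locale prefix and gender."""
    matches = []
    for voice in voices:
        if locale_prefix and not voice["Locale"].startswith(locale_prefix):
            continue
        if gender and voice["Gender"] != gender:
            continue
        matches.append(voice["ShortName"])
    return sorted(matches)


async def find_voices(locale_prefix=None, gender=None):
    """Fetch the voice list from the service and filter it (needs network access)."""
    import edge_tts  # imported lazily so filter_voices works without the package

    voices = await edge_tts.list_voices()
    return filter_voices(voices, locale_prefix, gender)


# Hand-written sample entries for illustration, not real service output.
sample = [
    {"ShortName": "ar-EG-SalmaNeural", "Locale": "ar-EG", "Gender": "Female"},
    {"ShortName": "ar-EG-ShakirNeural", "Locale": "ar-EG", "Gender": "Male"},
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US", "Gender": "Female"},
]
print(filter_voices(sample, locale_prefix="ar", gender="Female"))
```

With the package installed, asyncio.run(find_voices("ar", "Female")) would apply the same filter to the live voice list.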
Custom SSML
It's important to note that as of version 5.0.0, support for custom SSML (Speech Synthesis Markup Language) has been removed. This change arose because Microsoft updated their service, making SSML unsupported via this tool.
Adjusting Rate, Volume, and Pitch
The module also allows for subtle alterations to speech generation, such as adjusting the rate, volume, and pitch. For example:
$ edge-tts --rate=-50% --text "Hello, world!" --write-media hello_with_rate_halved.mp3 --write-subtitles hello_with_rate_halved.vtt
$ edge-tts --volume=-50% --text "Hello, world!" --write-media hello_with_volume_halved.mp3 --write-subtitles hello_with_volume_halved.vtt
$ edge-tts --pitch=-50Hz --text "Hello, world!" --write-media hello_with_pitch_halved.mp3 --write-subtitles hello_with_pitch_halved.vtt
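When the value is negative, attach it to the option with an equals sign (for example, --rate=-50%). Written as a separate argument, the leading minus sign may cause the value to be read as another option.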
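The reason for the equals sign can be illustrated with Python's standard argparse module. This is a sketch under the assumption that the command line tool parses its options in a comparable way, which the text above does not state; the parser below is a stand-in with the same option names, not the tool's own parser, and the default values are illustrative.

```python
import argparse

# Stand-in parser with the same three option names; not the actual edge-tts parser.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--rate", default="+0%")
parser.add_argument("--volume", default="+0%")
parser.add_argument("--pitch", default="+0Hz")


def try_parse(argv):
    """Return the parsed rate, or None if argparse rejects the arguments."""
    try:
        return parser.parse_args(argv).rate
    except SystemExit:
        # argparse prints its error to stderr and exits; treat that as a failure here.
        return None


print(try_parse(["--rate=-50%"]))      # value attached with '=' is accepted
print(try_parse(["--rate", "-50%"]))   # '-50%' is taken for an option, so parsing fails
print(try_parse(["--rate", "+50%"]))   # a positive value works as a separate argument
```

In this stand-in, the separated form fails because argparse treats any argument that starts with a minus sign and is not a plain number as an option name.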
Note on edge-playback Command
edge-playback is a convenience wrapper around edge-tts that plays the audio immediately after synthesis. It accepts the same arguments as edge-tts.
Using edge-tts as a Python Module
For those who prefer deeper integration, edge-tts can be used directly within Python scripts, which allows more customization and automation in application development. The project also hosts various example applications.
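As a minimal sketch of module usage, the following assumes the package's Communicate class and its asynchronous save method, and it needs both the package and network access to run. The function name synthesize_to_file and the main wrapper are hypothetical; the voice and text are taken from the Arabic example earlier in this document.

```python
import asyncio


async def synthesize_to_file(text, voice, output_path, rate="+0%"):
    """Synthesize text with the given voice and write the audio to output_path."""
    import edge_tts  # requires `pip install edge-tts` and network access

    communicate = edge_tts.Communicate(text, voice, rate=rate)
    await communicate.save(output_path)


def main():
    """Reproduce the Arabic command line example from Python."""
    asyncio.run(
        synthesize_to_file("مرحبا كيف حالك؟", "ar-EG-SalmaNeural", "hello_in_arabic.mp3")
    )
```

Calling main() writes hello_in_arabic.mp3 to the current directory. The rate parameter takes the same signed percentage strings as the --rate option.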
In summary, edge-tts provides a user-friendly interface for adding text-to-speech capabilities to both standalone applications and Python scripts, backed by Microsoft's voice technology.