Introduction to ospeak
ospeak is a command-line interface (CLI) tool that converts written text into speech using the OpenAI Text to Speech API. It can speak text aloud directly from your terminal or save the audio output to a file, which makes it useful for developers and other users who want text-to-speech without leaving their development environment.
Installation
To install ospeak, the recommended method is to use pipx, which manages Python applications in isolated environments:
pipx install ospeak
However, if you're a macOS user, there's a known compatibility issue with Python 3.12. To work around it, install ospeak using Python 3.11:
pipx install --python /path/to/python3.11 ospeak
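The /path/to/python3.11 placeholder depends on how Python was installed on your machine. As a minimal sketch, assuming a python3.11 executable is already on your PATH, a small helper can resolve the path before calling pipx. The function name install_ospeak_py311 is illustrative and not part of ospeak or pipx:

```shell
# Hypothetical helper: install ospeak with whichever python3.11 is on PATH.
# Assumes python3.11 and pipx are already installed.
install_ospeak_py311() {
    # command -v prints the full path of an executable, or fails if absent
    py311_path="$(command -v python3.11)" || {
        echo "python3.11 not found on PATH" >&2
        return 1
    }
    pipx install --python "$py311_path" ospeak
}

# Usage:
#   install_ospeak_py311
```

If python3.11 lives somewhere that is not on your PATH, pass its full path to pipx directly as shown above.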
Additionally, ospeak requires ffmpeg, a tool for handling multimedia data. macOS users can easily install ffmpeg using Homebrew:
brew install ffmpeg
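Because ospeak depends on ffmpeg being available, it can help to confirm that the required commands are installed before first use. The following is a rough sketch, not something ospeak provides; the function name missing_tools is hypothetical:

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one missing name per line and returns non-zero if any are missing.
missing_tools() {
    missing_status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            missing_status=1
        fi
    done
    return "$missing_status"
}

# Usage:
#   missing_tools pipx ffmpeg ospeak || echo "Install the tools listed above first"
```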
It's important to install ospeak and other tools like LLM in separate virtual environments due to some dependency issues. Here, pipx is useful as it maintains isolated environments for applications.
Usage
ospeak is designed to be user-friendly. To make your computer speak a simple sentence, all you need to do is:
ospeak "Hello there"
Before using this, you need an OpenAI API key, which can be set up as an environment variable:
export OPENAI_API_KEY="..."
Alternatively, you can provide the API key directly in the command:
ospeak --token "..." "Hello there"
ospeak also allows you to pipe content into it:
echo "Hello there" | ospeak
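Piping also makes it possible to read a whole text file aloud. As a sketch under the assumption that the API key is supplied through the OPENAI_API_KEY environment variable, a small wrapper can fail early when the key is missing. The function name speak_file is hypothetical and not part of ospeak:

```shell
# Hypothetical wrapper: pipe the contents of a text file into ospeak.
# Fails early with a message if OPENAI_API_KEY is not set.
speak_file() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
    # Same pattern as: echo "Hello there" | ospeak
    cat -- "$1" | ospeak
}

# Usage:
#   export OPENAI_API_KEY="..."
#   speak_file notes.txt
```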
Customization Options
- Voices: Choose different voices with the -v/--voice option. Available voices include alloy, echo, fable, onyx, nova, and shimmer. Use -v all to hear a demonstration of all voice options.
ospeak "This is my voice" -v all
- Models: Enhance audio quality using the -m/--model option. The default model is tts-1, but for higher quality, choose -m tts-1-hd.
ospeak "This is higher quality" -m tts-1-hd
- Speed: Adjust the speaking speed with -x/--speed. Valid values range from 0.25 to 4, where 1.0 is the default.
ospeak "This is my fast voice" -x 2
- Output: Save the spoken audio to a file with -o/--output, specifying a filename ending in .mp3 or .wav.
ospeak "This is my voice" -o voice.mp3
If you also want to hear the audio live while saving, add the -s/--speak option.
ospeak "This is my voice" -o voice.mp3 -s
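The options above can be combined. As an illustrative sketch, assuming -v and -o can be used together in a single call, the loop below saves the same sentence once per voice so the results can be compared later. The function name speak_each_voice and the voice-NAME.mp3 file naming are hypothetical choices, not ospeak features:

```shell
# Hypothetical helper: save the same text once per voice, e.g. voice-alloy.mp3.
# Arguments: the text to speak, then an optional output directory (default: .)
speak_each_voice() {
    text="$1"
    out_dir="${2:-.}"
    for voice in alloy echo fable onyx nova shimmer; do
        # Stop at the first failure (for example, a missing API key)
        ospeak "$text" -v "$voice" -o "$out_dir/voice-$voice.mp3" || return 1
    done
}

# Usage:
#   speak_each_voice "This is my voice" ./samples
```

Each call is a separate API request, so the loop makes six requests in total.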
Help and Support
Getting help is simple. Run the following to view all available options and commands:
ospeak --help
Development and Contribution
For those interested in contributing to the development of ospeak, start by checking out the code. Then set up a new virtual environment and install the necessary dependencies:
cd ospeak
python -m venv venv
source venv/bin/activate
pip install -e '.[test]'
Run tests to ensure everything is working:
pytest
ospeak offers a convenient way to use text-to-speech directly from the command line, whether you want to hear text spoken immediately or save it to an audio file.