Thorsten-Voice - High-quality German TTS voice for offline use without licensing restrictions

Introduction to Thorsten-Voice Project

The Thorsten-Voice project aims to provide a free, high-quality German text-to-speech (TTS) voice that runs offline and can be integrated into other projects without licensing restrictions. Its central goal is to make voice technology accessible to everyone.

Motivation for Thorsten-Voice Project

Thorsten Müller, the primary contributor to the Thorsten-Voice project, envisions a world where high-quality TTS technology is accessible to all, irrespective of gender, orientation, or background. His contribution is rooted in a belief in equality and in open access to knowledge worldwide. Although he admits he is not a professional voice talent, he shares his voice recordings publicly to advance this vision.

Voice Datasets

Thorsten-Voice offers several open voice datasets that can be downloaded free of charge from Zenodo. These datasets have been used to train various TTS models covering different speaking styles and emotional expressions. Each dataset was recorded and optimized by Thorsten Müller and Dominik Kreutz.

Thorsten-Voice Dataset 2021.02 (Neutral)

This dataset contains more than 23 hours of audio across 22,668 recorded phrases, with an average phrase length of 52 characters. The recordings are optimized for clarity and quality and use a sample rate of 22,050 Hz.
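The figures above imply some useful derived numbers, such as the average clip length. The following is a minimal sketch of that arithmetic, not part of the project's tooling. It treats the stated 23 hours as a lower bound, and the mono, 16-bit PCM format used for the size estimate is an assumption for illustration that the text does not state.

```python
# Figures quoted in the dataset description above.
TOTAL_HOURS = 23          # stated as "more than 23 hours", so a lower bound
NUM_PHRASES = 22_668
AVG_CHARS = 52
SAMPLE_RATE_HZ = 22_050

# Assumptions for illustration only (not stated in the text).
CHANNELS = 1              # assumed mono
BYTES_PER_SAMPLE = 2      # assumed 16-bit PCM

total_seconds = TOTAL_HOURS * 3600

# Average length of one recorded phrase, in seconds.
avg_clip_seconds = total_seconds / NUM_PHRASES

# Approximate speaking rate, in characters per second.
chars_per_second = AVG_CHARS / avg_clip_seconds

# Uncompressed audio size under the assumed format, in GiB.
raw_pcm_gib = total_seconds * SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE / 2**30

print(f"average clip length: {avg_clip_seconds:.2f} s")
print(f"speaking rate:       {chars_per_second:.1f} chars/s")
print(f"raw PCM size:        {raw_pcm_gib:.2f} GiB (assumed mono, 16-bit)")
```

Under these assumptions, the average clip is a little over three and a half seconds long, which is typical of single-sentence TTS training data.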

Thorsten-Voice Dataset 2021.06 (Emotional)

This dataset adds expressive variation: 300 sentences, each recorded in eight emotions, for a total of 2,400 recordings. Each sentence is deliberately spoken in a particular emotion regardless of whether that emotion fits its content.

Additional Datasets

The project also features datasets from 2022.10 (Neutral) and 2023.09 (Hessisch), alongside a comprehensive FULL 44kHz dataset celebrating the five-year anniversary of Thorsten-Voice. These datasets represent ongoing improvements and diversity in voice recordings.

TTS Models

These open-source datasets have enabled the training of numerous machine-learning-based TTS models. The models are used in projects such as Coqui AI, Piper TTS, and Home Assistant, making German-language TTS broadly available. Thorsten-Voice also offers resources for neutral and emotional speech and for regional accents such as Hessisch.

Thorsten-Voice YouTube Channel

For those who want to go deeper into voice technology, the Thorsten-Voice YouTube channel provides tutorials and demonstrations of open-source voice technology. The channel also serves as a gathering point for community members interested in this field.

Conference Speaker and Public Talks

Thorsten Müller is passionate about discussing the significance of open-source voice technology and is open to speaking at events and conferences. His personal experiences and professional insights make compelling contributions to dialogues surrounding technology and accessibility.

Support and Thanks

Thorsten encourages support for the project in various ways, such as subscribing to his YouTube channel, following him on social media, or contributing via platforms like Ko-Fi. He thanks everyone who has supported the project, in particular the partners and collaborators who have contributed to it.

In sum, the Thorsten-Voice project embodies the spirit of open-source development, aiming to democratize access to advanced voice technology through thoughtfully curated datasets and models.