Chinese TTS TF Lite - Project Introduction
Overview
The Chinese TTS TF Lite project is a text-to-speech (TTS) engine built with Kotlin, Jetpack Compose, and TensorFlow Lite. It runs entirely offline, a significant advantage for users with privacy concerns or those operating in network-restricted environments.
The engine offers two acoustic models for converting text to speech: FastSpeech2 and Tacotron2. Both models are sourced from the TensorFlowTTS repository, a well-regarded library in the field of speech synthesis, and the method for converting text to Pinyin comes from TensorflowTTS_chinese.
Because the engine performs real-time inference to generate audio, it requires a reasonable amount of processing power from the user's device. FastSpeech2 delivers audio quickly, although the resulting voice sounds less natural than Tacotron2's. Tacotron2 produces more human-like audio but demands more from the device's hardware, making it impractical for everyday use beyond testing.
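For readers who want a feel for the pipeline, the sketch below drives the two released TFLite files (the FastSpeech2 acoustic model and the MB-MelGAN vocoder) with TensorFlow's Python interpreter on a PC: symbol IDs go in, a mel spectrogram comes out of stage one, and the vocoder turns it into a waveform. The example IDs, the ordering of the auxiliary inputs, and the choice of output tensor are assumptions and should be verified (for example in Netron) before reuse.

```python
import numpy as np
import tensorflow as tf

# Placeholder symbol IDs; in the app these come from the pinyin front end
# plus baker_mapper.json, so the values below are purely illustrative.
input_ids = np.array([[12, 57, 33, 9, 41]], dtype=np.int32)

# Stage 1: acoustic model (symbol IDs -> mel spectrogram).
fs2 = tf.lite.Interpreter(model_path="fastspeech2_quan.tflite")
fs2_inputs = fs2.get_input_details()
fs2.resize_tensor_input(fs2_inputs[0]["index"], input_ids.shape)  # dynamic length
fs2.allocate_tensors()
fs2.set_tensor(fs2_inputs[0]["index"], input_ids)
# FastSpeech2 also expects a speaker id and speed/f0/energy ratios; the
# ordering assumed here must be checked against the actual model.
for detail in fs2_inputs[1:]:
    one = np.array([1], dtype=detail["dtype"]) if detail["dtype"] == np.int32 \
        else np.array([1.0], dtype=detail["dtype"])
    fs2.set_tensor(detail["index"], one)
fs2.invoke()
mel = fs2.get_tensor(fs2.get_output_details()[-1]["index"])

# Stage 2: vocoder (mel spectrogram -> waveform).
voc = tf.lite.Interpreter(model_path="mb_melgan.tflite")
voc_in = voc.get_input_details()[0]
voc.resize_tensor_input(voc_in["index"], mel.shape)
voc.allocate_tensors()
voc.set_tensor(voc_in["index"], mel)
voc.invoke()
audio = voc.get_tensor(voc.get_output_details()[0]["index"])
print("generated", audio.shape, "audio samples")
```

On Android the app performs the same two invocations through the TensorFlow Lite Interpreter API, which is presumably why both the core runtime and the select-TF-ops AAR are bundled with it.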
Manual Compilation Instructions
To manually compile the Chinese TTS TF Lite engine, users need:
- Android Studio Version: 2021.2.1
- Model Files: Download the models-tflite.7z archive, which contains the necessary TFLite models. Extract the following files and place them in the directories shown below:
├─app/src/main/assets
│ baker_mapper.json
│ fastspeech2_quan.tflite
│ mb_melgan.tflite
│ tacotron2_quan.tflite
- TensorFlow Lite AAR Files: Download the trimmed versions of TensorFlow Lite's AAR files from the provided link and place them in app/libs.
Build Command: Run the following command to compile the project:
./gradlew assembleRelease
Model Downloads and Inspection
For users interested in exploring or utilizing the models:
- Models Download: Access and download the models from the release page.
  - The models-tf.7z archive contains the original TensorFlow TTS models for PC use.
  - The models-tflite.7z archive contains the converted TFLite models suitable for mobile devices.
- Model Analysis: Use Netron to inspect the model structures visually.
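Netron gives a full visual view of the graphs; for a quick look at just the input and output tensors, the same information can be read programmatically with TensorFlow's Python interpreter (the file name below is one of the released models):

```python
import tensorflow as tf

# Print the input/output signature of a released TFLite model.
interpreter = tf.lite.Interpreter(model_path="fastspeech2_quan.tflite")
for detail in interpreter.get_input_details():
    print("input :", detail["name"], detail["shape"], detail["dtype"])
for detail in interpreter.get_output_details():
    print("output:", detail["name"], detail["shape"], detail["dtype"])
```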
Reducing TensorFlow Lite Size
For those concerned with application size, guidance is provided on how to reduce TensorFlow Lite's binary size. The trimmed AAR files can be downloaded from the link provided here; the reductions are significant:
- tensorflow-lite-2.8.0.aar: from 5.4 MB to 3.7 MB (68.5% of the original size)
- tensorflow-lite-select-tf-ops-2.8.0.aar: from 109.6 MB to 14.8 MB (13.5% of the original size)
References and Resources
The project's development is underpinned by several key resources:
- The TensorFlowTTS repository.
- The TensorflowTTS_chinese repository.
- TensorFlow Lite's Android guide.
- Examples of TF testing and TFLite conversion via Colab.
- Google Pico TTS Source.
Testing & Conversion of Models
The conversion environment and steps are detailed as follows:
- Environment Setup: Requires Ubuntu 20.04 LTS and Python 3.8.
- Repository Clone: Clone TensorFlowTTS and install the necessary packages.
$ git clone https://github.com/TensorSpeech/TensorFlowTTS.git
$ cd TensorFlowTTS
$ pip install .
$ pip install git+https://github.com/repodiac/german_transliterate.git
- Model Extraction and Testing: Extract the models from models-tf.7z and conduct testing using the provided Python script.
$ cd models-tf
$ python test-h5.py
- Model Conversion: Convert TensorFlow models to TensorFlow Lite format.
$ python convert-tflite.py
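The repository's convert-tflite.py script holds the authoritative steps; as a rough sketch of what such a conversion involves, the snippet below converts a SavedModel export to a dynamic-range-quantized TFLite file. The SavedModel path and output name are placeholders, and the select-TF-ops setting mirrors the fact that the app ships the tensorflow-lite-select-tf-ops AAR.

```python
import tensorflow as tf

# Load the trained acoustic model exported as a SavedModel (path is a placeholder).
converter = tf.lite.TFLiteConverter.from_saved_model("fastspeech2_savedmodel")

# TTS graphs typically contain ops outside the TFLite builtin set, so allow
# TF select ops; this matches bundling tensorflow-lite-select-tf-ops on Android.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]

# Dynamic-range quantization shrinks weights to 8-bit, matching the "_quan" naming.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()
with open("fastspeech2_quan.tflite", "wb") as f:
    f.write(tflite_model)
```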
App Interface
An illustrative screenshot of the app’s interface is provided to aid understanding of its design and functionality.