lobe-tts - Reliable TTS and STT Library for Server and Browser Integration

Lobe TTS Project Introduction

📖 Introduction

Lobe TTS is a Text-to-Speech (TTS) and Speech-to-Text (STT) library that runs both on the server and in the browser. It converts text messages into audible speech, so users can interact with an application as if they were talking to a real person, and it offers a range of voices to choose from. Finding no satisfactory TTS frontend library, the developers built and open-sourced Lobe TTS to help others add TTS functionality to their applications.

Server-side: High-quality voice generation takes as little as 15 lines of code. Supported services include EdgeSpeechTTS, MicrosoftTTS, OpenAITTS, and OpenAISTT.
Browser-side: React Hooks and visual audio components handle tasks such as loading, playback control, and timeline interaction.

📦 Usage

Generate Speech on Server

To generate speech on a server, use the EdgeSpeechTTS class provided by Lobe TTS. The following example creates a speech synthesis request and saves the output to an MP3 file:

import { EdgeSpeechTTS } from '@lobehub/tts';
import { Buffer } from 'buffer';
import fs from 'fs';
import path from 'path';

// Create an Edge speech client for US English
const tts = new EdgeSpeechTTS({ locale: 'en-US' });

// Text to synthesize and the voice to use
const payload = {
  input: 'This is a speech demonstration',
  options: {
    voice: 'en-US-GuyNeural',
  },
};

// Request the audio (top-level await requires an ES module)
const response = await tts.create(payload);

// Convert the response body to a Node.js Buffer
const mp3Buffer = Buffer.from(await response.arrayBuffer());
const speechFile = path.resolve('./speech.mp3');

// Write the MP3 data to disk
fs.writeFileSync(speechFile, mp3Buffer);
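
The final steps of the example (response, then ArrayBuffer, then file) can be wrapped in a small reusable helper. The following is a minimal sketch rather than part of the library: saveSpeech and SpeechResponse are hypothetical names, and the only assumption made about the value returned by tts.create() is that it exposes arrayBuffer(), as the example above does.

```typescript
import { writeFile } from 'node:fs/promises';
import { resolve } from 'node:path';

// Minimal shape assumed for the object returned by tts.create():
// only arrayBuffer() is used, as in the example above.
interface SpeechResponse {
  arrayBuffer(): Promise<ArrayBuffer>;
}

// Hypothetical helper (not part of @lobehub/tts): writes the audio bytes
// to `file` and resolves with the absolute path and the byte count.
async function saveSpeech(
  response: SpeechResponse,
  file: string,
): Promise<{ bytes: number; file: string }> {
  const data = new Uint8Array(await response.arrayBuffer());
  if (data.byteLength === 0) {
    // Fail loudly instead of writing an empty, unplayable file
    throw new Error('Speech response contained no audio data');
  }
  const target = resolve(file);
  await writeFile(target, data);
  return { bytes: data.byteLength, file: target };
}
```

With this helper, the tail of the example could be reduced to a single call such as saveSpeech(await tts.create(payload), './speech.mp3'). Rejecting an empty body is a design choice of this sketch, not documented library behavior.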

In Node.js environments without a global WebSocket, a polyfill is required; the ws package can provide one.
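
One defensive way to apply such a polyfill is to install it only when the runtime does not already expose a global WebSocket. The sketch below illustrates that idea under stated assumptions: ensureWebSocket and WebSocketCtor are hypothetical names that belong neither to @lobehub/tts nor to ws, and the call site in the trailing comment assumes the ws package is installed.

```typescript
// Constructor shape assumed for any WebSocket implementation passed in.
type WebSocketCtor = new (url: string, protocols?: string | string[]) => unknown;

// Hypothetical helper: installs `impl` as globalThis.WebSocket only when the
// runtime has no WebSocket of its own. Returns true if the polyfill was installed.
function ensureWebSocket(impl: WebSocketCtor): boolean {
  const scope = globalThis as unknown as { WebSocket?: unknown };
  if (typeof scope.WebSocket === 'function') {
    return false; // an implementation already exists; leave it untouched
  }
  scope.WebSocket = impl;
  return true;
}

// Typical call site in Node.js, placed before any TTS client is created:
//   import WebSocket from 'ws';
//   ensureWebSocket(WebSocket);
```

Checking first avoids overwriting a built-in implementation on runtimes that already provide one.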

Use the React Component

Lobe TTS ships React components such as AudioPlayer and AudioVisualizer, together with the useAudioPlayer hook, for adding interactive audio playback to an application.

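import { AudioPlayer, AudioVisualizer, useAudioPlayer } from '@lobehub/tts/react';
// Assumption: Flexbox comes from react-layout-kit; any flex container works here.
import { Flexbox } from 'react-layout-kit';

// Placeholder: URL of the audio file to play (replace with your own).
const url = '/speech.mp3';

export default () => {
  // The hook returns a ref for the visualizer, a loading flag,
  // and the remaining audio state passed to AudioPlayer.
  const { ref, isLoading, ...audio } = useAudioPlayer(url);

  return (
    <Flexbox align={'center'} gap={8}>
      <AudioPlayer audio={audio} isLoading={isLoading} style={{ width: '100%' }} />
      <AudioVisualizer audioRef={ref} isLoading={isLoading} />
    </Flexbox>
  );
};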

📦 Installation

To start using Lobe TTS, run the installation command for the package manager of your choice:

$ pnpm i @lobehub/tts
$ bun add @lobehub/tts

Compile with Next.js

For Next.js integration, add transpilePackages: ['@lobehub/tts'] to your next.config.js to ensure server-side rendering compatibility.

// next.config.js
const nextConfig = {
  transpilePackages: ['@lobehub/tts'],
};

module.exports = nextConfig;

⌨️ Local Development

To develop Lobe TTS locally, clone the repository and start the development environment:

$ git clone https://github.com/lobehub/lobe-tts.git
$ cd lobe-tts
$ bun install
$ bun dev

Additionally, GitHub Codespaces is available for online development.

🤝 Contributing

Contributions are welcomed and encouraged. Developers interested in contributing can check the GitHub Issues page to get involved in the project.

🩷 Sponsor

Sponsors play a significant role in sustaining open-source projects. Contributions, whether small or large, help in driving the project's mission forward.

🔗 More Products

LobeHub, the team behind Lobe TTS, also offers other products like:

Lobe Chat: An extensible and high-performance chatbot framework.
Lobe Vidol: A platform for creating virtual idols with various interactive features.
Lobe Theme: A modern theme for Stable Diffusion web interfaces.

📝 License

Lobe TTS is open-source software licensed under the MIT License. More information can be found on the project's GitHub page.