metavoice-src
MetaVoice-1B is a 1.2-billion-parameter text-to-speech model that emphasizes emotional speech rhythm and tone. It offers zero-shot voice cloning for American and British accents, and supports cross-lingual voice cloning through fine-tuning on a small amount of data. The model is optimized for fast inference and can be deployed locally or in the cloud. It can be tried through a web UI, a Colab demo, and Hugging Face, and is released under the permissive Apache 2.0 license.