vall-e
An unofficial PyTorch implementation of VALL-E that uses EnCodec to tokenize audio for text-to-speech synthesis. The project supports experimenting with both the autoregressive (AR) and non-autoregressive (NAR) models, and provides customizable configurations and synthesis scripts. A pretrained model has not been released yet, but the framework can already be explored using DeepSpeed on a GPU.
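
To make the AR/NAR distinction concrete, here is a minimal sketch of how a matrix of audio codec tokens could be divided between the two models. It assumes the split described in the VALL-E paper, where the AR model predicts the first quantizer level and the NAR model predicts the remaining levels. The function name `split_ar_nar` and the toy token values are hypothetical and are not taken from this project's code, whose data pipeline may differ.

```python
from typing import List, Tuple

# Codec tokens laid out as (n_quantizer_levels, n_frames).
Codes = List[List[int]]


def split_ar_nar(codes: Codes) -> Tuple[List[int], Codes]:
    """Split codec tokens into an AR target and NAR targets.

    Hypothetical helper: the first quantizer level goes to the AR model,
    all remaining levels go to the NAR model.
    """
    if not codes:
        raise ValueError("codes must contain at least one quantizer level")
    n_frames = len(codes[0])
    if any(len(level) != n_frames for level in codes):
        raise ValueError("all quantizer levels must have the same number of frames")
    ar_target = list(codes[0])
    nar_targets = [list(level) for level in codes[1:]]
    return ar_target, nar_targets


# Toy data: 3 quantizer levels and 4 frames. Real codec output is larger.
toy_codes = [
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
]
ar_target, nar_targets = split_ar_nar(toy_codes)
```

Here `ar_target` holds the single token sequence that would be generated frame by frame, while `nar_targets` holds the remaining levels, each of which can be predicted for all frames at once.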