DiffSinger
DiffSinger is a PyTorch implementation of singing voice synthesis built on a shallow diffusion mechanism and a pre-trained FastSpeech2 auxiliary decoder. It supports three model configurations (naive, auxiliary, and shallow) and TTS-style controls for pitch, volume, and speaking rate. The repository supports single and batch inference with pretrained models and documents both synthesis and training. Multi-speaker training is still in development.
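
To make the shallow diffusion idea concrete, here is a minimal sketch of its starting point: instead of beginning the reverse process from pure noise at the final step T, the auxiliary decoder's mel-spectrogram is noised forward to an intermediate step K, and denoising would start from there. This is not the repository's code. The function names, the linear schedule, and the values of T, K, and the sample frame are all assumptions chosen for illustration, and the learned denoiser that would run from step K back to step 1 is omitted because it requires a trained network.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.06):
    """Linearly spaced noise schedule beta_1..beta_T (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse_to_step(x0, k, alpha_bars, rng):
    """Sample x_k from q(x_k | x_0); k is 1-indexed.

    x_k = sqrt(alpha_bar_k) * x_0 + sqrt(1 - alpha_bar_k) * eps, eps ~ N(0, 1)
    """
    a = alpha_bars[k - 1]
    return [
        math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for v in x0
    ]


# Hypothetical settings: total steps T and a shallower start step K < T.
T = 100
K = 54

betas = linear_beta_schedule(T)
alpha_bars = cumulative_alphas(betas)

# Stand-in for one mel frame predicted by the auxiliary decoder.
aux_mel = [0.2, -0.5, 1.0, 0.0]

# Starting point for the reverse process under shallow diffusion.
x_K = diffuse_to_step(aux_mel, K, alpha_bars, random.Random(0))
```

Because alpha_bar at step K is larger than at step T, x_K keeps more of the auxiliary decoder's prediction than a fully noised sample would, so the reverse process has fewer steps to run and starts closer to the target.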