diffwave
DiffWave is a diffusion-based neural vocoder that turns Gaussian noise into high-quality speech through iterative refinement. Synthesis is conditioned on log-scaled Mel spectrograms, which gives control over the generated audio, and the implementation supports fast inference, multi-GPU training, and mixed-precision training. Recent updates add unconditional waveform synthesis and a fast sampling algorithm. Pretrained models and audio samples are available, so DiffWave can be used for both research and practical speech synthesis tasks.
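To make "iterative refinement" concrete, here is a minimal sketch of a generic DDPM-style reverse sampling loop in NumPy. It is not DiffWave's actual code: `toy_denoiser` is a hypothetical stand-in for the trained network, and the noise schedule, the 80 Mel bands, and the hop size of 256 are illustrative assumptions. The loop starts from Gaussian noise and removes a little of the predicted noise at each step.

```python
import numpy as np


def toy_denoiser(x, t, mel):
    """Hypothetical stand-in for a trained noise-prediction network."""
    # A real model would be a neural network conditioned on the step t
    # and on the Mel spectrogram; this placeholder ignores both.
    return 0.1 * x


def reverse_diffusion(denoiser, mel, num_samples, betas, rng):
    """Run a DDPM-style reverse process from noise to a waveform-shaped array."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Start from pure Gaussian noise.
    x = rng.standard_normal(num_samples)

    # Walk the noise schedule backwards, refining x at every step.
    for t in range(len(betas) - 1, -1, -1):
        eps = denoiser(x, t, mel)  # predicted noise at step t
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Re-inject a smaller amount of noise on all but the final step.
            sigma = np.sqrt((1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t]) * betas[t])
            x = x + sigma * rng.standard_normal(num_samples)
    return x


# Illustrative settings (assumptions, not values read from the project).
betas = np.linspace(1e-4, 0.05, 50)  # linear noise schedule with 50 steps
mel = np.zeros((80, 4))              # dummy log-Mel spectrogram: 80 bands, 4 frames
hop = 256                            # assumed audio samples per spectrogram frame
rng = np.random.default_rng(0)

audio = reverse_diffusion(toy_denoiser, mel, mel.shape[1] * hop, betas, rng)
print(audio.shape)
```

With the placeholder denoiser the output is still noise; the point is the shape of the loop. A trained, spectrogram-conditioned network in place of `toy_denoiser` is what would make the result sound like speech, and a fast sampling algorithm would shorten the schedule to fewer steps.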