chinese_speech_pretrain

Chinese Speech Models Trained on Large-Scale Datasets for Enhanced Recognition

Product Description

This project trains speech models such as wav2vec 2.0 and HuBERT with Fairseq on large-scale Chinese audio collected from sources such as YouTube and podcasts. The models come in BASE and LARGE versions, are evaluated on datasets such as Aishell and WenetSpeech, and are available on Hugging Face. They target speech recognition across a range of applications and show improved performance under varied noise and recording conditions.
Project Details