Pretrained Language Model
The Pretrained Language Model repository, developed by Huawei Noah’s Ark Lab, provides cutting-edge pretrained language models along with their respective optimization techniques. This collection particularly emphasizes models suitable for Chinese language processing tasks.
Overview of the Directory
PanGu-α
PanGu-α is a large-scale autoregressive Chinese language model with up to 200 billion parameters. The model is developed under MindSpore and trained on a cluster of Ascend 910 AI processors.
NEZHA
NEZHA-TensorFlow is a pretrained Chinese language model that achieves state-of-the-art results on several Chinese natural language processing tasks. NEZHA-PyTorch is its PyTorch counterpart.
NEZHA-Gen
NEZHA-Gen-TensorFlow provides two GPT models: Yuefu (乐府), which generates Chinese classical poetry, and a general-purpose Chinese GPT model.
TinyBERT and its Variants
TinyBERT is a BERT model compressed by knowledge distillation; it is 7.5 times smaller and 9.4 times faster at inference, making it well suited for deployment. TinyBERT-MindSpore is the MindSpore version of the model.
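As a rough illustration of the transformer distillation that TinyBERT is trained with, the sketch below computes the kinds of loss terms used in layer-wise distillation: an MSE on (projected) hidden states, an MSE on attention matrices, and a soft cross-entropy on the prediction logits. The tensor names, the single-layer formulation, and the projection layer are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def tinybert_distillation_loss(student_hidden, teacher_hidden,
                               student_attn, teacher_attn,
                               student_logits, teacher_logits,
                               hidden_proj, temperature=1.0):
    """One layer-wise distillation step in the spirit of TinyBERT.
    hidden_proj maps the (smaller) student hidden size to the teacher's."""
    # Hidden-state loss: MSE between projected student states and teacher states.
    hidden_loss = F.mse_loss(hidden_proj(student_hidden), teacher_hidden)
    # Attention loss: MSE between student and teacher attention matrices.
    attn_loss = F.mse_loss(student_attn, teacher_attn)
    # Prediction loss: soft cross-entropy between student and teacher logits.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    pred_loss = -(soft_targets *
                  F.log_softmax(student_logits / temperature, dim=-1)).sum(-1).mean()
    return hidden_loss + attn_loss + pred_loss

# Toy usage with random tensors (batch=2, seq=8, 4 heads, student dim 312, teacher dim 768).
proj = nn.Linear(312, 768)
loss = tinybert_distillation_loss(
    torch.randn(2, 8, 312), torch.randn(2, 8, 768),
    torch.rand(2, 4, 8, 8), torch.rand(2, 4, 8, 8),
    torch.randn(2, 8, 2), torch.randn(2, 8, 2),
    hidden_proj=proj)
```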
DynaBERT
DynaBERT is a BERT variant that can run at adaptive width and depth, so a single trained model can serve different latency and memory budgets.
BBPE
BBPE provides a byte-level vocabulary building tool and its corresponding tokenizer.
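To illustrate the general technique behind BBPE, here is a minimal sketch of byte-level BPE using the Hugging Face tokenizers library. BBPE's own vocabulary builder and tokenizer have their own interface, so treat this only as an illustration of the approach, not as the repository's tool.

```python
# Byte-level BPE illustrated with the Hugging Face `tokenizers` library.
# Merges are learned over UTF-8 bytes, so any Unicode text (e.g. Chinese)
# is covered without out-of-vocabulary tokens.
from tokenizers import ByteLevelBPETokenizer

corpus = [
    "预训练语言模型可以处理中文文本。",
    "Byte-level BPE operates on UTF-8 bytes rather than characters.",
]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(corpus, vocab_size=500, min_frequency=1)

encoding = tokenizer.encode("预训练模型")
print(encoding.tokens)  # byte-level subword pieces
print(encoding.ids)
```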
PMLM
PMLM is a probabilistically masked language model. Trained without the complex two-stream self-attention, it can be viewed as a simple approximation of XLNet.
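The core idea of probabilistic masking is that the masking ratio is itself drawn from a prior distribution rather than fixed at a constant rate. The sketch below mocks this up in PyTorch with a uniform prior; the exact prior and masking details used by PMLM may differ, so this is only an illustrative assumption.

```python
import torch

def probabilistic_mask(input_ids, mask_token_id, pad_token_id, ignore_index=-100):
    """Mask each sequence with a ratio sampled from a uniform prior.
    Illustrates the probabilistic-masking idea; not PMLM's exact recipe."""
    masked = input_ids.clone()
    labels = torch.full_like(input_ids, ignore_index)
    for i in range(input_ids.size(0)):
        positions = (input_ids[i] != pad_token_id).nonzero(as_tuple=True)[0]
        ratio = torch.rand(()).item()                    # masking ratio ~ U(0, 1)
        n_mask = max(1, int(ratio * positions.numel()))
        chosen = positions[torch.randperm(positions.numel())[:n_mask]]
        labels[i, chosen] = input_ids[i, chosen]         # predict only masked tokens
        masked[i, chosen] = mask_token_id
    return masked, labels

# Toy usage: batch of 2 sequences, pad id 0, mask id 103 (BERT-style).
ids = torch.tensor([[5, 6, 7, 8, 0, 0], [9, 10, 11, 12, 13, 14]])
masked, labels = probabilistic_mask(ids, mask_token_id=103, pad_token_id=0)
```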
TernaryBERT
TernaryBERT applies a weight ternarization method to BERT and is developed under PyTorch; TernaryBERT-MindSpore is the MindSpore version.
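For intuition, the sketch below shows threshold-based weight ternarization in the style of ternary weight networks (TWN), one of the approximation methods this line of work builds on. The 0.7 threshold constant and the per-tensor granularity are common choices used here for illustration, not necessarily what the repository implements.

```python
import torch

def ternarize_twn(weight):
    """Threshold-based ternarization in the TWN style: weights whose
    magnitude falls below delta become 0, the rest become +/- alpha."""
    delta = 0.7 * weight.abs().mean()                                # threshold
    mask = (weight.abs() > delta).float()                            # entries kept nonzero
    alpha = (weight.abs() * mask).sum() / mask.sum().clamp(min=1.0)  # scaling factor
    return alpha * torch.sign(weight) * mask                         # values in {-alpha, 0, +alpha}

w = torch.randn(768, 768)
w_ternary = ternarize_twn(w)
print(torch.unique(w_ternary).numel())  # at most 3 distinct values
```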
HyperText
HyperText is an efficient text classification model built on hyperbolic geometry.
BinaryBERT
BinaryBERT binarizes the weights of a BERT model via ternary weight splitting, and is developed under PyTorch.
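Ternary weight splitting rests on the observation that any ternary tensor can be rewritten as the sum of two binary tensors with half the scale, so a ternarized model can be converted into a binary one without changing its outputs at the point of conversion. The toy sketch below shows only that equivalence; BinaryBERT's actual construction additionally preserves the latent full-precision weights, which this version does not.

```python
import torch

def split_ternary(ternary, alpha):
    """Rewrite a ternary tensor (entries in {-alpha, 0, +alpha}) as the sum
    of two binary tensors with entries in {-alpha/2, +alpha/2}. Toy version
    of the equivalence that ternary weight splitting relies on."""
    half = torch.full_like(ternary, alpha / 2)
    b1 = torch.where(ternary >= 0, half, -half)   # +half for {0, +alpha}, -half for -alpha
    b2 = torch.where(ternary > 0, half, -half)    # +half only for +alpha
    assert torch.allclose(b1 + b2, ternary)
    return b1, b2

alpha = 0.05
ternary = alpha * torch.randint(-1, 2, (4, 4)).float()
b1, b2 = split_ternary(ternary, alpha)
```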
AutoTinyBERT
AutoTinyBERT provides a model zoo of efficient pretrained models that can meet different latency requirements.
PanGu-Bot
PanGu-Bot is a Chinese pretrained open-domain dialogue model built on the GPU implementation of PanGu-α.
CeMAT
CeMAT is a universal sequence-to-sequence multilingual pretrained model for both autoregressive and non-autoregressive neural machine translation tasks.
Noah_WuKong
Noah_WuKong is a large-scale Chinese vision-language dataset together with a group of benchmark models trained on it; a MindSpore version of the models is also available.
CAME
CAME is a Confidence-guided Adaptive Memory Efficient optimizer that reduces the memory footprint of optimizer states during large-model training.
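In its reference implementation, CAME follows the standard PyTorch optimizer interface, so in principle it can be swapped in where Adam or AdamW would be used. The import path and constructor arguments below are assumptions for illustration only; check the CAME directory for the exact package name and recommended hyper-parameters.

```python
import torch

# Hypothetical import path -- the actual module/package name is defined in the
# CAME directory of this repository; adjust accordingly.
from came_pytorch import CAME  # assumption, not a verified import

model = torch.nn.Linear(768, 768)
# lr and weight_decay are generic optimizer options; consult the CAME
# code and paper for its recommended settings.
optimizer = CAME(model.parameters(), lr=2e-4, weight_decay=1e-2)

loss = model(torch.randn(8, 768)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```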
This repository is a comprehensive toolkit for researchers and developers focusing on Chinese natural language processing, offering diverse models and tools to suit different needs and platforms.