vocab-coverage
This project investigates how language models handle Chinese by analyzing character recognition and the distribution of word vectors. It examines character mappings and Unicode representations to compare models such as BERT, ERNIE, and XLM-RoBERTa, both in how they handle Chinese semantics and in how that handling differs from languages like Japanese and Korean. The findings are intended as a reference for future model evaluations.
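As a rough illustration of what a character-recognition check might look like, the sketch below measures what fraction of a set of Chinese characters appears in a model vocabulary as a single-character token. It is not the project's actual implementation: the function names `is_cjk_ideograph` and `char_coverage`, the toy vocabulary, and the rule that a character counts as "recognized" only when it appears as its own token (optionally with a WordPiece `##` prefix) are all assumptions made for this example.

```python
def is_cjk_ideograph(ch: str) -> bool:
    """Return True if ch lies in the basic CJK Unified Ideographs block.

    Only U+4E00..U+9FFF is checked; extension blocks are ignored here
    to keep the sketch short (an assumption, not the project's rule).
    """
    return 0x4E00 <= ord(ch) <= 0x9FFF


def char_coverage(vocab, charset):
    """Return (coverage ratio, covered characters) for charset against vocab.

    A character counts as covered when the vocabulary contains it as a
    single-character token, with or without a leading '##' prefix.
    """
    single_char_tokens = set()
    for token in vocab:
        # Strip the WordPiece continuation prefix if present.
        stripped = token[2:] if token.startswith("##") else token
        if len(stripped) == 1:
            single_char_tokens.add(stripped)

    covered = [ch for ch in charset if ch in single_char_tokens]
    ratio = len(covered) / len(charset) if charset else 0.0
    return ratio, covered


# Toy vocabulary (hypothetical, not taken from any real model).
toy_vocab = ["[CLS]", "中", "##文", "hello", "日本"]
toy_charset = ["中", "文", "語"]

ratio, covered = char_coverage(toy_vocab, toy_charset)
print(f"covered {len(covered)} of {len(toy_charset)} characters ({ratio:.0%})")
```

With a real model, the vocabulary would come from the model's tokenizer (for Hugging Face tokenizers, presumably the keys of `tokenizer.get_vocab()`), and the character set from a standard list of Chinese characters rather than the three-character toy list used here.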