CodeGeeX2: A Powerful Multilingual Code Generation Model
CodeGeeX2 is the second generation of the multilingual code generation model CodeGeeX. Unlike its predecessor, which was trained entirely on Huawei Ascend hardware, CodeGeeX2 is built on the ChatGLM2 architecture with additional code pre-training. Thanks to ChatGLM2's strong base performance, CodeGeeX2 improves substantially across a range of metrics: it outperforms the first-generation CodeGeeX by 107% and, with only 6 billion parameters, surpasses larger models such as StarCoder-15B.
Key Features of CodeGeeX2
- Enhanced Code Capability: CodeGeeX2-6B, built on the ChatGLM2-6B base model, is further pre-trained on 600 billion tokens of code. This marks a significant upgrade over the first-generation model, with large gains across multiple programming languages on the HumanEval-X benchmark: a 35.9% Pass@1 rate in Python, outperforming larger models such as StarCoder-15B.
- Model Features: Inheriting the features of ChatGLM2-6B, CodeGeeX2-6B supports both English and Chinese input, offers a maximum sequence length of 8192, and delivers faster inference than its predecessor. The model requires only 6GB of GPU memory when quantized, making it suitable for lightweight local deployment.
- Comprehensive AI Programming Assistant: The CodeGeeX plugin has been upgraded to support more than 100 programming languages. New features include context completion and cross-file completion. Together with the interactive AI programming assistant Ask CodeGeeX, developers can troubleshoot programming issues more efficiently, including code interpretation, translation, error correction, and documentation generation.
- Open Licensing: The weights of CodeGeeX2-6B are fully open for academic research, with commercial use available upon registration.
User Guide
- Quick Start: Get started with CodeGeeX2-6B using the `transformers` library in Python; loading the model and generating code from a prompt takes only a few lines of code.
- Launching a Gradio Demo: A Gradio demo can be launched with a simple command to interact with the model, with optional authentication for secure sessions.
- Running an API: FastAPI support allows users to run a server and integrate CodeGeeX2-6B into their own applications.
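The quick-start flow above can be sketched with the `transformers` library. The model name and the `# language:` prompt convention follow the public model card; the generation settings (`max_length`, `top_k`) are illustrative defaults, not tuned values:

```python
def build_prompt(language: str, instruction: str) -> str:
    """CodeGeeX2 expects a '# language:' tag comment before the task description."""
    return f"# language: {language}\n# {instruction}\n"


def generate_code(prompt: str, model_name: str = "THUDM/codegeex2-6b") -> str:
    """Load CodeGeeX2-6B via transformers and complete `prompt`.

    Needs a CUDA GPU (~13 GB memory for the FP16 model, less when quantized).
    """
    from transformers import AutoModel, AutoTokenizer  # heavy deps, imported lazily

    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_name, trust_remote_code=True, device="cuda")
    model = model.eval()
    inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_length=256, top_k=1)
    return tokenizer.decode(outputs[0])


prompt = build_prompt("Python", "write a bubble sort function")
```

Calling `generate_code(prompt)` returns the prompt followed by the model's completion.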
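Once the FastAPI server is running, any application can call it over plain HTTP. Below is a minimal stdlib client sketch; the port, the `/generate` route, and the `prompt`/`response` JSON field names are illustrative assumptions, not the repository's documented schema:

```python
import json
import urllib.request


def request_completion(prompt: str, host: str = "http://127.0.0.1:8000") -> str:
    """POST a prompt to a locally running CodeGeeX2 API server.

    NOTE: the /generate route and the 'prompt'/'response' field names are
    assumptions for illustration; match them to the demo server's real schema.
    """
    payload = json.dumps({"prompt": prompt, "max_length": 256}).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```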
Evaluations of Code Capability
CodeGeeX2 has shown significant improvements across several benchmarks:
- HumanEval (Pass@1,10,100): CodeGeeX2 surpasses numerous competitors with a Pass@1 score of 35.9%.
- HumanEval-X (Pass@1): Notable improvements across multiple programming languages, aggregated into an overall advancement.
- DS-1000 (Pass@1): On data-science routines using popular Python libraries, CodeGeeX2-6B shows competitive performance, holding its own even against larger models.
Efficient Deployment
CodeGeeX2-6B is tailored for efficient deployment:
- Quantization: Supports quantized inference, cutting GPU memory requirements to as little as 6GB while largely maintaining performance.
- Inference Speed: Implements Multi-Query Attention and Flash Attention for fast inference.
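Quantized loading can be sketched as below, assuming the `.quantize()` helper that ChatGLM2-family checkpoints expose when loaded with `trust_remote_code=True`:

```python
def load_quantized(model_name: str = "THUDM/codegeex2-6b", bits: int = 4):
    """Load CodeGeeX2-6B with weight quantization to shrink GPU memory use.

    bits=4 targets the ~6GB footprint described above; bits=8 trades a bit
    more memory for accuracy. The .quantize() helper comes from the model's
    remote code, so trust_remote_code=True is required.
    """
    from transformers import AutoModel  # heavy dep, imported lazily

    model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
    return model.quantize(bits).cuda().eval()
```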
Licensing and Citation
The code of this project is available under the Apache-2.0 license, and the model weights abide by the Model License. For commercial use, interested parties can apply through the designated registration form.
For those interested in citing the project, please refer to the formal publication:
@inproceedings{zheng2023codegeex,
  title={CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X},
  author={Qinkai Zheng and Xiao Xia and Xu Zou and Yuxiao Dong and Shan Wang and Yufei Xue and Zihan Wang and Lei Shen and Andi Wang and Yang Li and Teng Su and Zhilin Yang and Jie Tang},
  booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={5673--5684},
  year={2023}
}