StarCoder2-Self-Align Project Overview
The StarCoder2-Self-Align project introduces StarCoder2-15B-Instruct-v0.1, the first fully self-aligned code Large Language Model (LLM) built with an entirely permissive and transparent pipeline. Instead of relying on extensive human annotation or on data distilled from proprietary models, the project uses StarCoder2-15B itself to generate instruction-response pairs, filters them (including execution-based validation), and then fine-tunes StarCoder2-15B on the resulting data, making the model self-improving.
Model and Resources
- Model: StarCoder2-15B-Instruct-v0.1, an instruction-tuned version of StarCoder2-15B.
- Code Repository: The project's code can be found at bigcode-project/starcoder2-self-align.
- Dataset: The dataset associated with the project is self-oss-instruct-sc2-exec-filter-50k.
- Authors: The project was developed by a team including Yuxiang Wei, Federico Cassano, Jiawei Liu, Yifeng Ding, Naman Jain, Harm de Vries, Leandro von Werra, Arjun Guha, and Lingming Zhang.
Quick Start Guide
For developers eager to dive in, StarCoder2-15B-Instruct-v0.1 can be used with the transformers library. The project documentation provides a short Python script that sets up a text-generation pipeline with the model and prompts it to, for example, write a quicksort function that accepts a custom sorting criterion.
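The exact script lives in the project documentation; the sketch below is only an approximation of that setup, assuming the Hugging Face Hub ID bigcode/starcoder2-15b-instruct-v0.1 and the standard transformers text-generation pipeline:

```python
import torch
from transformers import pipeline

# Load the instruct model as a standard text-generation pipeline
# (bfloat16 and device_map="auto" keep memory use manageable on a large GPU).
generator = pipeline(
    "text-generation",
    model="bigcode/starcoder2-15b-instruct-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

instruction = (
    "Write a quicksort function in Python with type hints and a 'less_than' "
    "parameter so callers can supply a custom sorting criterion."
)

# The tokenizer ships with a chat template that wraps the instruction in the
# prompt format the model expects.
prompt = generator.tokenizer.apply_chat_template(
    [{"role": "user", "content": instruction}],
    tokenize=False,
)

result = generator(
    prompt,
    max_new_tokens=512,
    do_sample=False,
    return_full_text=False,
)
print(result[0]["generated_text"])
```

Greedy decoding (do_sample=False) is used here because single-function coding tasks usually benefit from deterministic output; sampling parameters can be adjusted as needed.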
Data Generation Process
The data generation pipeline for StarCoder2-Self-Align is automated and relies on an OpenAI-compatible inference server. Developers must have such a server running before launching the generation and filtering steps that produce the data used for training and evaluation.
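The repository documents the precise commands; as a rough illustration only, an OpenAI-compatible server (for example, one hosted locally with vLLM) can be queried with the standard openai Python client. The endpoint URL, model name, and prompt below are placeholders, not the project's actual self-OSS-instruct prompts:

```python
from openai import OpenAI

# Point the standard OpenAI client at a locally hosted, OpenAI-compatible server.
# URL and api_key are placeholders; the project's own scripts handle this setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Ask the base model to draft an instruction from a seed code snippet.
# This prompt is illustrative only, not the project's actual generation prompt.
completion = client.completions.create(
    model="bigcode/starcoder2-15b",
    prompt=(
        "### Seed code snippet\n"
        "def add(a, b):\n"
        "    return a + b\n\n"
        "### Instruction\n"
    ),
    max_tokens=256,
    temperature=0.7,
)
print(completion.choices[0].text)
```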
Training Details
The project adopts efficient training methods to fine-tune the model. Key hyperparameters include using the Adafactor optimizer, a learning rate of 1e-5, and training over four epochs with a batch size of 64. Notably, the entire fine-tuning process can be performed on a single NVIDIA A100 80GB GPU, making it accessible for organizations with limited resources.
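These settings map naturally onto the Hugging Face Trainer; the sketch below is illustrative only, and the per-device batch size and gradient-accumulation split used to reach an effective batch size of 64 is an assumption rather than the project's exact configuration:

```python
from transformers import TrainingArguments

# Hyperparameters from the project: Adafactor, learning rate 1e-5,
# 4 epochs, effective batch size 64.
training_args = TrainingArguments(
    output_dir="starcoder2-15b-instruct",
    optim="adafactor",
    learning_rate=1e-5,
    num_train_epochs=4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,  # 1 x 64 = effective batch size of 64
    bf16=True,
    logging_steps=10,
    save_strategy="epoch",
)
# A Trainer would then be built with the base model and the
# self-oss-instruct-sc2-exec-filter-50k dataset after tokenization.
```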
Evaluation
The model was evaluated on multiple benchmarks, including EvalPlus (HumanEval+ and MBPP+), LiveCodeBench, and DS-1000, which focus on Python code generation. The results indicate that the model performs competitively at producing correct, executable code from natural-language instructions.
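As a rough sketch of how such an evaluation could be wired up with the evalplus package's data utilities (the generate helper is a placeholder for the model call from the quick-start sketch, not part of the project's evaluation harness):

```python
from evalplus.data import get_human_eval_plus, write_jsonl

def generate(prompt: str) -> str:
    # Placeholder: plug in the model call from the quick-start sketch above.
    raise NotImplementedError

# Produce one solution per HumanEval+ problem and dump them to JSONL;
# EvalPlus's own tooling then scores samples.jsonl.
samples = [
    dict(task_id=task_id, solution=generate(problem["prompt"]))
    for task_id, problem in get_human_eval_plus().items()
]
write_jsonl("samples.jsonl", samples)
```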
Bias, Risks, and Limitations
Despite its innovative design, StarCoder2-15B-Instruct-v0.1 has limitations. It does not always strictly follow the given instructions, particularly regarding output format, so developers may need to include concrete examples in the prompt to obtain the desired format. The model is tuned primarily for Python tasks, and performance on other programming languages may be weaker. It may also inherit biases present in the training data and the base model.
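For example, a purely illustrative way to pin down the output format is to embed a small sample of it in the instruction itself:

```python
# Illustrative prompt only; embedding an example of the expected format
# ('["foo", "bar"]') is intended to nudge the model toward that format.
instruction = (
    "Extract the function names from the Python snippet below and return them "
    'as a JSON list, e.g. ["foo", "bar"].\n\n'
    "def parse(data): ...\n"
    "def render(tree): ...\n"
)
```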
For further information on bias and limitations, the StarCoder2-15B model card offers comprehensive details. Overall, the project marks an encouraging step toward more transparent and self-reliant AI models, paving the way for future advances in code generation.