oss-fuzz-gen - AI-Based Fuzz Target Generation for Software Security

A Framework for Fuzz Target Generation and Evaluation

The oss-fuzz-gen project provides a framework that generates fuzz targets for real-world projects written in languages such as C, C++, Java, and Python. It uses various Large Language Models (LLMs) to write these fuzz targets and benchmarks them on the OSS-Fuzz platform.

Supported Models

The framework supports several models, including:

Vertex AI code-bison series (code-bison, code-bison-32k)
Gemini series (Pro, Ultra, Experimental, 1.5)
OpenAI's GPT series (GPT-3.5-turbo, GPT-4, GPT-4o, and Azure variants)

These models generate fuzz targets intended to improve software testing by exposing vulnerabilities in the code.

Evaluation Metrics

After generating fuzz targets, the framework evaluates them using four main metrics against current data from production environments:

Compilability: Checks if the generated fuzz targets can be compiled without errors.
Runtime Crashes: Assesses the stability of the fuzz targets when executed.
Runtime Coverage: Measures the extent of code execution achieved by the fuzz targets.
Runtime Line Coverage Difference: Compares the line coverage of the generated targets with that of the existing human-written fuzz targets in OSS-Fuzz.
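One way to picture the four metrics is as a per-target record. The sketch below is a hypothetical illustration, not the framework's actual data model: the class name TargetEvaluation, its fields, and the sample file names are all assumptions. It treats the line coverage difference as the set of lines reached by the generated target that the existing targets do not reach.

```python
from dataclasses import dataclass, field

# A covered line is identified here as a (source file, line number) pair.
Line = tuple[str, int]


@dataclass
class TargetEvaluation:
    """Hypothetical per-target record of the four metrics."""

    compiles: bool  # Compilability
    crashed: bool   # Runtime crashes
    covered_lines: set[Line] = field(default_factory=set)

    @property
    def runtime_coverage(self) -> int:
        """Number of distinct lines the target executed."""
        return len(self.covered_lines)

    def line_coverage_diff(self, existing: set[Line]) -> set[Line]:
        """Lines this target reaches that the existing targets do not."""
        return self.covered_lines - existing


# Made-up sample data for illustration only.
existing_lines = {("parser.c", 10), ("parser.c", 11)}
generated = TargetEvaluation(
    compiles=True,
    crashed=False,
    covered_lines={("parser.c", 11), ("parser.c", 42), ("util.c", 7)},
)
new_lines = generated.line_coverage_diff(existing_lines)
print(sorted(new_lines))  # [('parser.c', 42), ('util.c', 7)]
```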

An experiment conducted on January 31, 2024 covered more than 1,300 benchmarks from 297 open-source projects.

Achievements

One noteworthy result is that the framework used LLMs to generate valid fuzz targets for 160 C/C++ projects, with a maximum line coverage increase of 29% over the existing human-written targets. The generated reports are not publicly accessible, however, because they may include undisclosed vulnerabilities.

Real-world Impact

So far, the framework has uncovered 26 new bugs or vulnerabilities through its automatically generated fuzz targets. Notable discoveries include out-of-bounds reads, out-of-bounds writes, and uses of uninitialized memory in projects such as cJSON, libplist, hunspell, and zstd.

Top Coverage Improvements

The project has significantly boosted coverage for several open-source projects. The top performers include:

tinyxml2 with a 29.84% increase
inih with a 29.67% increase
lodepng with a 26.21% increase
Other notable mentions are libarchive, cmark, and fribidi, with increases of 23.39%, 21.61%, and 18.20%, respectively.

Coverage percentage is calculated against the total lines of source code compiled during the OSS-Fuzz build process for each project.
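The calculation described above can be sketched as a one-line ratio. This is a minimal illustration under stated assumptions: the function name coverage_gain_percent and the sample numbers are invented for the example and do not come from any project report.

```python
def coverage_gain_percent(new_lines_covered: int, total_compiled_lines: int) -> float:
    """Coverage gain as a percentage of all lines compiled in the project build."""
    if total_compiled_lines <= 0:
        raise ValueError("total_compiled_lines must be positive")
    return 100.0 * new_lines_covered / total_compiled_lines


# Made-up numbers: 150 newly covered lines in a build of 500 compiled lines.
print(f"{coverage_gain_percent(150, 500):.2f}%")  # 30.00%
```

Because the denominator is the whole compiled project rather than the lines already covered, a given number of newly covered lines yields a larger percentage for a small project than for a large one.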

How to Get Involved

Anyone interested in contributing to or collaborating on this project, whether through research or as part of the open-source community, can reach out by creating an issue for the project or by email: [email protected]. The project welcomes contributors who want to extend the framework and help discover more hidden vulnerabilities in software.