Introduction to Onnx2c
Onnx2c is a tool that bridges neural networks and microcontrollers. Acting as a compiler, it translates ONNX (Open Neural Network Exchange) models into C code, allowing machine learning capabilities to be integrated into embedded systems, particularly microcontrollers. This capability sits at the heart of the "Tiny ML" movement, which seeks to perform machine learning tasks on devices with limited resources.
Key Features
Onnx2c is crafted to simplify the deployment of neural networks on microcontrollers. Several key features highlight its design:
- Minimal Dependencies: The generated code avoids standard input-output calls such as printf() and does not include <stdio.h>. It relies solely on the standard C math library, making it highly portable (see the sketch after this list).
- Memory Management: It allocates memory at compile time instead of using dynamic memory allocation, ensuring predictable and efficient memory usage.
- Compiler Optimization Friendly: The generated code is designed to allow C compilers to optimize it to the fullest extent.
- Single File Output: To aid project management, all required C code is contained within a single file, making integration straightforward.
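To make these points concrete, here is a hand-written sketch of the style of code onnx2c aims to produce. It is not actual tool output; the names, sizes, and weights are invented for illustration, and real output depends entirely on the exported model.

/* Hypothetical illustration of the style of generated code.
 * NOT actual onnx2c output; names, sizes and weights are invented. */
#include <math.h>   /* the only library dependency of the generated code */

/* Weights and biases are emitted as compile-time constants. */
static const float dense_weight[4][2] = {
    { 0.12f, -0.87f }, { 0.44f, 0.09f }, { -0.31f, 0.57f }, { 0.78f, -0.05f }
};
static const float dense_bias[4] = { 0.01f, -0.02f, 0.00f, 0.03f };

/* Intermediate tensors are statically allocated: no malloc(), no printf(). */
static float dense_out[4];

/* A single entry point runs the whole inference. */
void entry(const float input[2], float output[4])
{
    for (int i = 0; i < 4; ++i) {
        float acc = dense_bias[i];
        for (int j = 0; j < 2; ++j)
            acc += dense_weight[i][j] * input[j];
        dense_out[i] = fmaxf(acc, 0.0f);   /* ReLU using only math.h */
        output[i] = dense_out[i];
    }
}

Everything above lives in a single translation unit, matching the single-file output described in the list.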
Design Philosophy
Onnx2c is built with the intention of minimizing complexity for users. If a user can export their trained neural network to an ONNX file using frameworks like PyTorch or TensorFlow, onnx2c provides a straightforward path to incorporate this model into a microcontroller project.
However, certain features are intentionally not included in onnx2c to maintain focus on its primary objective:
- Broad ONNX Specification Coverage: Currently, onnx2c implements only 91 of the 166 ONNX operands, focusing on those most relevant to "Tiny ML".
- Hardware Accelerators: Usage of accelerators is not within the scope of onnx2c.
- Training on Device: Onnx2c is designed for inference only, not for training models or backpropagation.
Building the Project
To build onnx2c, users need to have the Protocol Buffers libraries installed; this can be done with a package manager such as apt on Ubuntu or brew on macOS. The build process involves cloning the repository, updating its submodules, and building with CMake:
git clone https://github.com/kraiskil/onnx2c.git
cd onnx2c
git submodule update --init
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make onnx2c
Usage Instructions
After building, the onnx2c binary can be used directly by providing it with an ONNX model file:
./onnx2c [your ONNX model file] > model.c
The output model.c will contain a function void entry(...), which should be called from the main program to perform inference, with arguments corresponding to the model's inputs.
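As a usage sketch, a caller could look like the following. The exact entry() signature is generated from the model, so the prototype below is an assumption for a hypothetical model with a 10-element float input and a 2-element float output; the real parameter list should be copied from the generated model.c.

/* main.c -- hypothetical caller for a generated model.c.
 * The prototype below is assumed for illustration; copy the real one
 * from model.c, since it is derived from the model's inputs. */
void entry(const float input[10], float output[2]);

static float input_data[10];
static float output_data[2];

int main(void)
{
    /* Fill input_data from a sensor reading or a test vector here. */
    entry(input_data, output_data);             /* one inference pass */
    return output_data[0] > output_data[1];     /* use the result as needed */
}

Compile and link model.c together with this caller in the embedded project, as with any other C source file.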
Using compiler optimizations such as -ffast-math can significantly enhance computation speed.
Optimization and Performance
Onnx2c includes several optimization techniques to improve the efficiency of the generated code:
- Tensor Unionization: Reduces memory usage by combining intermediate tensors that are not in use at the same time into shared storage (sketched after this list).
- Eliminating Cast Nodes: Removes Cast nodes from the generated code by adjusting the output of the preceding node instead.
- AVR Processor Optimizations: Places constants in instruction memory for space efficiency.
- Experimental Quantization: Converts floating point calculations to integers, reducing computation cost.
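The tensor unionization idea referenced in the list can be pictured in plain C: intermediate tensors whose lifetimes never overlap are placed in a union so they share storage. The snippet below is an illustrative sketch with made-up layer names and sizes, not actual onnx2c output.

/* Illustrative sketch of tensor unionization (names and sizes invented).
 * conv1_out is dead by the time conv2_out is written, so the two buffers
 * can overlap; RAM cost is the larger of the two instead of their sum. */
static union {
    float conv1_out[8 * 12 * 12];   /* consumed by the pooling layer */
    float conv2_out[16 * 4 * 4];    /* written only after conv1_out is dead */
} scratch;

static float pool1_out[8 * 6 * 6];  /* live in between, so kept separate */

Because the buffers are fixed at compile time, the saving shows up directly in the program's static RAM footprint.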
The performance of code generated by onnx2c has been benchmarked on microcontrollers such as STM32, showing efficient memory usage and fast inference times compared to other tools like STM32CubeAI.
Logging and Debugging
Onnx2c supports configurable logging to assist in debugging and optimization, with levels ranging from fatal-error reporting to detailed traces of execution steps. Additional scripts and documentation are provided for running .onnx models on development boards, helping users assess whether their network fits the target hardware before starting extensive training.
Development and Future Enhancements
Documentation for developers is available, providing insights into testing and improvement processes. Continuous benchmarks and tests aim to further refine onnx2c’s capabilities and performance, with user feedback driving future updates and features.
In summary, onnx2c is a versatile tool aimed at seamlessly enabling machine learning on resource-constrained devices, embodying the principles of simplicity, efficiency, and ease of integration.