Introduction to the Wonnx Project
Wonnx is a GPU-accelerated ONNX inference runtime written entirely in Rust. Designed with the web in mind, it aims to provide fast and efficient inference across platforms.
Supported Platforms
Wonnx leverages wgpu to enable cross-platform compatibility. It supports the following platforms:
- Windows: First-class support with Vulkan and DirectX 12 (Windows 10 only); work is ongoing for DirectX 11.
- Linux & Android: Fully supported via Vulkan and Best Effort support through GLES3.
- macOS & iOS: Supported through Metal.
Getting Started
Wonnx offers multiple ways to get started depending on your needs:
Using the Command Line
First, ensure your system supports Vulkan, Metal, or DX12 so Wonnx can access the GPU. Then install the command-line tool by running:
cargo install --git https://github.com/webonnx/wonnx.git wonnx-cli
The CLI tool, nnx, can be used to experiment with ONNX models easily.
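For example, to inspect a model before running it (the info subcommand is part of the documented CLI; nnx --help lists all subcommands your installed version provides):
nnx info ../data/models/single_relu.onnx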
Using Rust
To use Wonnx in a Rust project, add it as a dependency:
cargo add wonnx
Refer to the Wonnx Examples or explore the API documentation for guidance on usage.
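As a first taste, the following minimal sketch mirrors the project's single_relu example; the exact API surface (Session::from_path, run, and the input tensor conversion) may differ slightly between wonnx versions:

use std::collections::HashMap;

// Load an ONNX model and run it on the GPU (async API, as in wonnx's examples).
async fn run_relu() -> Result<(), Box<dyn std::error::Error>> {
    let session = wonnx::Session::from_path("../data/models/single_relu.onnx").await?;

    // Map each ONNX input name to its tensor data.
    let mut inputs = HashMap::new();
    let data = vec![-1.0f32, 2.0];
    inputs.insert("x".to_string(), data.as_slice().into());

    // Outputs come back keyed by ONNX output name, e.g. {"y": [0.0, 2.0]}.
    let outputs = session.run(&inputs).await?;
    println!("{:?}", outputs.get("y"));
    Ok(())
}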
Using Python
Install the Python package:
pip install wonnx
You can use it as follows:
from wonnx import Session
session = Session.from_path("../data/models/single_relu.onnx")
inputs = {"x": [-1.0, 2.0]}
assert session.run(inputs) == {"y": [0.0, 2.0]}
More build instructions are available in the wonnx-py README.
In the Browser with WebGPU + WebAssembly
Install the JavaScript package:
npm install @webonnx/wonnx-wasm
Then implement it on the client side:
import init, { Session, Input } from "@webonnx/wonnx-wasm";
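// Initialize the WebAssembly module before creating a session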
await init();
const session = await Session.fromBytes(modelBytes);
const input = new Input();
input.insert("x", [13.0, -37.0]);
const result = await session.run(input);
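// Free the WebAssembly-side memory held by the session and input objects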
session.free();
input.free();
This setup uses the browser's WebGPU API together with a WebAssembly build of Wonnx for execution. See the wonnx-wasm example for more detailed usage.
Running Other Models
Before using a model, simplify it using nnx prepare, or use an external tool such as onnx-simplifier.
Tested Models
- Squeezenet
- MNIST
- BERT
GPU Selection
Set environment variables such as WGPU_ADAPTER_NAME, WGPU_BACKEND, and WGPU_POWER_PREFERENCE to choose a specific GPU adapter or change backend preferences.
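For example, to force the Vulkan backend for a CLI run (an illustrative invocation; backend names follow wgpu's conventions, e.g. vulkan, metal, dx12, gl):
WGPU_BACKEND=vulkan nnx info ../data/models/single_relu.onnx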
Contributions
Wonnx is open to contributions, even from those new to deep learning, WGSL, or Rust. Contributors can learn and explore these technologies while working on Wonnx. General steps for implementing a new operator include modifying compiler.rs, writing WGSL templates, and creating tests.
Supported Operators
Wonnx implements a broad subset of the operators defined by the ONNX IR specification. Each supported operator is implemented for specific opset versions, so you can check whether the operators in a given model are covered.
In conclusion, Wonnx is a robust and versatile way to deploy ONNX models across platforms. Whether you are working from the command line, building a Rust or Python application, or targeting the browser with WebGPU and WebAssembly, Wonnx provides a consistent path to GPU-accelerated inference.