Introduction to Ortex
Ortex is a powerful tool designed to work with ONNX models, which are widely used in the field of machine learning. It acts as a bridge to the ONNX Runtime and makes deploying ONNX models straightforward, allowing them to run efficiently and concurrently in a variety of settings, including distributed clusters. Ortex simplifies the management and implementation of models by providing both a streamlined deployment system and a storage-only Nx tensor backend for easier manipulation of data.
Key Features
- ONNX Model-Compatible: Ortex seamlessly loads and facilitates fast inference of ONNX models, a standardized format that can be exported from popular machine learning libraries like PyTorch and TensorFlow.
- Multi-Backend Support: It takes advantage of different computational backends such as CUDA, TensorRT, Core ML, and ARM Compute Library, ensuring versatile performance on various platforms.
- Nx.Serving Integration: Ortex leverages the Nx.Serving framework for model deployment, enabling users to easily integrate models within their applications' supervision trees (see the sketch after this list). This feature is particularly useful for running models in production environments.
- Dynamic Input Handling: When inspecting models, Ortex accommodates dynamic input sizes; axes marked with `nil` represent variable dimensions (see the batch-size sketch after the examples below).
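Because Ortex.Serving plugs into Nx.Serving, a model can be placed under an application's supervision tree using the standard Nx.Serving child specification. Below is a minimal sketch; the MyApp.Serving name, the model path, and the batch settings are illustrative choices, not part of Ortex's API:

```elixir
# A minimal sketch of supervising an Ortex model via the standard
# Nx.Serving child spec. Names and paths are placeholders.
model = Ortex.load("./models/resnet50.onnx")

children = [
  {Nx.Serving,
   serving: Nx.Serving.new(Ortex.Serving, model),
   name: MyApp.Serving,
   batch_size: 4,
   batch_timeout: 100}
]

Supervisor.start_link(children, strategy: :one_for_one)

# Callers anywhere in the application route requests to the named process:
batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}])
{result} = Nx.Serving.batched_run(MyApp.Serving, batch)
```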
Practical Examples
To illustrate the capabilities of Ortex, consider these examples:
- Model Loading and Running: Ortex allows users to load a model file, such as `resnet50.onnx`, and inspect its input and output specifications:

  ```elixir
  iex> model = Ortex.load("./models/resnet50.onnx")
  #Ortex.Model<
    inputs: [{"input", "Float32", [nil, 3, 224, 224]}]
    outputs: [{"output", "Float32", [nil, 1000]}]>
  ```
- Inference Execution: Execute the model to get results, such as identifying the maximum value index from the model's output:

  ```elixir
  iex> {output} = Ortex.run(model, Nx.broadcast(0.0, {1, 3, 224, 224}))
  iex> output |> Nx.backend_transfer() |> Nx.argmax
  #Nx.Tensor<
    s64
    499
  >
  ```
- Serving a Model: Ortex streamlines the process of deploying a model within a serving environment:

  ```elixir
  iex> serving = Nx.Serving.new(Ortex.Serving, model)
  iex> batch = Nx.Batch.stack([{Nx.broadcast(0.0, {3, 224, 224})}])
  iex> {result} = Nx.Serving.run(serving, batch)
  iex> result |> Nx.backend_transfer() |> Nx.argmax(axis: 1)
  #Nx.Tensor<
    s64[1]
    [499]
  >
  ```
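Returning to the dynamic input handling noted above, the `nil` leading axis in the inspected model means any batch size is accepted at inference time. A small sketch reusing the model loaded in the first example; the `{8, 1000}` shape follows from the model's declared output:

```elixir
# The nil leading axis is dynamic, so a batch of 8 is as valid as a batch of 1.
iex> {output} = Ortex.run(model, Nx.broadcast(0.0, {8, 3, 224, 224}))
iex> Nx.shape(output)
{8, 1000}
```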
Installation
To start working with Ortex, users must first include it in their project dependencies. The following lines can be added to the `mix.exs` file:
```elixir
def deps do
  [
    {:ortex, "~> 0.1.9"}
  ]
end
```
Moreover, Rust must be installed for the package to compile successfully, since Ortex builds its native ONNX Runtime bindings during compilation.
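For a quick experiment outside a full Mix project, for example in a script or Livebook, Ortex can also be pulled in with Mix.install. A minimal smoke test follows, assuming an illustrative model path (Rust is still required for compilation):

```elixir
# Fetches and compiles Ortex on first run (requires a Rust toolchain).
Mix.install([{:ortex, "~> 0.1.9"}])

# Load a model and run a dummy input through it; the path is illustrative.
model = Ortex.load("./models/resnet50.onnx")
{output} = Ortex.run(model, Nx.broadcast(0.0, {1, 3, 224, 224}))

output
|> Nx.backend_transfer()
|> Nx.argmax()
|> IO.inspect()
```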
By abstracting the complexities of deploying and managing ONNX models, Ortex empowers developers and machine learning practitioners to put trained models to work with ease, streamlining their workflows and bringing efficient model inference to their applications.