Introduction to Truss
Overview
Truss is a tool designed to streamline the deployment of AI/ML models into production environments. Inspired by the "write once, run anywhere" philosophy, it tackles the complexity of serving models efficiently and effectively. By packaging model code, weights, and dependencies together, Truss ensures a seamless transition from development to production, avoiding the discrepancies that often appear at deployment time.
Key Features
- Write Once, Run Anywhere: Truss allows developers to consolidate everything their model needs into a single package. Whether in development or production, the model server behaves consistently, reducing the friction typically encountered when moving a model to production.
- Fast Developer Feedback Loop: By eliminating the need for Docker or Kubernetes configuration, Truss provides an all-inclusive serving environment. A live reload server lets developers see the impact of their changes almost instantly, speeding up development.
- Framework Agnostic: Truss supports a wide range of Python ML frameworks, including transformers, PyTorch, and TensorFlow. This makes it versatile enough to handle everything from basic models to complex architectures.
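As a rough sketch of this framework-agnostic contract, any model that can be hidden behind a load()/predict() pair fits the same server. In the snippet below, a trivial rule-based classifier stands in for a real transformers or PyTorch model so the sketch runs without any framework installed; the actual interface is the one used in the Quickstart later in this guide.

```python
class Model:
    """Minimal stand-in for the class Truss expects in model/model.py."""

    def __init__(self, **kwargs):
        self._model = None  # the framework object is loaded lazily in load()

    def load(self):
        # A real Truss would build e.g. pipeline("text-classification") here;
        # a rule-based stand-in keeps this sketch dependency-free.
        self._model = lambda text: [
            {"label": "POSITIVE" if "awesome" in text.lower() else "NEGATIVE"}
        ]

    def predict(self, model_input):
        return self._model(model_input)


model = Model()
model.load()  # Truss calls load() once, at server startup
print(model.predict("Truss is awesome!"))  # [{'label': 'POSITIVE'}]
```

Swapping the stand-in for a real framework object changes only the body of load(); the serving contract stays identical.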
Examples and Use Cases
Truss demonstrates its capabilities by supporting well-known models such as:
- The Llama 2 series, spanning parameter scales from 7B to 70B.
- Stable Diffusion XL for art generation.
- Whisper for speech transcription.
These examples, alongside dozens more, illustrate Truss’s flexibility and power in action.
Installation and Getting Started
To install Truss, a single pip command suffices:

```shell
pip install --upgrade truss
```
Quickstart Guide
Here's a brief guide to start using Truss with a text classification model:
- Create a Truss: Initialize a Truss for a text classification task:

  ```shell
  truss init text-classification
  ```

  After naming your Truss (e.g., "Text Classification"), navigate into the new directory:

  ```shell
  cd text-classification
  ```
- Implement the Model: In model/model.py, create a Model class with two methods: load() and predict(). The former prepares the model for operation; the latter handles inference:

  ```python
  from transformers import pipeline


  class Model:
      def __init__(self, **kwargs):
          self._model = None

      def load(self):
          self._model = pipeline("text-classification")

      def predict(self, model_input):
          return self._model(model_input)
  ```
- Add Dependencies: Define the model's dependencies in config.yaml. For a text classification pipeline, you might specify:

  ```yaml
  requirements:
    - torch==2.0.1
    - transformers==4.30.0
  ```
Deployment
Truss is supported by Baseten, which provides infrastructure for deploying models. To deploy your model using Baseten:
- Get an API Key: Sign up on Baseten if you don't have an account, then generate an API key from the account settings.
- Deployment Command: With the API key ready, deploy with:

  ```shell
  truss push
  ```
Monitor the deployment via the Baseten model dashboard.
Invoking the Model
Post-deployment, you can run predictions through the terminal:
- Invocation Command:

  ```shell
  truss predict -d '"Truss is awesome!"'
  ```
- Expected Response:

  ```json
  [
    {
      "label": "POSITIVE",
      "score": 0.999873161315918
    }
  ]
  ```
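Under the hood, truss predict issues an HTTP request to the deployed model. The sketch below assembles such a request with the standard library; the endpoint URL is a placeholder (copy the real one from your Baseten model dashboard), and the Api-Key authorization format is an assumption to verify against Baseten's documentation.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # generated in your Baseten account settings
# Placeholder endpoint; the real URL appears on the Baseten model dashboard.
ENDPOINT = "https://model-xxxxxxx.api.baseten.co/production/predict"


def build_request(model_input):
    """Assemble the POST request for one prediction (sketch)."""
    # The body is raw JSON, matching truss predict -d '"Truss is awesome!"'.
    body = json.dumps(model_input).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Api-Key {API_KEY}",  # header format is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("Truss is awesome!")
print(req.data)  # b'"Truss is awesome!"'
# To actually invoke the deployed model:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The response body would carry the same list of label/score objects shown above.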
Community and Contribution
Truss is supported by Baseten and has been developed with contributions from various ML engineers globally. Notable contributors include Stephan Auerhahn and Daniel Sarfati. Community contributions are welcome, following the project's contributing guide and code of conduct.
Truss stands out as a versatile and efficient solution that simplifies the complexities of deploying and managing AI/ML models in production, freeing developers to focus on innovation and refinement.