Introduction to ZenML
ZenML is an open-source MLOps framework that helps data science teams connect seamlessly to cloud infrastructure. By providing a simple yet powerful way to create machine learning pipelines, ZenML standardizes and streamlines ML practices.
Key Features
Simple, Integrated, End-to-End MLOps
ZenML allows data scientists and ML engineers to build machine learning pipelines with minimal code adjustments. By adding @step and @pipeline decorators to existing Python functions, users can quickly set up workflows. For example, data loading and model training can be wrapped in steps and combined into a pipeline:
from zenml import pipeline, step

@step
def load_data() -> dict:
    training_data = [[1, 2], [3, 4], [5, 6]]
    labels = [0, 1, 0]
    return {'features': training_data, 'labels': labels}

@step
def train_model(data: dict) -> None:
    total_features = sum(map(sum, data['features']))
    total_labels = sum(data['labels'])
    print(f"Trained model using {len(data['features'])} data points. "
          f"Feature sum is {total_features}, label sum is {total_labels}")

@pipeline
def simple_ml_pipeline():
    dataset = load_data()
    train_model(dataset)
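The decorator pattern above can be illustrated without ZenML installed. The sketch below uses stand-in step and pipeline decorators (invented here for illustration, not ZenML's internals) to show how wrapping plain functions lets a framework intercept and chain them:

```python
# Minimal sketch of decorator-based pipeline composition.
# These decorators are illustrative stand-ins, not ZenML's implementation.
def step(func):
    def wrapper(*args, **kwargs):
        print(f"Running step: {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

def pipeline(func):
    def wrapper(*args, **kwargs):
        print(f"Running pipeline: {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@step
def load_data() -> dict:
    return {"features": [[1, 2], [3, 4], [5, 6]], "labels": [0, 1, 0]}

@step
def train_model(data: dict) -> int:
    # Toy "training": just count the data points.
    return len(data["features"])

@pipeline
def simple_ml_pipeline() -> int:
    return train_model(load_data())

result = simple_ml_pipeline()  # runs both steps in order
```

Because each function stays ordinary Python, the same code can later run locally or on remote infrastructure once a real orchestrator replaces the stand-in decorators.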
Easy Cloud Integration
ZenML allows running pipelines on various platforms such as AWS, GCP, Azure, and Kubernetes without changing the code, making it accessible to those less familiar with infrastructure intricacies.
Users can deploy a remote stack using one-click deployment through the command line or a dashboard:
zenml stack deploy --provider aws
Users can also register pre-existing infrastructure seamlessly:
zenml stack register <STACK_NAME> --provider aws
Production Workloads Management
Once the MLOps stack is set up, running workloads on production infrastructure is straightforward:
zenml stack set <STACK_NAME>
python run.py
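The separation between pipeline code and execution backend can be pictured as a simple registry lookup: the pipeline function never changes, while the active stack decides where it runs. This is a hypothetical sketch (the stack names, ORCHESTRATORS registry, and run_on_stack helper are invented for illustration, not ZenML APIs):

```python
# Sketch of code/infrastructure separation via an "active stack" registry.
# Names here are hypothetical illustrations, not ZenML internals.
ORCHESTRATORS = {
    "local": lambda fn: fn(),                        # run in-process
    "aws":   lambda fn: f"submitted {fn.__name__}",  # pretend remote submission
}

active_stack = "local"

def run_on_stack(pipeline_fn):
    """Dispatch the same pipeline function to whichever stack is active."""
    return ORCHESTRATORS[active_stack](pipeline_fn)

def my_pipeline():
    return "trained"

print(run_on_stack(my_pipeline))   # executes locally
active_stack = "aws"
print(run_on_stack(my_pipeline))   # same code, different backend
```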
Define resources and settings directly in the code to leverage specific infrastructure capabilities:
from zenml.config import ResourceSettings, DockerSettings

@step(
    settings={
        "resources": ResourceSettings(memory="16GB", gpu_count="1", cpu_count="8"),
        "docker": DockerSettings(parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime"),
    }
)
def training(...):
    ...
Model and Artifact Tracking
ZenML ensures complete traceability and auditability. Users can track who created a model, when, with what data, and on which code version:
from typing import Annotated

import pandas as pd
import torch

from zenml import Model, step

@step(model=Model(name="classification"))
def trainer(training_df: pd.DataFrame) -> Annotated[torch.nn.Module, "model"]:
    ...
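The kind of lineage record this enables can be sketched in plain Python: who trained the model, when, against which data fingerprint, and on which code version. The field names and schema below are illustrative inventions, not ZenML's actual metadata format:

```python
import hashlib
import json
import os
from datetime import datetime, timezone

def lineage_record(model_name: str, training_data: list, code_version: str) -> dict:
    """Bundle the provenance facts a model registry would track.
    Schema is illustrative, not ZenML's actual metadata format."""
    data_fingerprint = hashlib.sha256(
        json.dumps(training_data, sort_keys=True).encode()
    ).hexdigest()
    return {
        "model": model_name,
        "created_by": os.environ.get("USER", "unknown"),
        "created_at": datetime.now(timezone.utc).isoformat(),
        "data_sha256": data_fingerprint,
        "code_version": code_version,
    }

record = lineage_record("classification", [[1, 2], [3, 4]], "git:abc1234")
```

Hashing the serialized training data gives a stable fingerprint, so the same data always maps to the same record while any change is immediately visible in the audit trail.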
Integration with Favorite Tools
ZenML can integrate with existing tools, avoiding vendor lock-in. Users can incorporate it into their workflows with tools like MLflow for experiment tracking or BentoML for deployment:
from bentoml._internal.bento import bento

@step(on_failure=alert_slack, experiment_tracker="mlflow")
def train_and_deploy(training_df: pd.DataFrame) -> bento.Bento:
    mlflow.autolog()
    ...
    return bento
Getting Started
To begin with ZenML, users can easily install it via PyPI and take a guided tour using the quickstart guide:
pip install "zenml[server]" notebook
zenml go
ZenML thus provides an efficient and flexible way to handle machine learning operations, bringing together simplicity and integration while supporting various cloud platforms and tools.