Introduction to BentoDiffusion Project
The BentoDiffusion project is a series of examples from BentoML that shows how to deploy various models in the Stable Diffusion (SD) family. These models generate and manipulate images or short video clips from text descriptions.
What BentoDiffusion Offers
BentoDiffusion walks through the process of deploying models that create images from textual input. The examples give users hands-on practice with practical machine learning deployment, particularly for image generation and enhancement.
Prerequisites
To run these models locally, a system with an Nvidia GPU and at least 12 GB of VRAM is recommended; GPUs with less memory may fail to load the larger diffusion pipelines.
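A quick way to confirm the installed GPU and its total memory (assuming the Nvidia driver and its nvidia-smi utility are present) is:
nvidia-smi --query-gpu=name,memory.total --format=csv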
Setting Up Dependencies
Getting started with BentoDiffusion requires cloning the repository:
git clone https://github.com/bentoml/BentoDiffusion.git
cd BentoDiffusion/sdxl-turbo
Python 3.11 is the recommended version for this project. Install the required dependencies with:
pip install -r requirements.txt
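If you want to keep the project's packages isolated, you can first create a virtual environment with Python 3.11 and install into it, for example:
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt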
Running the BentoML Service
The core of each example is a BentoML service defined in the service.py file. To start the service locally, run the following command from the project directory:
bentoml serve .
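For context, the service.py file in a text-to-image example typically defines a class-based BentoML service that loads a diffusion pipeline at startup and exposes a txt2img API. The sketch below illustrates this pattern with the SDXL Turbo pipeline from the diffusers library; it is an approximation for illustration, not the exact contents of the repository's service.py, and the model ID, class name, and resource settings are assumptions.
import bentoml
import torch
from diffusers import AutoPipelineForText2Image
from PIL.Image import Image

# Assumed model ID for illustration; each example pins its own model.
MODEL_ID = "stabilityai/sdxl-turbo"

@bentoml.service(resources={"gpu": 1}, traffic={"timeout": 300})
class SDXLTurbo:
    def __init__(self) -> None:
        # Load the diffusion pipeline once at startup and move it to the GPU.
        self.pipe = AutoPipelineForText2Image.from_pretrained(
            MODEL_ID, torch_dtype=torch.float16, variant="fp16"
        )
        self.pipe.to("cuda")

    @bentoml.api
    def txt2img(
        self,
        prompt: str,
        num_inference_steps: int = 1,
        guidance_scale: float = 0.0,
    ) -> Image:
        # Run the pipeline and return the first generated image.
        return self.pipe(
            prompt=prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
        ).images[0]
Because the pipeline is loaded in __init__, the first startup downloads the model weights and can take a few minutes.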
Once started, the service is accessible at http://localhost:3000. You can interact with it through the Swagger UI in a browser or call it programmatically; for example, with curl:
curl -X 'POST' \
'http://localhost:3000/txt2img' \
-H 'accept: image/*' \
-H 'Content-Type: application/json' \
-d '{
"prompt": "A cinematic shot of a baby racoon wearing an intricate italian priest robe.",
"num_inference_steps": 1,
"guidance_scale": 0
}'
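Because the endpoint returns raw image bytes, appending curl's standard -o flag (for example, -o output.png) writes the generated image to a file instead of the terminal.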
Similarly, you can call the endpoint from Python with the BentoML client:
import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    result = client.txt2img(
        prompt="A cinematic shot of a baby racoon wearing an intricate italian priest robe.",
        num_inference_steps=1,
        guidance_scale=0.0
    )
Deploying to BentoCloud
After verifying the service locally, you can deploy it to BentoCloud for managed, scalable serving. This requires a BentoCloud account: sign up on the BentoCloud website and log in through the BentoML CLI. Deployment then takes a single command:
bentoml deploy .
After deployment, the application is available through a dedicated URL on BentoCloud. If you prefer to self-host, BentoML can also package the service as an OCI-compliant image for deployment on your own infrastructure.
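As a rough sketch of that self-hosting path (the tag below is a placeholder; bentoml build prints the actual tag to use):
bentoml build
bentoml containerize <bento_tag>
docker run --rm --gpus all -p 3000:3000 <bento_tag>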
Exploring Other Diffusion Models
BentoDiffusion is not limited to a single model. The repository's subdirectories contain deployable examples for several other diffusion models, including:
- ControlNet
- Latent Consistency Model
- Stable Diffusion 2 with 4x Upscaler
- SDXL Lightning
- SDXL Turbo
- Stable Video Diffusion
Each of these models targets a different capability, from controllable generation to upscaling and video, so you can pick the one that fits your use case.
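The other examples generally follow the same workflow shown above: change into the corresponding subdirectory, install its requirements, and serve or deploy the service. For instance (the directory name here is illustrative; check the repository for the exact layout):
cd ../controlnet
pip install -r requirements.txt
bentoml serve .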
Conclusion
In short, the BentoDiffusion project is an accessible starting point for individuals and teams who want to put diffusion models into production. By standardizing the deployment workflow and documenting each example, it serves as a practical tool in the AI practitioner's toolkit.