Introduction to Stable Diffusion Docker
The Stable Diffusion Docker project streamlines running the official Stable Diffusion releases published on Hugging Face inside a GPU-accelerated Docker container. This setup lets users generate visually captivating images from simple text prompts and from existing images.
Getting Started
Minimum System Requirements
The pipeline utilizes the full model and weights by default, necessitating a CUDA-capable GPU with at least 8GB of VRAM for efficient image creation. On less powerful machines, adjustments to these settings may be necessary. Users without a compatible GPU can opt for CPU-only rendering by specifying the --device cpu and --onnx options.
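As a sketch, a CPU-only render could look like the following (run from the repository root; the guard simply skips execution when build.sh is not present):

```shell
# CPU-only rendering: --device cpu selects the CPU backend and --onnx switches
# to the ONNX pipeline, which is typically faster than PyTorch on CPU.
cmd="./build.sh run --device cpu --onnx 'Andromeda galaxy in a bottle'"
echo "$cmd"
# Execute only when invoked from a stable-diffusion-docker checkout:
if [ -x ./build.sh ]; then eval "$cmd"; fi
```

Expect CPU rendering to be markedly slower than GPU rendering; the ONNX pipeline mitigates but does not eliminate the gap.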
Hugging Face Token
To use the official Stable Diffusion model, users must generate a user access token from their Hugging Face account. This token, stored in a file named token.txt, is required when building and running the Docker container in order to download the model weights.
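For example, the token file can be created from the shell. The value below is a placeholder, not a real token; generate your own under the access tokens page of your Hugging Face account:

```shell
# Write a Hugging Face user access token (placeholder shown) to token.txt in
# the repository root; build.sh reads this file when building the image.
printf '%s\n' 'hf_XXXXXXXXXXXXXXXXXXXXXXXX' > token.txt
```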
Setting Up
The process is streamlined using a single script named build.sh. Users can pull the latest stable-diffusion-docker version with ./build.sh pull. A valid Hugging Face user token must be specified via the --token option when running the script. Alternatively, the Docker image can be built locally prior to execution.
Building the Docker Image
To build the Docker image, ensure your Hugging Face token is saved in token.txt and execute:
./build.sh build
Running Stable Diffusion
Create an Image from Text (txt2img)
Generate a stunning image from a simple text description:
./build.sh run 'Andromeda galaxy in a bottle'
Transform Images with Different Techniques
- Image-to-Image (img2img): Convert an existing image with a new text prompt.
  ./build.sh run --image image.png 'Andromeda galaxy in a bottle'
- Depth-Guided Diffusion (depth2img): Modify an existing image using a depth map.
  ./build.sh run --model 'stabilityai/stable-diffusion-2-depth' --image image.png 'Detailed changes description'
- Instruct Pix2Pix (pix2pix): Alter an image using a text-based editing prompt.
  ./build.sh run --model 'timbrooks/instruct-pix2pix' --image image.png 'Detailed object changes'
- Stable UnCLIP Variations (unclip): Produce variations of an image.
  ./build.sh run --model 'stabilityai/stable-diffusion-2-1-unclip-small' --image image.png 'Detailed image description'
- Image Upscaling (upscale4x): Enhance image resolution fourfold.
  ./build.sh run --model 'stabilityai/stable-diffusion-x4-upscaler' --image image.png 'Andromeda galaxy in a bottle'
- Diffusion Inpainting (inpaint): Adjust masked areas of an image.
  ./build.sh run --model 'runwayml/stable-diffusion-inpainting' --image image.png --mask mask.png 'Andromeda galaxy in a bottle'
Customizing Outputs
The project provides several options to fine-tune the outputs, such as adjusting image size, rendering scale, and specifying seed values for reproducibility. Users can also reduce memory usage through settings like --half and --attention-slicing.
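A sketch combining these options follows; the --seed and --scale flag names are assumptions drawn from the project's option list and may vary by version, and the guard skips execution when build.sh is absent:

```shell
# Render at half precision (--half) with a fixed seed for repeatable output
# and an explicit guidance scale.
cmd="./build.sh run --half --seed 42 --scale 7.5 'Andromeda galaxy in a bottle'"
echo "$cmd"
if [ -x ./build.sh ]; then eval "$cmd"; fi
```

Re-running the same command with the same seed should reproduce the same image.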
Example Models and Commands
Multiple popular models are supported, like Stable Diffusion 2.0 and Dreamlike Diffusion. Here's how you might run a command using the OpenJourney model:
./build.sh run --model 'prompthero/openjourney' --prompt 'abstract art'
For systems with limited resources, options to reduce memory usage are available, including decreasing image dimensions and utilizing memory-efficient strategies.
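For instance, on a low-memory GPU one might combine smaller image dimensions with attention slicing; this is a sketch, and the --width and --height flag names are assumptions from the project's options:

```shell
# Smaller 256x256 output plus attention slicing lowers peak VRAM use below
# the default requirement.
cmd="./build.sh run --attention-slicing --width 256 --height 256 'Andromeda galaxy in a bottle'"
echo "$cmd"
if [ -x ./build.sh ]; then eval "$cmd"; fi
```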
File Outputs and Contributions
Images are saved as PNGs in the output folder, named after the prompt text. Contributions to the project are welcomed following the guidelines in the CONTRIBUTING.md document, which emphasize compliance with project standards and submission through pull requests.
Overall, the Stable Diffusion Docker project opens up a world of artistic possibilities by harnessing deep learning models' power in a manageable, containerized environment.