OOTDiffusion: A Revolutionary Virtual Try-On System
OOTDiffusion is an innovative project that offers a cutting-edge solution for virtual try-on, letting users see how garments would look on a person in a controllable and realistic manner. The official implementation is publicly available, with model checkpoints and an interactive demo hosted on the popular Hugging Face platform for users to explore the technology.
What is OOTDiffusion?
OOTDiffusion stands for "Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on." It is a sophisticated model developed by researchers Yuhao Xu, Tao Gu, Weifeng Chen, and Chengcai Chen at Xiao-i Research. The model generates realistic images of a person wearing a chosen garment, using outfitting fusion within a latent diffusion framework to blend clothing and body seamlessly.
Core Features
- Model Checkpoints: The project offers pre-trained model checkpoints on datasets like VITON-HD for half-body try-on and Dress Code for full-body try-on. These checkpoints serve as the foundational models for the virtual try-on process.
- Hugging Face Integration: Users can access the OOTDiffusion models directly through Hugging Face, which hosts both the checkpoints and a live demo (see the download sketch after this list).
- ONNX Support: The project provides ONNX support for the human-parsing step, which resolves most common environment issues and improves portability (a minimal inference sketch also follows this list).
- Compatibility: The system has been tested on Linux (specifically Ubuntu 22.04), ensuring stability and performance.
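As an illustration, the pre-trained checkpoints can be fetched programmatically with the huggingface_hub library. This is a minimal sketch, assuming the checkpoints live in the levihsu/OOTDiffusion repository on Hugging Face and that a local checkpoints/ folder is the desired destination:

from huggingface_hub import snapshot_download

# Download the released OOTDiffusion checkpoints (VITON-HD half-body and
# Dress Code full-body weights) into a local folder. The repo id below is
# an assumption based on the project's Hugging Face page.
local_dir = snapshot_download(
    repo_id="levihsu/OOTDiffusion",
    local_dir="checkpoints",
)
print(f"Checkpoints downloaded to {local_dir}")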
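Because the human-parsing step is exposed through ONNX, it can in principle be run with onnxruntime alone. The sketch below is illustrative only, assuming a hypothetical parsing_model.onnx file and an image already preprocessed to the model's expected input shape:

import numpy as np
import onnxruntime as ort

# Load a human-parsing model exported to ONNX (the file name is hypothetical).
session = ort.InferenceSession("parsing_model.onnx", providers=["CPUExecutionProvider"])

# Dummy input standing in for a preprocessed person image (NCHW, float32).
input_name = session.get_inputs()[0].name
image = np.random.rand(1, 3, 512, 512).astype(np.float32)

# Run inference; the output is a per-pixel segmentation of body and clothing regions.
(parsing_map,) = session.run(None, {input_name: image})
print(parsing_map.shape)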
Installation Process
To start using OOTDiffusion, users need to follow a straightforward installation process:
- Clone the Repository: Clone the OOTDiffusion repository from GitHub and move into it.
git clone https://github.com/levihsu/OOTDiffusion
cd OOTDiffusion
- Environment Setup: Create a conda environment specifically for OOTDiffusion and install the necessary packages.
conda create -n ootd python==3.10
conda activate ootd
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
pip install -r requirements.txt
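As a quick sanity check after installation (a minimal sketch, not part of the official instructions), you can confirm that the pinned PyTorch build is active and that a GPU is visible:

import torch
import torchvision

# Verify that the pinned versions from the install step are the ones in use.
print(torch.__version__)        # expected: 2.0.1 (possibly with a build suffix, e.g. +cu117)
print(torchvision.__version__)  # expected: 0.15.2

# Diffusion inference is far faster on a GPU; check that one is visible.
print("CUDA available:", torch.cuda.is_available())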
Performing Inference
OOTDiffusion supports inference for both half-body and full-body models:
- Half-body Model: Run the inference using the commands below, replacing the placeholders with actual image paths.
cd OOTDiffusion/run
python run_ootd.py --model_path <model-image-path> --cloth_path <cloth-image-path> --scale 2.0 --sample 4
- Full-body Model: This uses the Dress Code checkpoint (--model_type dc) and requires specifying the garment category (0 for upper body, 1 for lower body, and 2 for dresses). A short script for driving these commands in a loop follows this list.
cd OOTDiffusion/run
python run_ootd.py --model_path <model-image-path> --cloth_path <cloth-image-path> --model_type dc --category 2 --scale 2.0 --sample 4
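To try several garments against the same model image, the CLI above can be driven from a short script. This is a hedged sketch that uses only the flags shown in this article; the image paths are placeholders:

import subprocess
from pathlib import Path

MODEL_IMAGE = "images/model.png"       # placeholder path to the person image
GARMENT_DIR = Path("images/garments")  # placeholder folder of cloth images

# Category codes for the full-body (dc) model: 0 = upper body,
# 1 = lower body, 2 = dresses.
CATEGORY = 2

for cloth in sorted(GARMENT_DIR.glob("*.png")):
    subprocess.run(
        [
            "python", "run_ootd.py",
            "--model_path", MODEL_IMAGE,
            "--cloth_path", str(cloth),
            "--model_type", "dc",
            "--category", str(CATEGORY),
            "--scale", "2.0",
            "--sample", "4",
        ],
        cwd="OOTDiffusion/run",  # matches the cd step in the commands above
        check=True,
    )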
Contribution and Future Developments
The team behind OOTDiffusion has laid out a clear roadmap: the paper, the Gradio demo, the inference code, and the model weights have all been released, while the training code remains in development. The project is open for collaboration and improvement.
Citing OOTDiffusion
For academic and research purposes, OOTDiffusion can be cited using the following reference:
@article{xu2024ootdiffusion,
title={OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on},
author={Xu, Yuhao and Gu, Tao and Chen, Weifeng and Chen, Chengcai},
journal={arXiv preprint arXiv:2403.01779},
year={2024}
}
Conclusion
OOTDiffusion is setting a new standard in virtual try-on technology. With its robust models, accessible Gradio demo, and active development roadmap, it paves the way for more interactive and realistic digital fashion experiences.