OOTDiffusion: A Revolutionary Virtual Try-On System
OOTDiffusion is an innovative project that offers a cutting-edge solution for virtual try-on, letting users see how garments would look on a person in a controllable and realistic manner. The official implementation is publicly available, with model checkpoints and an interactive demo hosted on the popular Hugging Face platform for users to explore the technology.
What is OOTDiffusion?
OOTDiffusion stands for "Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on." It is a sophisticated model developed by researchers Yuhao Xu, Tao Gu, Weifeng Chen, and Chengcai Chen at Xiao-i Research. The model generates realistic images of a person wearing a chosen garment, using outfitting fusion within a latent diffusion framework to blend clothing and body seamlessly.
Core Features
- Model Checkpoints: The project offers pre-trained model checkpoints on datasets like VITON-HD for half-body try-on and Dress Code for full-body try-on. These checkpoints serve as the foundational models for the virtual try-on process.
- Hugging Face Integration: Users can access the OOTDiffusion models directly through Hugging Face, which hosts both the checkpoints and a live demo (see the download sketch after this list).
- ONNX Support: The project provides ONNX support for the human-parsing step, which resolves most common environment issues and improves portability (a minimal inference sketch also follows this list).
- Compatibility: The system has been tested on Linux (specifically Ubuntu 22.04), ensuring stability and performance.
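As an illustration, the pre-trained checkpoints can be fetched programmatically with the huggingface_hub library. This is a minimal sketch, assuming the checkpoints live in the levihsu/OOTDiffusion repository on Hugging Face and that a local checkpoints/ folder is the desired destination:

from huggingface_hub import snapshot_download

# Download the released OOTDiffusion checkpoints (VITON-HD half-body and
# Dress Code full-body weights) into a local folder. The repo id below is
# an assumption based on the project's Hugging Face page.
local_dir = snapshot_download(
    repo_id="levihsu/OOTDiffusion",
    local_dir="checkpoints",
)
print(f"Checkpoints downloaded to {local_dir}")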
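Because the human-parsing step is exposed through ONNX, it can in principle be run with onnxruntime alone. The sketch below is illustrative only, assuming a hypothetical parsing_model.onnx file and an image already preprocessed to the model's expected input shape:

import numpy as np
import onnxruntime as ort

# Load a human-parsing model exported to ONNX (the file name is hypothetical).
session = ort.InferenceSession("parsing_model.onnx", providers=["CPUExecutionProvider"])

# Dummy input standing in for a preprocessed person image (NCHW, float32).
input_name = session.get_inputs()[0].name
image = np.random.rand(1, 3, 512, 512).astype(np.float32)

# Run inference; the output is a per-pixel segmentation of body and clothing regions.
(parsing_map,) = session.run(None, {input_name: image})
print(parsing_map.shape)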
Installation Process
To start using OOTDiffusion, users need to follow a straightforward installation process:
- Clone the Repository: Clone the OOTDiffusion repository from GitHub and move into it.
git clone https://github.com/levihsu/OOTDiffusion
cd OOTDiffusion
- Environment Setup: Create a conda environment specifically for OOTDiffusion and install the necessary packages.
conda create -n ootd python==3.10
conda activate ootd
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
pip install -r requirements.txt
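As a quick sanity check after installation (a minimal sketch, not part of the official instructions), you can confirm that the pinned PyTorch build is active and that a GPU is visible:

import torch
import torchvision

# Verify that the pinned versions from the install step are the ones in use.
print(torch.__version__)        # expected: 2.0.1 (possibly with a build suffix, e.g. +cu117)
print(torchvision.__version__)  # expected: 0.15.2

# Diffusion inference is far faster on a GPU; check that one is visible.
print("CUDA available:", torch.cuda.is_available())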
Performing Inference
OOTDiffusion supports inference for both half-body and full-body models:
- Half-body Model: Run the inference using the commands below, replacing the placeholders with actual image paths.
cd OOTDiffusion/run
python run_ootd.py --model_path <model-image-path> --cloth_path <cloth-image-path> --scale 2.0 --sample 4
- Full-body Model: This uses the Dress Code checkpoint (--model_type dc) and requires specifying the garment category (0 for upper body, 1 for lower body, and 2 for dresses). A short script for driving these commands in a loop follows this list.
cd OOTDiffusion/run
python run_ootd.py --model_path <model-image-path> --cloth_path <cloth-image-path> --model_type dc --category 2 --scale 2.0 --sample 4
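To try several garments against the same model image, the CLI above can be driven from a short script. This is a hedged sketch that uses only the flags shown in this article; the image paths are placeholders:

import subprocess
from pathlib import Path

MODEL_IMAGE = "images/model.png"       # placeholder path to the person image
GARMENT_DIR = Path("images/garments")  # placeholder folder of cloth images

# Category codes for the full-body (dc) model: 0 = upper body,
# 1 = lower body, 2 = dresses.
CATEGORY = 2

for cloth in sorted(GARMENT_DIR.glob("*.png")):
    subprocess.run(
        [
            "python", "run_ootd.py",
            "--model_path", MODEL_IMAGE,
            "--cloth_path", str(cloth),
            "--model_type", "dc",
            "--category", str(CATEGORY),
            "--scale", "2.0",
            "--sample", "4",
        ],
        cwd="OOTDiffusion/run",  # matches the cd step in the commands above
        check=True,
    )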
Contribution and Future Developments
The team behind OOTDiffusion has laid out a clear roadmap: the paper, the Gradio demo, the inference code, and the model weights have all been released, while the training code remains in development. The project is open for collaboration and improvement.
Citing OOTDiffusion
For academic and research purposes, OOTDiffusion can be cited using the following reference:
@article{xu2024ootdiffusion,
title={OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on},
author={Xu, Yuhao and Gu, Tao and Chen, Weifeng and Chen, Chengcai},
journal={arXiv preprint arXiv:2403.01779},
year={2024}
}
Conclusion
OOTDiffusion is setting a new standard in virtual try-on technology. With its robust models, accessible Gradio demo, and active development roadmap, it paves the way for more interactive and realistic digital fashion experiences.