Project Overview: HCP-Diffusion
HCP-Diffusion is a toolbox for Stable Diffusion models, built on 🤗 Diffusers. It provides a flexible framework for configuring and combining the components of the model training process, and it is more adaptable than similar tools such as webui and sd-scripts.
HCP-Diffusion also supports Colossal-AI, which can substantially reduce GPU memory consumption.
A key feature of HCP-Diffusion is that it consolidates the training techniques and model structures for text-to-image generation into a single .yaml configuration file. The toolbox also introduces DreamArtist++, an enhanced version of DreamArtist that uses LoRA and is designed to generate high-quality images from text with improved efficiency and control.
Notable Features
- Layer-wise LoRA: Supports layer-wise LoRA, including LoRA for Conv2d layers.
- Finer Control Over Layers: Allows precise fine-tuning and model ensemble at the layer level.
- Prompt-tuning Flexibility: Facilitates prompt refinement using multiple words.
- Enhanced DreamArtist Tools: Includes both DreamArtist and its advanced version, DreamArtist++.
- Innovative ARB Technology: Aspect Ratio Bucket with automated clustering.
- Multiple Dataset Support: Capable of integrating multiple data sources.
- Attention Mechanisms: Implements image attention masks and word attention multipliers.
- Custom Words: Supports custom words, each of which may occupy multiple word positions in the prompt.
- Extended Input Policies: Provides mechanisms for expanding maximum sentence lengths.
- Integration with Modern Technologies: Uses 🤗 Accelerate, Colossal-AI, and xFormers.
- Advanced Training Tactics: Includes tag shuffle and tag dropout, along with Safetensors support.
- Controlnet Adaptation: Supports training with Controlnet.
- Custom Optimization Strategies: Offers a variety of custom optimizers and learning rate schedulers.
- Support for SDXL models.
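Two of the features listed above, aspect ratio bucketing and tag shuffle with dropout, can be illustrated in a few lines of plain Python. This is a minimal sketch of the general techniques, not HCP-Diffusion's implementation: the function names, the bucket resolutions, and the comma-separated caption format are assumptions made for the example, and the bucket lookup uses a fixed list rather than the automatic clustering the toolbox provides.

```python
import random
from typing import Sequence, Tuple


def nearest_bucket(width: int, height: int,
                   buckets: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
    """Return the (width, height) bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


def shuffle_and_drop_tags(caption: str, drop_prob: float,
                          rng: random.Random) -> str:
    """Shuffle comma-separated tags, then drop each tag with probability drop_prob."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    rng.shuffle(tags)
    kept = [t for t in tags if rng.random() >= drop_prob]
    return ", ".join(kept)


# Hypothetical bucket resolutions, chosen only for this example.
BUCKETS = [(512, 512), (640, 384), (384, 640)]

if __name__ == "__main__":
    # A 16:9 image lands in the widest bucket.
    print(nearest_bucket(1920, 1080, BUCKETS))  # (640, 384)
    # The tag order and surviving tags depend on the random generator.
    print(shuffle_and_drop_tags("1girl, outdoors, smile", 0.1, random.Random(0)))
```

Grouping images into buckets lets each training batch share one resolution without cropping every image to a square, while shuffling and dropping tags keeps the model from depending on a fixed tag order.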
Installation Guide
Via pip
You can install HCP-Diffusion directly using pip:
pip install hcpdiff
# Initialize a new project
hcpinit
From Source
To install from source:
git clone https://github.com/7eu7d7/HCP-Diffusion.git
cd HCP-Diffusion
pip install -e .
# Either modify this checkout directly, or initialize a new project:
# hcpinit
To reduce VRAM usage and improve speed, install xFormers:
# with conda
conda install xformers -c xformers
# or pip
# quote the requirement so the shell does not treat ">" as a redirect
pip install "xformers>=0.0.17"
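After installing, it can be useful to confirm that xFormers is present and meets the 0.0.17 minimum given above. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the version comparison looks only at the leading numeric components, so it is a rough check rather than a full PEP 440 comparison.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numeric_prefix(version: str) -> Tuple[int, ...]:
    """Extract leading numeric components, e.g. '0.0.22.post7' -> (0, 0, 22)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare two versions by their leading numeric components only."""
    return numeric_prefix(version) >= numeric_prefix(minimum)


if __name__ == "__main__":
    found = installed_version("xformers")
    if found is None:
        print("xformers is not installed")
    else:
        print(f"xformers {found}; meets 0.0.17: {meets_minimum(found, '0.0.17')}")
```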
User Instructions
Training
Training can be run with scripts for either 🤗 Accelerate or Colossal-AI. Configure the 🤗 Accelerate environment before launching a training run.
- With 🤗 Accelerate:
accelerate launch -m hcpdiff.train_ac --cfg cfgs/train/cfg_file.yaml
- For single-GPU work:
accelerate launch -m hcpdiff.train_ac_single --cfg cfgs/train/cfg_file.yaml
- Using Colossal-AI with torchrun:
torchrun --nproc_per_node 1 -m hcpdiff.train_colo --cfg cfgs/train/cfg_file.yaml
Inference
To generate images from text prompts, run the visualizer with a configuration file and key=value overrides:
python -m hcpdiff.visualizer --cfg cfgs/infer/cfg.yaml pretrained_model=pretrained_model_path \
prompt='positive_prompt' \
neg_prompt='negative_prompt' \
seed=42
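To generate images for several seeds, the command above can be assembled in a script instead of typed by hand. The sketch below builds the same argument list in Python. The helper name visualizer_command is hypothetical, the override keys are the ones shown in the command above, and how hcpdiff parses values containing special characters is not covered here.

```python
import subprocess
import sys
from typing import Dict, List


def visualizer_command(cfg: str, overrides: Dict[str, object]) -> List[str]:
    """Build the argv list for `python -m hcpdiff.visualizer` with key=value overrides."""
    command = [sys.executable, "-m", "hcpdiff.visualizer", "--cfg", cfg]
    command += [f"{key}={value}" for key, value in overrides.items()]
    return command


if __name__ == "__main__":
    for seed in (42, 43, 44):
        argv = visualizer_command(
            "cfgs/infer/cfg.yaml",
            {
                "pretrained_model": "pretrained_model_path",
                "prompt": "positive_prompt",
                "neg_prompt": "negative_prompt",
                "seed": seed,
            },
        )
        print(" ".join(argv))
        # Uncomment to actually run; requires hcpdiff and a converted model.
        # subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting, so the single quotes around the prompts in the shell command are not needed here.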
Converting Stable Diffusion Models
Before using this toolbox, convert Stable Diffusion models to the supported format.
- The conversion can be performed with the scripts provided by 🤗 Diffusers.
Example conversion command (flags in square brackets are optional):
python -m hcpdiff.tools.sd2diffusers \
--checkpoint_path "path_to_stable_diffusion_model" \
--original_config_file "path_to_config_file" \
--dump_path "save_directory" \
[--extract_ema]
[--from_safetensors]
[--to_safetensors]
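The square brackets in the command above follow the usual convention for optional flags. As an illustration, the sketch below assembles the command in Python and appends a bracketed flag only when it is requested. The function name convert_command and its keyword arguments are hypothetical; only the module name and the flag names come from the command above.

```python
import sys
from typing import List


def convert_command(checkpoint_path: str, original_config_file: str, dump_path: str,
                    extract_ema: bool = False, from_safetensors: bool = False,
                    to_safetensors: bool = False) -> List[str]:
    """Build the argv list for `python -m hcpdiff.tools.sd2diffusers`."""
    command = [
        sys.executable, "-m", "hcpdiff.tools.sd2diffusers",
        "--checkpoint_path", checkpoint_path,
        "--original_config_file", original_config_file,
        "--dump_path", dump_path,
    ]
    # Optional switches are added only when explicitly enabled.
    if extract_ema:
        command.append("--extract_ema")
    if from_safetensors:
        command.append("--from_safetensors")
    if to_safetensors:
        command.append("--to_safetensors")
    return command


if __name__ == "__main__":
    argv = convert_command(
        "path_to_stable_diffusion_model",
        "path_to_config_file",
        "save_directory",
        to_safetensors=True,
    )
    print(" ".join(argv))
```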
Tutorials and Guidance
Detailed guides are provided to facilitate learning:
- Training Models Tutorial
- DreamArtist++ Usage Tutorial
- Model Inference Guide
- Configuration File Insights
- webui Model Conversion Guide
Contribution Invitation
Contributions that add features or support for more models are welcome.
Maintainers
The project is maintained by HCP-Lab at SYSU.
Researchers and developers may refer to the citation provided for academic or professional use.