Introduction to B-LoRA: Implicit Style-Content Separation
B-LoRA is an innovative project designed to separate the style and content of an image implicitly, facilitating a wide range of image stylization applications. By utilizing the powerful techniques of Stable Diffusion XL (SDXL) and Low-Rank Adaptation (LoRA), B-LoRA provides users with tools for effortless image style transfer, text-based stylization, and consistent style generation. This document serves as a comprehensive guide to understanding and using the B-LoRA project.
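To illustrate why LoRA keeps training lightweight, the low-rank update replaces a full weight-matrix delta with the product of two narrow matrices. The dimensions below are illustrative toy values, not taken from B-LoRA's actual configuration (though the training command later in this guide uses rank 64):

```python
import numpy as np

# Illustrative dimensions: a 1024x1024 attention weight with a rank-64 update.
d, k, r = 1024, 1024, 64

B = np.random.randn(d, r) * 0.01  # trainable "down" factor
A = np.random.randn(r, k) * 0.01  # trainable "up" factor
delta_W = B @ A                   # the low-rank weight update, shape (d, k)

full_params = d * k               # parameters in a full-rank update
lora_params = d * r + r * k       # parameters actually trained by LoRA

print(delta_W.shape)              # (1024, 1024)
print(lora_params / full_params)  # 0.125 -- one eighth of the full update
```

Because only `B` and `A` are optimized while the base model stays frozen, each B-LoRA is small enough to train and swap cheaply.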
Project Highlights
B-LoRA's method extracts the style and content components of an image, enabling creative tasks such as:
- Image Style Transfer: Users can transfer the style from one image onto the content of another.
- Text-Based Stylization: B-LoRA supports creating stylized images based on textual descriptions.
- Consistent Style Generation: The project allows the creation of images that maintain a consistent style across different content.
Important Update
Update (May 21, 2024): newer versions of the diffusers and PEFT libraries broke the B-LoRA training process. Until this is resolved, users should install diffusers version 0.25.0.
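For example, the affected library can be pinned when installing (the exact pip invocation is a sketch; adjust it to your environment):

```shell
pip install "diffusers==0.25.0"
python -c "import diffusers; print(diffusers.__version__)"
```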
Getting Started
To get started with B-LoRA, users need to follow these steps:
Prerequisites
Before installation, ensure that you have:
- Python version 3.11.6 or higher.
- PyTorch version 2.1.1 or higher.
Additional dependencies are listed in the project's requirements.txt file.
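A quick way to confirm the interpreter meets the Python requirement is a small version check (the helper name here is our own, not part of the B-LoRA codebase):

```python
import sys

def meets_min_version(min_version=(3, 11)):
    """Return True if the running interpreter is at least `min_version`."""
    return sys.version_info >= min_version

# The project asks for Python 3.11.6 or higher.
print(meets_min_version((3, 11)))
```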
Installation
- Clone the Repository:
  Open your terminal and run the following commands to clone the repository from GitHub and navigate into it:

      git clone https://github.com/yardenfren1996/B-LoRA.git
      cd B-LoRA

- Install Dependencies:
  Install the required packages using pip:

      pip install -r requirements.txt
Usage
Training B-LoRA
To train B-LoRAs, use the script train_dreambooth_b-lora_sdxl.py as follows:
    accelerate launch train_dreambooth_b-lora_sdxl.py \
      --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
      --instance_data_dir="<path/to/example_images>" \
      --output_dir="<path/to/output_dir>" \
      --instance_prompt="<prompt>" \
      --resolution=1024 \
      --rank=64 \
      --train_batch_size=1 \
      --learning_rate=5e-5 \
      --lr_scheduler="constant" \
      --lr_warmup_steps=0 \
      --max_train_steps=1000 \
      --checkpointing_steps=500 \
      --seed="0" \
      --gradient_checkpointing \
      --use_8bit_adam \
      --mixed_precision="fp16"
Be sure to replace placeholders such as instance_data_dir, output_dir, and instance_prompt with the actual paths and prompt you wish to use.
Inference for Image Stylization
B-LoRA supports various stylization methods. Here's how to utilize them:
- Image Stylization by Reference:

      python inference.py --prompt="A <c> in <s> style" --content_B_LoRA="<path/to/content_B-LoRA>" --style_B_LoRA="<path/to/style_B-LoRA>" --output_path="<path/to/output_dir>"

- Text-Based Image Stylization:

      python inference.py --prompt="A <c> made of gold" --content_B_LoRA="<path/to/content_B-LoRA>" --output_path="<path/to/output_dir>"

- Consistent Style Generation:

      python inference.py --prompt="A backpack in <s> style" --style_B_LoRA="<path/to/style_B-LoRA>" --output_path="<path/to/output_dir>"
Adjust content_alpha, style_alpha, and num_images_per_prompt in inference.py to refine the outputs.
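Conceptually, the alpha parameters scale how strongly each B-LoRA's low-rank update contributes to the combined weights. The sketch below illustrates that blending on toy matrices; the actual merge logic lives in inference.py and may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)
W0 = rng.standard_normal((8, 8))                   # frozen base weight (toy size)
delta_content = rng.standard_normal((8, 8)) * 0.1  # content B-LoRA update
delta_style = rng.standard_normal((8, 8)) * 0.1    # style B-LoRA update

def combine(content_alpha=1.0, style_alpha=1.0):
    """Blend the base weights with scaled content and style updates."""
    return W0 + content_alpha * delta_content + style_alpha * delta_style

# style_alpha=0 recovers a purely content-driven weight.
assert np.allclose(combine(1.0, 0.0), W0 + delta_content)
```

Lowering one alpha weakens that component's influence on the generated image, which is why these knobs are useful for balancing style against content.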
Citation
Researchers using B-LoRA in their work are encouraged to cite the project's associated academic paper:
    @misc{frenkel2024implicit,
          title={Implicit Style-Content Separation using B-LoRA},
          author={Yarden Frenkel and Yael Vinker and Ariel Shamir and Daniel Cohen-Or},
          year={2024},
          eprint={2403.14572},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }
License and Contact
B-LoRA is licensed under the MIT License. For questions or suggestions, reach out via [email protected].
This overview provides a detailed introduction to the B-LoRA project, outlining its capabilities, setup, and usage, inviting exploration and creative application in image stylization tasks.