Ranni
Ranni is a text-to-image pipeline that pairs a large language model (LLM) for semantic comprehension with a diffusion model for drawing. Generation runs in two phases: an LLM-based planning component first interprets the text prompt, and the diffusion model then draws the image, which improves alignment with the prompt. The work was accepted as a CVPR 2024 oral paper. The release includes model weights, among them a LoRA-finetuned LLaMa-2-7B and a fully finetuned SDv2.1. Users can explore image creation interactively through Gradio demos and apply continuous edits to make targeted changes to an image.
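The two-phase flow described above can be illustrated with a minimal sketch. This is not the project's actual API: `Plan`, `generate`, `toy_planner`, and `toy_drawer` are hypothetical names, and the toy components stand in for the LLM and the diffusion model so the example runs without any weights.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Plan:
    """Hypothetical intermediate result handed from the planner to the drawer."""
    prompt: str
    elements: List[str] = field(default_factory=list)


def generate(prompt: str,
             planner: Callable[[str], Plan],
             drawer: Callable[[Plan], str]) -> str:
    """Run the two phases in order: plan first, then draw from the plan."""
    plan = planner(prompt)  # phase 1: LLM-based planning
    return drawer(plan)     # phase 2: diffusion-based drawing


def toy_planner(prompt: str) -> Plan:
    # Assumption: a real planner would query the finetuned LLM.
    # This stand-in only splits the prompt on " and ".
    elements = [part.strip() for part in prompt.split(" and ") if part.strip()]
    return Plan(prompt=prompt, elements=elements)


def toy_drawer(plan: Plan) -> str:
    # Assumption: a real drawer would run the finetuned diffusion model.
    # This stand-in returns a text description instead of an image.
    listed = ", ".join(plan.elements)
    return f"<image with {len(plan.elements)} planned element(s): {listed}>"


if __name__ == "__main__":
    print(generate("a red apple and a blue cup", toy_planner, toy_drawer))
```

Passing the planner and drawer in as callables keeps the two phases separable, so either stand-in could be swapped for a real model wrapper without touching `generate`.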