# Image Synthesis
neural-doodle
The project implements deep neural network algorithms such as Semantic Style Transfer and Neural Patches to transform basic doodles into artworks by transferring artistic styles from renowned paintings. Its `doodle.py` script supports style transfer, image analogy, and texture synthesis through multiple input images and customizable settings. It runs on both CPU and GPU, with GPU use greatly reducing render times. Installation is available via Docker or manually, and with practice and parameter tuning users can refine images to near-photorealistic quality.
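As a quick orientation, here is a hedged sketch of driving `doodle.py` from Python; the flag names follow the project's documented CLI, while the sample file paths are placeholders.

```python
# Invoke neural-doodle's doodle.py for semantic style transfer.
# Paths below are illustrative placeholders, not files shipped with this list.
import subprocess

subprocess.run([
    "python3", "doodle.py",
    "--style", "samples/Monet.jpg",        # painting that supplies the style
    "--content", "samples/Coastline.jpg",  # photo or doodle to be re-rendered
    "--output", "output.png",
    "--device", "gpu0",                    # fall back to "cpu" if no GPU
    "--iterations", "40",                  # more iterations -> finer result
], check=True)
```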
ai4artists
Discover resources blending AI and art, with tools, tutorials, and insights from creative individuals. Ideal for enthusiasts in deep learning, creative coding, and generative art. Access guides on courses from institutions like MIT, delve into diffusion models and neural radiance fields, and explore curated books and videos. Engage with leading artists and institutions, offering a comprehensive resource for integrating AI in creative endeavors.
ReVersion
Explore Relation Inversion, which uses diffusion models to capture the relation shared by a set of exemplar images and synthesize new images that exhibit the same relation. ReVersion provides tools for generating relation-specific images across varied contexts, with optimized code and integration with platforms such as Hugging Face, plus ongoing updates and benchmarks for improved accessibility.
LFM
Discover a framework that applies flow matching in the latent space of pretrained autoencoders to improve efficiency and scalability in image synthesis. The approach sidesteps the high computational cost of pixel-space diffusion models and supports efficient training with limited resources. Validated on datasets such as CelebA-HQ and ImageNet, it is backed by a theoretical analysis via the Wasserstein-2 distance between the latent and data distributions.
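To make the core idea concrete, here is a minimal, illustrative sketch of a flow-matching loss in an autoencoder's latent space, not the repository's actual code; `encoder` is assumed to be a frozen pretrained autoencoder and `velocity_net` the network being trained.

```python
import torch
import torch.nn.functional as F

def latent_flow_matching_loss(velocity_net, encoder, images):
    with torch.no_grad():
        z1 = encoder(images)               # data point in latent space
    z0 = torch.randn_like(z1)              # noise sample
    t = torch.rand(z1.shape[0], *([1] * (z1.dim() - 1)), device=z1.device)
    zt = (1 - t) * z0 + t * z1             # straight-line interpolant
    target = z1 - z0                       # constant velocity of that path
    pred = velocity_net(zt, t.flatten())   # predict velocity at (zt, t)
    return F.mse_loss(pred, target)
```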
latent-consistency-model
Latent Consistency Models efficiently create high-resolution images in far fewer sampling steps. The project includes LCM-LoRA, an acceleration module that plugs into existing Stable Diffusion pipelines for faster image generation, compatible with platforms like Hugging Face and OpenXLab. Recent updates add the Pixart-α X LCM model and C# with ONNX Runtime support, offering real-time demos usable on multiple operating systems. The community is encouraged to contribute and engage via dedicated channels.
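A hedged example of few-step sampling with an LCM through Hugging Face diffusers; the community checkpoint id used below is an assumption.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16
).to("cuda")

# 4 steps instead of the 25-50 typical for standard diffusion sampling.
image = pipe(
    prompt="a watercolor painting of a lighthouse at dawn",
    num_inference_steps=4,
    guidance_scale=8.0,
).images[0]
image.save("lcm_sample.png")
```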
edm2
The official PyTorch code for the CVPR 2024 paper presents improvements in training dynamics of diffusion models for image synthesis. By addressing inefficiencies in the ADM diffusion model, the paper suggests network redesigns to maintain activation and weight balance without changing the overall structure. These optimizations improve FID scores from 2.41 to 1.81 on ImageNet-512, using deterministic sampling. A new method for post-training EMA parameter tuning is also introduced, enabling precise adjustments without extra training runs.
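As a rough illustration of the EMA idea, here is a sketch of a power-function EMA update of the kind the paper describes; the function name and example exponent are assumptions, and the paper's post-hoc step (recombining stored averaged snapshots after training) is not shown.

```python
import torch

@torch.no_grad()
def power_ema_update(ema_params, model_params, step, gamma=6.94):
    # Decay grows toward 1 as training progresses: beta_t = (1 - 1/t)^(gamma + 1).
    # gamma sets the averaging horizon; 6.94 is only an illustrative value.
    beta = (1.0 - 1.0 / max(step, 1)) ** (gamma + 1)
    for p_ema, p in zip(ema_params, model_params):
        p_ema.mul_(beta).add_(p, alpha=1.0 - beta)
```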
MasaCtrl
MasaCtrl's tuning-free mutual self-attention control enables consistent non-rigid image synthesis and editing by combining the content of a source image with layouts synthesized from prompts and additional controls. The method requires no fine-tuning and integrates with controllable diffusion models such as T2I-Adapter and ControlNet. It also supports video synthesis and adapts well to various Stable Diffusion variants, including Anything-V4. Tested with Python 3.8.5 and PyTorch 1.11, MasaCtrl slots into existing workflows to deliver reliable results.
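An illustrative sketch of the mutual self-attention idea: during the target path's self-attention, queries come from the target, while keys and values are taken from the source image's denoising path at the same layer and timestep. Names and shapes here are assumptions, not the repository's API.

```python
import torch
import torch.nn.functional as F

def mutual_self_attention(q_target, k_source, v_source, num_heads=8):
    # q_target: (batch, seq, dim); k_source/v_source: cached from the
    # source image's forward pass.
    b, n, d = q_target.shape
    h = num_heads
    q = q_target.reshape(b, n, h, d // h).transpose(1, 2)
    k = k_source.reshape(b, -1, h, d // h).transpose(1, 2)
    v = v_source.reshape(b, -1, h, d // h).transpose(1, 2)
    out = F.scaled_dot_product_attention(q, k, v)  # queries attend to source
    return out.transpose(1, 2).reshape(b, n, d)
```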
Awesome-Chinese-Stable-Diffusion
This initiative collects open-source models, applications, datasets, and tutorials for Chinese-language Stable Diffusion (SD), featuring data and algorithms specific to building SD models with Chinese capabilities. Contributions via pull requests are welcome; each entry lists its repository link and star count. Key models include SkyPaint for text-to-image synthesis, PAI-Diffusion for domain-specific images, and Taiyi-Diffusion-XL for bilingual functionality, making it a valuable reference for developers and researchers working on Chinese Stable Diffusion integration.
DMD2
Discover how advanced techniques improve Distribution Matching Distillation (DMD) by eliminating the regression loss and integrating a GAN loss for faster image synthesis. This approach enhances training stability and efficiency through multi-step sampling, achieving notable FID scores of 1.28 on ImageNet-64x64 and 8.35 on COCO 2014. The improved method reduces inference costs and supports fast generation of high-quality megapixel images.
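The combination of losses can be sketched roughly as follows; everything here (module names, the weighting, and the omission of noising samples before scoring) is a simplified assumption rather than the repository's code.

```python
import torch
import torch.nn.functional as F

def generator_loss(fake_images, real_score_fn, fake_score_fn,
                   discriminator, gan_weight=1e-3):
    # Distribution matching: push samples along the direction that makes the
    # student's distribution match the teacher's (score difference trick).
    with torch.no_grad():
        grad = fake_score_fn(fake_images) - real_score_fn(fake_images)
    dmd_loss = 0.5 * F.mse_loss(fake_images, (fake_images - grad).detach())

    # GAN term stands in for the regression loss of the original DMD recipe.
    logits = discriminator(fake_images)
    gan_loss = F.softplus(-logits).mean()  # non-saturating generator loss
    return dmd_loss + gan_weight * gan_loss
```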
scepter
SCEPTER is an open-source repository focused on generative training and inference, providing tools for image generation, transfer, and editing. It incorporates community approaches and Alibaba Tongyi Lab's proprietary methods, making it pivotal for AI-generated content research. Key features include a generative training framework, ease of implementing popular methods, and the SCEPTER Studio for interactive use. Recent updates add support for the FLUX framework, and introduce models like ACE for varied image editing and SCEdit for controllable synthesis, streamlining innovation in generative model development.
Kolors
Kolors improves text-to-image synthesis using advanced diffusion models, delivering high visual quality and semantic accuracy in both English and Chinese. Trained on billions of text-image pairs, it handles detailed and complex designs well. Recent updates add features such as virtual try-on, pose control, and identity-preserving face conditioning (IP-Adapter FaceID), accessible via Hugging Face and GitHub, with performance validated by comprehensive evaluations. The Kolors suite includes user-friendly pipelines for diffusion models, inpainting, and LoRA training, offering a robust solution for photorealistic image generation.
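A hedged example of text-to-image generation with Kolors via its diffusers integration; the checkpoint id and fp16 variant are assumptions based on the Hugging Face release.

```python
import torch
from diffusers import KolorsPipeline

pipe = KolorsPipeline.from_pretrained(
    "Kwai-Kolors/Kolors-diffusers", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# Kolors accepts both English and Chinese prompts.
image = pipe(prompt="一只戴着宇航员头盔的柯基犬, 照片级真实感",
             num_inference_steps=25, guidance_scale=5.0).images[0]
image.save("kolors_sample.png")
```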
GAN-Inversion
Delve into a comprehensive collection of GAN inversion resources encompassing diverse methods and applications in both 2D and 3D contexts. Published in TPAMI 2022, this survey highlights key academic works and technical implementations, including GAN latent space editing and real-world uses like image generation and facial recognition. Investigate inversion and editing approaches in both conventional GANs and advanced diffusion models, with direct access to related projects, academic papers, and source code.
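For readers new to the topic, here is a minimal sketch of optimization-based GAN inversion, one family of methods the survey covers: fit a latent code so the generator reproduces a target image. `generator` stands for any pretrained GAN generator; real methods typically add perceptual terms to the plain MSE used here.

```python
import torch

def invert(generator, target, latent_dim=512, steps=500, lr=0.05):
    z = torch.randn(1, latent_dim, requires_grad=True, device=target.device)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = generator(z)                      # G(z), same size as target
        loss = torch.mean((recon - target) ** 2)  # pixel reconstruction loss
        loss.backward()
        opt.step()
    return z.detach()  # edit z and re-synthesize to edit the image
```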
gigagan-pytorch
This implementation of GigaGAN, a state-of-the-art GAN from Adobe, is enhanced for faster convergence and improved stability, leveraging lightweight GAN techniques. It features 1k to 4k upsamplers, skip layer excitation, and an auxiliary reconstruction loss in the discriminator for high-resolution image synthesis. The project supports unconditional settings and integrates multi-GPU training via Huggingface's Accelerator, ensuring effective multi-scale input processing and stable training with an efficient gradient penalty application.
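An illustrative PyTorch module for skip layer excitation, the lightweight-GAN technique this repository borrows: a low-resolution feature map gates the channels of a high-resolution one. This is a generic sketch, not the repository's exact module.

```python
import torch
import torch.nn as nn

class SkipLayerExcitation(nn.Module):
    def __init__(self, low_ch, high_ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(4),        # squeeze low-res features to 4x4
            nn.Conv2d(low_ch, high_ch, 4),  # -> (B, high_ch, 1, 1)
            nn.SiLU(),
            nn.Conv2d(high_ch, high_ch, 1),
            nn.Sigmoid(),                   # per-channel gates in [0, 1]
        )

    def forward(self, high_res, low_res):
        return high_res * self.gate(low_res)  # broadcast over H x W
```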
UltraPixel
UltraPixel advances high-resolution image synthesis, facilitating the creation of detailed and high-quality images across multiple resolutions. Recent updates improve compatibility with PyTorch and Torchvision, optimizing generation speed on RTX 4090 GPUs. It offers text-guided and personalized image generation through a user-friendly Gradio interface and employs advanced pre-trained models. Memory-efficient techniques support resolutions up to 4K.
MagicClothing
Magic Clothing, a branch of OOTDiffusion, provides garment-driven image synthesis with independent control over garment features and text prompts, using model weights trained at 768 resolution. The project integrates IP-Adapter-FaceID and ControlNet-Openpose to condition generation on portrait and pose images. Continuous updates land on the earlyAccess branch; explore the demos to see its full capabilities as garment-driven image generation evolves.
Feedback Email: [email protected]