Introduction to Awesome-Chinese-Stable-Diffusion
Overview
The Awesome-Chinese-Stable-Diffusion project is a curated repository dedicated to collecting and organizing various resources related to the Stable Diffusion framework within the Chinese context. This encompasses open-source models, applications, datasets, and tutorials that focus on Chinese language and culture. The project's primary aim is to support the development of Chinese-specific models and algorithms.
The creators of this project express gratitude towards contributors and encourage sharing new model repositories and related resources by submitting pull requests. They request that any contributions align with the project's format by providing essential details like repository links, star counts, and brief descriptions.
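Since the repository's exact entry template is not quoted here, a plausible submission following the described format (repository link, star count, brief description; every name and URL below is a placeholder) might look like:

```markdown
- Example-Model
  - Repository: https://github.com/example/example-model (stars: 1.2k)
  - Description: One-sentence summary of the model and the Chinese-language scenario it targets.
```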
Chinese Text-to-Image Models
Open Source Models
- SkyPaint
  - Repository: SkyPaint-AI-Diffusion
  - Description: SkyPaint is designed for generating images from text prompts. It comprises two main components: a text-prompt encoder based on OpenAI's CLIP, optimized for Chinese-English recognition, and a diffusion model that can produce high-quality modern art images.
- PAI-Diffusion
  - Repository: EasyNLP
  - Description: Developed by Alibaba's PAI team, PAI-Diffusion addresses the limitations of existing models trained on English data, producing models that better capture distinctly Chinese cultural and linguistic elements. It is tailored to specific scenes such as ancient poetry, anime, and more.
- Chinese Stable Diffusion - General Field
  - Repository: Model Overview
  - Description: This model adapts the original Stable Diffusion framework to Chinese contexts. It replaces the English OpenCLIP-ViT/H encoder with a Chinese CLIP text encoder, trained on extensive Chinese text-image pairs.
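The architectural change described above — keeping the diffusion backbone and swapping only the text encoder — can be sketched with toy classes. All names here are illustrative stand-ins, not the real Chinese CLIP or Stable Diffusion APIs:

```python
import hashlib

class ToyTextEncoder:
    """Stand-in for a CLIP-style text encoder: maps a prompt to a fixed-size embedding."""
    def __init__(self, name):
        self.name = name

    def encode(self, prompt):
        # Deterministic pseudo-embedding derived from the prompt and the encoder identity.
        digest = hashlib.sha256(f"{self.name}:{prompt}".encode()).digest()
        return [b / 255.0 for b in digest[:8]]

class ToyDiffusionPipeline:
    """Stand-in for Stable Diffusion: the denoiser is fixed, the text encoder is pluggable."""
    def __init__(self, text_encoder):
        self.text_encoder = text_encoder

    def __call__(self, prompt):
        cond = self.text_encoder.encode(prompt)
        # A real pipeline would run iterative denoising conditioned on `cond`;
        # here we just return the conditioning vector to show the data flow.
        return cond

# Swapping encoders changes the conditioning without touching the "denoiser".
english_pipe = ToyDiffusionPipeline(ToyTextEncoder("openclip-vit-h"))
chinese_pipe = ToyDiffusionPipeline(ToyTextEncoder("chinese-clip"))
print(english_pipe("a painting of West Lake") != chinese_pipe("西湖的水墨画"))  # True
```

The point of the sketch is that the denoising network never sees raw text, only embeddings, which is why retraining (or replacing) the encoder alone can adapt the whole system to Chinese.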
- Text-to-Image Diffusion Model - Bilingual - Tiny
  - Repository: Model Overview
  - Description: This model uses StructBERT for text feature extraction and a U-Net for diffusion denoising, trained on datasets such as LAION-400M to generate images matching textual descriptions in both Chinese and English.
- Tongyi - Large Bilingual Model
  - Repository: Model Overview
  - Description: Tongyi's model uses a multi-stage diffusion process to convert text into detailed 2D images, involving large networks trained on diverse datasets for improved convergence and quality.
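The multi-stage idea — generate a small image first, then refine it through successive stages — can be illustrated with a toy cascade. Shapes and stage names are invented for illustration and do not reflect the actual model:

```python
def base_stage(prompt, size=64):
    # Stand-in for the text-conditioned base diffusion model: produce a small "image".
    return [[hash((prompt, x, y)) % 256 for x in range(size)] for y in range(size)]

def upsample_stage(image, factor=2):
    # Stand-in for a super-resolution diffusion stage: nearest-neighbour upscale.
    return [
        [image[y // factor][x // factor] for x in range(len(image[0]) * factor)]
        for y in range(len(image) * factor)
    ]

def cascade(prompt, stages=2):
    image = base_stage(prompt)           # e.g. 64x64
    for _ in range(stages):
        image = upsample_stage(image)    # 64 -> 128 -> 256
    return image

img = cascade("水墨山水")
print(len(img), len(img[0]))  # 256 256
```

In a real cascade each super-resolution stage is itself a diffusion model conditioned on the lower-resolution output, which is what lets the later stages add detail rather than merely interpolate.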
- Taiyi
  - Repository: Fengshenbang-LM
  - Description: Taiyi pairs a CLIP model for bilingual representation with enhancements specific to Chinese concepts, optimized for both language encoding and visual generation.
- Taiyi-xl-3.5B
  - Repository: Taiyi-Stable-Diffusion-XL-3.5B
  - Description: This advanced model excels in bilingual text-to-image generation, offering enhanced quality and diversity, and is particularly suited for both Chinese and English inputs.
- AltDiffusion
  - Repository: FlagAI
  - Description: AltDiffusion builds on Stable Diffusion with AltCLIP for improved multilingual capability, trained on the WuDao and LAION datasets using contrastive learning techniques.
- VisCPM-Paint
  - Repository: VisCPM
  - Description: VisCPM-Paint combines CPM-Bee as its text encoder with a UNet image decoder, supporting bilingual text-to-image generation while being trained primarily on English LAION-2B data.
Conclusion
The Awesome-Chinese-Stable-Diffusion project stands as a comprehensive effort to support and promote Chinese-language adaptations of the Stable Diffusion framework. By gathering a broad array of models and tools, it aims to equip developers and researchers working in Chinese contexts with robust, tailored resources, extending the reach of generative models across linguistic and cultural boundaries.