license: apache-2.0
language:
  - zh
  - en
tags:
  - text-to-image
  - stable-diffusion
  - kolors

📖 Introduction

We provide inference code and weights for Kolors-Inpainting, which was initialized from Kolors-BaseModel. Examples of Kolors-Inpainting results are as follows:

Model details

For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself). The weights for the encoded masked-image channels were initialized from the non-inpainting checkpoint, while the weights for the mask channel were zero-initialized.
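The channel expansion described above can be sketched with NumPy as follows. This is a minimal illustration, not the released training code: the function name `expand_conv_in_weight`, the channel ordering (noisy latents, then mask, then masked-image latents, mirroring the layout commonly used by inpainting pipelines), and the choice to copy the base latent weights into the masked-image channels are all assumptions made for the example.

```python
import numpy as np


def expand_conv_in_weight(base_weight):
    """Expand a (out, 4, kh, kw) conv weight to (out, 9, kh, kw).

    ASSUMED layout of the 9 input channels:
      0..3  noisy latents        (copied from the base checkpoint)
      4     mask                 (zero-initialized)
      5..8  masked-image latents (ASSUMPTION: copied from the base weights)
    """
    out_ch, in_ch, kh, kw = base_weight.shape
    if in_ch != 4:
        raise ValueError("expected a base weight with 4 input channels")

    new_weight = np.zeros((out_ch, 9, kh, kw), dtype=base_weight.dtype)
    new_weight[:, 0:4] = base_weight  # original latent channels
    # channel 4 (the mask) is left at zero
    new_weight[:, 5:9] = base_weight  # masked-image channels
    return new_weight


# Hypothetical base weight standing in for the first UNet convolution.
base = np.random.default_rng(0).normal(size=(8, 4, 3, 3)).astype(np.float32)
expanded = expand_conv_in_weight(base)
print(expanded.shape)
```

Zero-initializing the mask channel means the expanded layer initially ignores the mask, so the model learns to use it gradually during fine-tuning.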
To improve the robustness of the inpainting model, we adopt a more diverse strategy for generating masks, including random masks, subject segmentation masks, rectangular masks, and masks based on dilation operations.
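Two of the mask types listed above, rectangular masks and dilation-based masks, can be sketched with NumPy as follows. The helper names `rectangular_mask` and `dilate`, the rectangle size range, and the square structuring element are hypothetical choices for illustration; the actual mask-generation parameters are not given here.

```python
import numpy as np


def rectangular_mask(height, width, rng):
    """Binary mask with one random axis-aligned rectangle set to 1.

    ASSUMPTION: each side covers between 1/4 and 1/2 of the image side.
    """
    h = int(rng.integers(height // 4, height // 2 + 1))
    w = int(rng.integers(width // 4, width // 2 + 1))
    top = int(rng.integers(0, height - h + 1))
    left = int(rng.integers(0, width - w + 1))
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[top:top + h, left:left + w] = 1
    return mask


def dilate(mask, radius):
    """Dilate a binary mask with a (2 * radius + 1) square structuring element."""
    padded = np.pad(mask, radius, mode="constant")
    out = np.zeros_like(mask)
    height, width = mask.shape
    # OR together every shifted copy of the mask inside the square window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + height, dx:dx + width]
    return out


rng = np.random.default_rng(0)
mask = rectangular_mask(64, 64, rng)
grown = dilate(mask, radius=3)
print(mask.sum(), grown.sum())
```

Dilating a mask (for example, a subject segmentation mask) enlarges the masked region, so the model also sees masks that do not hug object boundaries exactly.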

📊 Evaluation

For evaluation, we created a test set of 200 masked images with text prompts and invited several image experts to provide unbiased ratings of the results generated by each model. The experts scored the generated images on four criteria: visual appeal, text faithfulness, inpainting artifacts, and overall satisfaction. The inpainting-artifacts criterion measures how perceptible the boundaries of the inpainted region are, while the other criteria follow the evaluation standards used for Kolors-BaseModel. The results are summarized in the table below; Kolors-Inpainting achieved the highest overall satisfaction score.

| Model | Average Overall Satisfaction | Average Inpainting Artifacts | Average Visual Appeal | Average Text Faithfulness |
| :---: | :---: | :---: | :---: | :---: |
| SDXL-Inpainting | 2.573 | 1.205 | 3.000 | 4.299 |
| Kolors-Inpainting | 3.493 | 0.204 | 3.855 | 4.346 |

The higher the scores for Average Overall Satisfaction, Average Visual Appeal, and Average Text Faithfulness, the better. Conversely, the lower the score for Average Inpainting Artifacts, the better.
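As a small illustration of how to read the table under this scoring direction, the sketch below picks the better model per criterion. The scores are copied from the table; the dictionary keys and the helper `better_model` are hypothetical names introduced only for this example.

```python
# Average scores copied from the table above.
SCORES = {
    "SDXL-Inpainting": {
        "overall_satisfaction": 2.573,
        "inpainting_artifacts": 1.205,
        "visual_appeal": 3.000,
        "text_faithfulness": 4.299,
    },
    "Kolors-Inpainting": {
        "overall_satisfaction": 3.493,
        "inpainting_artifacts": 0.204,
        "visual_appeal": 3.855,
        "text_faithfulness": 4.346,
    },
}

# Only the artifacts criterion is "lower is better".
LOWER_IS_BETTER = {"inpainting_artifacts"}


def better_model(criterion):
    """Return the model with the better average score on one criterion."""
    pick = min if criterion in LOWER_IS_BETTER else max
    return pick(SCORES, key=lambda model: SCORES[model][criterion])


for criterion in SCORES["Kolors-Inpainting"]:
    print(criterion, "->", better_model(criterion))
```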

The comparison results of SDXL-Inpainting and Kolors-Inpainting are as follows:

Kolors-Inpainting employs Chinese prompts, while SDXL-Inpainting uses English prompts.

🛠️ Usage

Requirements

The dependencies and installation steps are essentially the same as for Kolors-BaseModel.

Repository Cloning and Dependency Installation

apt-get install git-lfs
git clone https://github.com/Kwai-Kolors/Kolors
cd Kolors
conda create --name kolors python=3.8
conda activate kolors
pip install -r requirements.txt
python3 setup.py install

Weights download:

huggingface-cli download --resume-download Kwai-Kolors/Kolors-Inpainting --local-dir weights/Kolors-Inpainting

Inference:

python3 inpainting/sample_inpainting.py ./inpainting/asset/3.png ./inpainting/asset/3_mask.png 穿着美少女战士的衣服，一件类似于水手服风格的衣服，包括一个白色紧身上衣，前胸搭配一个大大的红色蝴蝶结。衣服的领子部分呈蓝色，并且有白色条纹。她还穿着一条蓝色百褶裙，超高清，辛烷渲染，高级质感，32k，高分辨率，最好的质量，超级细节，景深

python3 inpainting/sample_inpainting.py ./inpainting/asset/4.png ./inpainting/asset/4_mask.png 穿着钢铁侠的衣服，高科技盔甲，主要颜色为红色和金色，并且有一些银色装饰。胸前有一个亮起的圆形反应堆装置，充满了未来科技感。超清晰，高质量，超逼真，高分辨率，最好的质量，超级细节，景深

# The image will be saved to "scripts/outputs/"