---
license: apache-2.0
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- controlnet
inference: false
language:
- en
pipeline_tag: text-to-image
---
# EcomXL Inpaint ControlNet
EcomXL is a series of text-to-image diffusion models optimized for e-commerce scenarios, built on [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).<br/>
For these scenarios, we trained an Inpaint ControlNet to guide the diffusion model.
Unlike inpaint ControlNets intended for general use, this model is fine-tuned with instance masks to prevent foreground outpainting, i.e., the foreground object spilling outside its mask.
## Examples
<span style="width: 150px !important;display: inline-block;">`Foreground`</span> | <span style="width: 150px !important;display: inline-block;">`Mask`</span> | <span style="width: 150px !important;display: inline-block;">`w/o instance mask`</span> | <span style="width: 150px !important;display: inline-block;">`w/ instance mask`</span>
:--:|:--:|:--:|:--:
![foreground](./images/inp_0.png) | ![mask](./images/inp_1.png) | ![w/o instance mask](./images/inp_2.png) | ![w/ instance mask](./images/inp_3.png)
Using this ControlNet with a control weight (ControlNet conditioning scale) of 0.5 tends to produce better results.
## Usage with Diffusers
```python
import torch
from diffusers import ControlNetModel

# Load the EcomXL inpaint ControlNet weights in half precision.
controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_inpaint", torch_dtype=torch.float16, use_safetensors=True
)
```
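The snippet above only loads the ControlNet. Below is a minimal end-to-end sketch that pairs it with the SDXL base model; the file names `foreground.png` and `mask.png` and the helper `make_inpaint_condition` are illustrative assumptions rather than part of this repository. The control image follows the common diffusers inpaint-ControlNet convention of setting masked pixels to -1.

```python
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from PIL import Image


def make_inpaint_condition(image: Image.Image, mask: Image.Image) -> torch.Tensor:
    # Assumed helper: masked pixels are set to -1 (a common diffusers
    # convention) so the ControlNet knows which region to repaint.
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    mask = np.array(mask.convert("L")).astype(np.float32) / 255.0
    image[mask > 0.5] = -1.0
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    return torch.from_numpy(image)


controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_inpaint", torch_dtype=torch.float16, use_safetensors=True
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Placeholder file names; substitute your own foreground image and instance mask.
image = Image.open("foreground.png").resize((1024, 1024))
mask = Image.open("mask.png").resize((1024, 1024))
control_image = make_inpaint_condition(image, mask)

result = pipe(
    prompt="a product photo on a wooden table, soft natural light",
    image=control_image,
    # A conditioning scale around 0.5 tends to work well with this model.
    controlnet_conditioning_scale=0.5,
    num_inference_steps=25,
).images[0]
result.save("output.png")
```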
## Training details
In the first phase, the model was trained for 20k steps on 12M images drawn from LAION-2B and internal sources, using random masks. In the second phase, it was trained for 20k steps on 3M e-commerce images with instance masks.<br/>
Mixed precision: FP16<br/>
Learning rate: 1e-4<br/>
Batch size: 2048<br/>
Noise offset: 0.05