---
license: apache-2.0
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- controlnet
inference: false
language:
- en
pipeline_tag: text-to-image
---

# Softedge ControlNet

EcomXL contains a series of text-to-image diffusion models optimized for e-commerce scenarios, developed based on [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).
The ControlNet weights are fine-tuned from stable-diffusion-xl-base-1.0. The model works well with SDXL as well as with community models built on SDXL. It is trained on both general data and Taobao e-commerce data, and performs well in general and e-commerce scenarios alike.

## Examples

These examples are generated with AUTOMATIC1111/stable-diffusion-webui.

`softedge`|`weight-0.6`|`weight-0.8`
:--:|:--:|:--:
![images](./images/1_0.png) | ![images](./images/1_1.png) | ![images](./images/1_2.png)
![images](./images/2_0.png) | ![images](./images/2_1.png) | ![images](./images/2_2.png)
![images](./images/3_0.png) | ![images](./images/3_1.png) | ![images](./images/3_2.png)
![images](./images/4_0.png) | ![images](./images/4_1.png) | ![images](./images/4_2.png)

## Usage with Diffusers

```python
import torch
from diffusers import (
    ControlNetModel,
    StableDiffusionXLControlNetPipeline,
    DPMSolverMultistepScheduler,
    AutoencoderKL,
)
from diffusers.utils import load_image
from controlnet_aux import PidiNetDetector

controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_softedge",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
# fp16-fixed VAE avoids black/NaN outputs when decoding in half precision
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# pipe.enable_xformers_memory_efficient_attention()
pipe.to("cuda")
pipe.enable_vae_slicing()

# extract a soft-edge map from the reference image
image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)
edge_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
edge_image = edge_processor(image, safe=False)  # set safe=True to use pidisafe

prompt = "a bottle on the Twilight Grassland, Sitting on the ground, a couple of tall grass sitting in a field of tall grass, sunset,"
negative_prompt = "low quality, bad quality, sketches"

output = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=edge_image,
    num_inference_steps=25,
    controlnet_conditioning_scale=0.6,
    guidance_scale=7,
    width=1024,
    height=1024,
).images[0]
output.save("test_edge.png")
```

The model performs well when the ControlNet weight (`controlnet_conditioning_scale`) is in the range of 0.6 to 0.8.
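To compare results across the recommended range, you can sweep `controlnet_conditioning_scale` with a fixed seed, similar to the weight-0.6/weight-0.8 grid in the Examples section. A minimal sketch that reuses `pipe`, `prompt`, `negative_prompt`, and `edge_image` from the snippet above (the seed and output file names are illustrative):

```python
# Sweep the ControlNet weight over the recommended range; everything else,
# including the seed, is held fixed so only the conditioning strength varies.
for scale in (0.6, 0.7, 0.8):
    result = pipe(
        prompt,
        negative_prompt=negative_prompt,
        image=edge_image,
        num_inference_steps=25,
        controlnet_conditioning_scale=scale,
        guidance_scale=7,
        width=1024,
        height=1024,
        generator=torch.Generator(device="cuda").manual_seed(42),
    ).images[0]
    result.save(f"test_edge_scale_{scale}.png")
```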
## Training details

- Mixed precision: FP16
- Learning rate: 1e-5
- Batch size: 1024
- Noise offset: 0.05
- Training steps: 37k

The training data includes 12M images from LAION-2B and internal sources with an aesthetic score of 6 or higher, as well as 3M Taobao e-commerce images. During training, the softedge preprocessor is randomly selected from pidinet, hed, pidisafe, and hedsafe, all of which are officially supported in AUTOMATIC1111's webui and Mikubill's sd-webui-controlnet extension; a sketch of producing each variant follows below. The model performs well when the ControlNet weight is in the range of 0.6 to 0.8.
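Since training randomized over all four preprocessors, any of them should work at inference time. A minimal sketch, assuming the `controlnet_aux` detectors from the usage example above, where `safe=True` corresponds to the pidisafe/hedsafe variants:

```python
from controlnet_aux import HEDdetector, PidiNetDetector
from diffusers.utils import load_image

image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)

pidi = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")

# the four soft-edge variants seen during training; safe=True applies the
# "safe" post-processing step (pidisafe / hedsafe)
edges = {
    "pidinet": pidi(image, safe=False),
    "pidisafe": pidi(image, safe=True),
    "hed": hed(image, safe=False),
    "hedsafe": hed(image, safe=True),
}
for name, edge in edges.items():
    edge.save(f"edge_{name}.png")
```

Any of these maps can be passed as `image=` to the pipeline in the usage example.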