---
license: apache-2.0
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- controlnet
inference: false
language:
- en
pipeline_tag: text-to-image
---

# Softedge ControlNet

EcomXL contains a series of text-to-image diffusion models optimized for e-commerce scenarios, developed based on [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).
The ControlNet weights are fine-tuned from stable-diffusion-xl-base-1.0. The model works well with SDXL as well as with community models built on SDXL. It is trained on both general data and Taobao e-commerce data, and performs well in general and e-commerce scenarios alike.

## Examples

These examples are generated with AUTOMATIC1111/stable-diffusion-webui.

`softedge`|`weight-0.6`|`weight-0.8`
:--:|:--:|:--:
![images](./images/1_0.png) | ![images](./images/1_1.png) | ![images](./images/1_2.png)
![images](./images/2_0.png) | ![images](./images/2_1.png) | ![images](./images/2_2.png)
![images](./images/3_0.png) | ![images](./images/3_1.png) | ![images](./images/3_2.png)
![images](./images/4_0.png) | ![images](./images/4_1.png) | ![images](./images/4_2.png)

## Usage with Diffusers

```python
import torch
from diffusers import (
    ControlNetModel,
    StableDiffusionXLControlNetPipeline,
    DPMSolverMultistepScheduler,
    AutoencoderKL,
)
from diffusers.utils import load_image
from controlnet_aux import PidiNetDetector

controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_softedge",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
# fp16-fixed VAE avoids black/NaN outputs when decoding in half precision
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# pipe.enable_xformers_memory_efficient_attention()
pipe.to("cuda")
pipe.enable_vae_slicing()

# extract a soft-edge map from the reference image
image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)
edge_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
edge_image = edge_processor(image, safe=False)  # set safe=True to use pidisafe

prompt = "a bottle on the Twilight Grassland, Sitting on the ground, a couple of tall grass sitting in a field of tall grass, sunset,"
negative_prompt = "low quality, bad quality, sketches"

output = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=edge_image,
    num_inference_steps=25,
    controlnet_conditioning_scale=0.6,
    guidance_scale=7,
    width=1024,
    height=1024,
).images[0]
output.save("test_edge.png")
```

The model performs well when the ControlNet weight (`controlnet_conditioning_scale`) is in the range of 0.6 to 0.8.
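To compare results across the recommended range, you can sweep `controlnet_conditioning_scale` with a fixed seed, similar to the weight-0.6/weight-0.8 grid in the Examples section. A minimal sketch that reuses `pipe`, `prompt`, `negative_prompt`, and `edge_image` from the snippet above (the seed and output file names are illustrative):

```python
# Sweep the ControlNet weight over the recommended range; everything else,
# including the seed, is held fixed so only the conditioning strength varies.
for scale in (0.6, 0.7, 0.8):
    result = pipe(
        prompt,
        negative_prompt=negative_prompt,
        image=edge_image,
        num_inference_steps=25,
        controlnet_conditioning_scale=scale,
        guidance_scale=7,
        width=1024,
        height=1024,
        generator=torch.Generator(device="cuda").manual_seed(42),
    ).images[0]
    result.save(f"test_edge_scale_{scale}.png")
```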
## Training details

- Mixed precision: FP16
- Learning rate: 1e-5
- Batch size: 1024
- Noise offset: 0.05
- Training steps: 37k

The training data includes 12M images from LAION-2B and internal sources with an aesthetic score of 6 or higher, as well as 3M Taobao e-commerce images. During training, the softedge preprocessor is randomly selected from pidinet, hed, pidisafe, and hedsafe, all of which are officially supported in AUTOMATIC1111's webui and Mikubill's sd-webui-controlnet extension; a sketch of producing each variant follows below. The model performs well when the ControlNet weight is in the range of 0.6 to 0.8.
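Since training randomized over all four preprocessors, any of them should work at inference time. A minimal sketch, assuming the `controlnet_aux` detectors from the usage example above, where `safe=True` corresponds to the pidisafe/hedsafe variants:

```python
from controlnet_aux import HEDdetector, PidiNetDetector
from diffusers.utils import load_image

image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)

pidi = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")

# the four soft-edge variants seen during training; safe=True applies the
# "safe" post-processing step (pidisafe / hedsafe)
edges = {
    "pidinet": pidi(image, safe=False),
    "pidisafe": pidi(image, safe=True),
    "hed": hed(image, safe=False),
    "hedsafe": hed(image, safe=True),
}
for name, edge in edges.items():
    edge.save(f"edge_{name}.png")
```

Any of these maps can be passed as `image=` to the pipeline in the usage example.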