metadata

license: apache-2.0
pipeline_tag: text-to-image

Controlling Structure and Appearance for SSD-1B

The CtrlXStableDiffusionXLPipeline has been modified because the original contained too many TODO lines. The refiner phase has also been removed.

Requires 8 GB of VRAM.

Setup

pip install accelerate diffusers gradio torch safetensors transformers
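To confirm the install succeeded before running inference, a small check along these lines may help. It is only a sketch: the package list is copied from the pip command above, the helper name `missing_packages` is made up for illustration, and the check only verifies that each package can be located, not that its version is compatible.

```python
import importlib.util

# Package names copied from the pip command above. For these six packages
# the import name matches the PyPI name.
REQUIRED = ["accelerate", "diffusers", "gradio", "torch", "safetensors", "transformers"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path.

    Hypothetical helper, not part of this repository.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```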

Inference

python run_ctrlx.py --num_inference_steps 20 --guidance_scale 9.0 --model_offload --structure_image images/horse__point_cloud.jpg --appearance_image images/horse.jpg --prompt "a photo of a horse standing on grass" --structure_prompt "3D point cloud of a horse"
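When running several structure/appearance pairs, it can be convenient to assemble the same command from Python. The sketch below uses only the flags that appear in the command above; the function name `build_ctrlx_command` and its defaults are assumptions made for illustration, and no other options of run_ctrlx.py are assumed.

```python
import shlex
import sys


def build_ctrlx_command(structure_image, appearance_image, prompt, structure_prompt,
                        num_inference_steps=20, guidance_scale=9.0, model_offload=True):
    """Build the argument list for run_ctrlx.py (hypothetical helper).

    Only flags shown in the example command are used.
    """
    cmd = [
        sys.executable, "run_ctrlx.py",
        "--num_inference_steps", str(num_inference_steps),
        "--guidance_scale", str(guidance_scale),
    ]
    if model_offload:
        cmd.append("--model_offload")
    cmd += [
        "--structure_image", structure_image,
        "--appearance_image", appearance_image,
        "--prompt", prompt,
        "--structure_prompt", structure_prompt,
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_ctrlx_command(
        structure_image="images/horse__point_cloud.jpg",
        appearance_image="images/horse.jpg",
        prompt="a photo of a horse standing on grass",
        structure_prompt="3D point cloud of a horse",
    )
    # Print the shell-quoted command for inspection. To execute it instead,
    # pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces or quotation marks.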

Disclaimer

All code belongs to Jordan Lin; the model weights belong to Segmind.