metadata

license: apache-2.0
pipeline_tag: text-to-image

Controlling Structure and Appearance for SSD-1B

The CtrlXStableDiffusionXLPipeline has been modified because the original contained too many TODO lines. The refiner phase has been removed.

Requires 8GB VRAM.

Setup

pip install accelerate diffusers gradio torch safetensors transformers
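To confirm that the install step above succeeded, a quick check along the following lines may help. It is only a sketch: `missing_packages` is a hypothetical helper, not part of this repository, and it assumes each package's import name matches its pip name (true for the six listed above). It only looks the packages up and does not import them, so it runs quickly even without a GPU.

```python
from importlib.util import find_spec

# Packages named in the pip command above.
REQUIRED = ["accelerate", "diffusers", "gradio", "torch", "safetensors", "transformers"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```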

Inference

python run_ctrlx.py --num_inference_steps 20 --guidance_scale 9.0 --model_offload --structure_image images_ctrlx/horse__point_cloud.jpg --appearance_image images_ctrlx/horse.jpg --prompt "a photo of a horse standing on grass" --structure_prompt "3D point cloud of a horse"
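For scripted or batch use, the same command can be assembled from Python. The sketch below is an assumption-laden illustration rather than part of the repository: `build_ctrlx_command` is a hypothetical helper, and it uses only the flags and values that appear in the command above. It builds the argument list and prints it. Actually launching the run is left commented out.

```python
import shlex
import sys


def build_ctrlx_command(structure_image, appearance_image, prompt, structure_prompt,
                        num_inference_steps=20, guidance_scale=9.0,
                        model_offload=True, script="run_ctrlx.py"):
    """Build the argument list for run_ctrlx.py (hypothetical helper)."""
    cmd = [
        sys.executable, script,
        "--num_inference_steps", str(num_inference_steps),
        "--guidance_scale", str(guidance_scale),
    ]
    if model_offload:
        cmd.append("--model_offload")
    cmd += [
        "--structure_image", structure_image,
        "--appearance_image", appearance_image,
        "--prompt", prompt,
        "--structure_prompt", structure_prompt,
    ]
    return cmd


cmd = build_ctrlx_command(
    structure_image="images_ctrlx/horse__point_cloud.jpg",
    appearance_image="images_ctrlx/horse.jpg",
    prompt="a photo of a horse standing on grass",
    structure_prompt="3D point cloud of a horse",
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging

# To launch the run, uncomment:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.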

Disclaimer

All code belongs to Jordan Lin; the model weights belong to Segmind.