metadata
license: apache-2.0
pipeline_tag: text-to-image
Controlling Structure and Appearance for SSD-1B
The CtrlXStableDiffusionXLPipeline has been modified because it contained too many TODO lines. The refiner phase has been removed.
Requires 8GB VRAM.
Setup
pip install accelerate diffusers gradio torch safetensors transformers
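To confirm the install worked before running inference, a small check like the following can help. It is only a sketch: the helper name `missing_packages` is hypothetical and not part of this repository, and it assumes each package's import name matches its pip name, which holds for the six packages listed above.

```python
from importlib.util import find_spec

# Packages named in the pip install command above.
REQUIRED = ["accelerate", "diffusers", "gradio", "torch", "safetensors", "transformers"]


def missing_packages(names):
    """Return the names that cannot be found on the import path.

    find_spec only locates the package; it does not import it,
    so this stays fast even for heavy packages such as torch.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```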
Inference
python run_ctrlx.py --num_inference_steps 20 --guidance_scale 9.0 --model_offload --structure_image images_ctrlx/horse__point_cloud.jpg --appearance_image images_ctrlx/horse.jpg --prompt "a photo of a horse standing on grass" --structure_prompt "3D point cloud of a horse"
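When running several structure/appearance pairs, it can be easier to assemble the command above programmatically. The sketch below builds the same argument list in Python. The function name `build_ctrlx_command` and its default values are illustrative assumptions; only the flags shown in the command above are used, and no other options of run_ctrlx.py are assumed.

```python
import shlex


def build_ctrlx_command(structure_image, appearance_image, prompt,
                        structure_prompt, steps=20, guidance_scale=9.0,
                        model_offload=True):
    """Build the run_ctrlx.py argument list using the flags shown above."""
    cmd = [
        "python", "run_ctrlx.py",
        "--num_inference_steps", str(steps),
        "--guidance_scale", str(guidance_scale),
    ]
    if model_offload:
        # Boolean switch: present or absent, takes no value.
        cmd.append("--model_offload")
    cmd += [
        "--structure_image", structure_image,
        "--appearance_image", appearance_image,
        "--prompt", prompt,
        "--structure_prompt", structure_prompt,
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_ctrlx_command(
        "images_ctrlx/horse__point_cloud.jpg",
        "images_ctrlx/horse.jpg",
        "a photo of a horse standing on grass",
        "3D point cloud of a horse",
    )
    # shlex.join quotes arguments containing spaces for safe copy-pasting.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains run_ctrlx.py; passing a list rather than a single string avoids shell quoting problems with prompts that contain spaces.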
Disclaimer
All code belongs to Jordan Lin; the model weights belong to Segmind.