Diffusers 🧨 port of InstanceDiffusion: Instance-level Control for Image Generation (CVPR 2024)

Checkpoint

Install

StableDiffusionINSTDIFFPipeline is not yet merged into diffusers. Until it is, install the forked version:

git clone -b instancediffusion https://github.com/gokyeongryeol/diffusers.git
cd diffusers && pip install -e .
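
After installing, it can help to confirm that the fork (and not a stock diffusers release) is the one being imported. The following is a minimal sketch; `pipeline_available` is a hypothetical helper written for this README, not part of diffusers, and it only checks that the class name is exposed by the installed package.

```python
import importlib


def pipeline_available(
    module_name: str = "diffusers",
    class_name: str = "StableDiffusionINSTDIFFPipeline",
) -> bool:
    """Return True if `module_name` can be imported and exposes `class_name`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in the current environment.
        return False
    return hasattr(module, class_name)


if __name__ == "__main__":
    if pipeline_available():
        print("StableDiffusionINSTDIFFPipeline found in the installed diffusers.")
    else:
        print("Pipeline not found; check that the forked branch is installed.")
```

If the check fails, the most likely cause is that another diffusers installation takes precedence over the editable install from the fork.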

Example Usage

import torch
from diffusers import StableDiffusionINSTDIFFPipeline

pipe = StableDiffusionINSTDIFFPipeline.from_pretrained(
    "kyeongry/instancediffusion_sd15",
    # variant="fp16", torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a yellow American robin, brown Maltipoo dog, a gray British Shorthair in a stream, alongside with trees and rocks"
negative_prompt = "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality"

# normalized (xmin,ymin,xmax,ymax)
boxes = [
    [0.0, 0.099609375, 0.349609375, 0.548828125],
    [0.349609375, 0.19921875, 0.6484375, 0.498046875],
    [0.6484375, 0.19921875, 0.998046875, 0.697265625],
    [0.0, 0.69921875, 1.0, 0.998046875],
]
phrases = [
    "a gray British Shorthair standing on a rock in the woods",
    "a yellow American robin standing on the rock",
    "a brown Maltipoo dog standing on the rock",
    "a close up of a small waterfall in the woods",
]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    instdiff_phrases=phrases,
    instdiff_boxes=boxes,
    instdiff_scheduled_sampling_alpha=0.8,  # fraction of sampling steps that use gated self-attention
    instdiff_scheduled_sampling_beta=0.36,  # fraction of sampling steps that use the multi-instance sampler
    guidance_scale=7.5,
    output_type="pil",
    num_inference_steps=50,
).images[0]

image.save("./instancediffusion-sd15-layout2image-generation.jpg")
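
The box values in the example are exact multiples of 1/512 (for instance, 51/512 = 0.099609375), which suggests they were derived from pixel coordinates on a 512x512 canvas. If your layout is in pixels, a small helper like the one below can produce the normalized format. This is a sketch under that assumption: `normalize_boxes` and `check_layout` are hypothetical helpers, not part of the pipeline, and the 512x512 canvas size is inferred from the numbers rather than stated by the pipeline.

```python
from typing import List, Sequence


def normalize_boxes(
    pixel_boxes: Sequence[Sequence[float]], width: int, height: int
) -> List[List[float]]:
    """Convert pixel (xmin, ymin, xmax, ymax) boxes to normalized [0, 1] coordinates."""
    normalized = []
    for xmin, ymin, xmax, ymax in pixel_boxes:
        # Reject boxes that are empty, inverted, or outside the canvas.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            raise ValueError(
                f"box {(xmin, ymin, xmax, ymax)} does not fit a {width}x{height} canvas"
            )
        normalized.append([xmin / width, ymin / height, xmax / width, ymax / height])
    return normalized


def check_layout(boxes: Sequence[Sequence[float]], phrases: Sequence[str]) -> None:
    """Raise ValueError unless there is one normalized 4-value box per phrase."""
    if len(boxes) != len(phrases):
        raise ValueError(f"got {len(boxes)} boxes but {len(phrases)} phrases")
    for box in boxes:
        if len(box) != 4 or not all(0.0 <= value <= 1.0 for value in box):
            raise ValueError(f"box {list(box)} is not a normalized 4-value box")


# Pixel boxes on an assumed 512x512 canvas; these reproduce the first two
# normalized boxes used in the example above.
pixel_boxes = [(0, 51, 179, 281), (179, 102, 332, 255)]
boxes = normalize_boxes(pixel_boxes, width=512, height=512)
check_layout(boxes, ["first instance phrase", "second instance phrase"])
```

Running `check_layout` before calling the pipeline catches a mismatch between the number of boxes and phrases early, before any model weights are loaded.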

Sample Output

[Image: sample output of the example above]
