This model is compiled for optimized inference on AWS Inferentia2 (inf2) instances. Overall performance reaches roughly 4 it/s for a 1024x1024 image.
NOTE: To load and run inference on inf2.xlarge instances, a minimum of 8 GB of swap memory is required; larger instance sizes do not need this.
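For example, a minimal sketch of setting up an 8 GB swap file on an Ubuntu-based DLAMI (the path /swapfile is arbitrary):

sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile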
Assuming you're using the official DLAMI or have installed the drivers and libraries required for Neuron devices, install the latest optimum-neuron library:
pip install -U "optimum[neuronx]" diffusers==0.20.0
Then, enjoy the super-fast experience:
from optimum.neuron import NeuronStableDiffusionXLPipeline

# Load the pre-compiled pipeline onto both NeuronCores of the Inferentia2 chip
pipe = NeuronStableDiffusionXLPipeline.from_pretrained("paulkm/sdxl_neuron_pipe", device_ids=[0, 1])
img = pipe("a cute black cat").images[0]
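The quoted 4 it/s refers to denoising steps. To check throughput on your own instance, you can time a generation and divide by the step count; a rough sketch, assuming the call accepts num_inference_steps as in diffusers (the step count below is just an example):

import time

steps = 30  # example step count, passed through to the scheduler
start = time.perf_counter()
img = pipe("a cute black cat", num_inference_steps=steps).images[0]
print(f"{steps / (time.perf_counter() - start):.2f} it/s (includes text-encoder and VAE overhead)")
img.save("cat.png")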