Is it possible to run it with 24GB VRAM?
When I try it on my legacy GPU with 24GB VRAM, it fails with the error: RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
Yes. It works for me on an NVIDIA 3090 with 24GB VRAM.
Which ControlNet did you use for the testing? It's weird; my GPU is a Tesla P40, which is pretty old.
I have checked all three models that are presented here. Maybe you need to increase the amount of system RAM? I noticed that it also has an effect. It is advisable to have at least 32 GB.
Thanks for the suggestions. My VM has 64GB of memory. When I check the process after all the models are loaded, it already consumes almost all of the 24GB of VRAM; see the output below.
#>python sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_canny.safetensors --controlnet_cond_image inputs/cannysmall.png --prompt "An adorable fluffy pastel creature"
Loading tokenizers...
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Loading Google T5-v1-XXL...
Skipping key 'shared.weight' in safetensors file as 'shared' does not exist in python model
Loading OpenAI CLIP L...
Loading OpenCLIP bigG...
Loading SD3 model sd3.5_large.safetensors...
Skipping key 'context_embedder.bias' in safetensors file as 'context_embedder' does not exist in python model
Skipping key 'context_embedder.weight' in safetensors file as 'context_embedder' does not exist in python model
Loading VAE model...
Models loaded.
0%| | 0/60 [00:00<?, ?it/s]
0%| | 0/1 [00:10<?, ?it/s]
Traceback (most recent call last):
File "/home/ubuntu/sd3.5/sd3_infer.py", line 654, in
fire.Fire(main)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/fire/core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/fire/core.py", line 468, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/sd3.5/sd3_infer.py", line 636, in main
inferencer.gen_image(
File "/home/ubuntu/sd3.5/sd3_infer.py", line 483, in gen_image
sampled_latent = self.do_sampling(
File "/home/ubuntu/sd3.5/sd3_infer.py", line 376, in do_sampling
latent = sample_fn(
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
return func(*args, **kwargs)
File "/home/ubuntu/sd3.5/sd3_impls.py", line 343, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/sd3.5/sd3_impls.py", line 206, in forward
batched = self.model.apply_model(
File "/home/ubuntu/sd3.5/sd3_impls.py", line 169, in apply_model
controlnet_hidden_states = self.control_model(
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/sd3.5/dit_embedder.py", line 82, in forward
c = self.t_embedder(timestep, dtype=x.dtype)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/sd3.5/mmditx.py", line 186, in forward
t_emb = self.mlp(t_freq)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward
input = module(input)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 125, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
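CUBLAS_STATUS_ALLOC_FAILED here means cuBLAS could not allocate its workspace at the first matmul, i.e. the card was already essentially full before sampling started. A back-of-envelope estimate (the parameter counts below are rough assumptions, not measurements from the repo, and assume fp16 weights) suggests that keeping every component resident at once is tight or impossible on a 24GB card, which is consistent with the process showing ~24GB used right after model loading:

```python
# Rough VRAM estimate for fp16 weights only. Activations, the CUDA
# context, and cuBLAS workspaces come on top of this, so the real
# footprint is higher. All parameter counts are approximate guesses.
BYTES_PER_PARAM = 2  # fp16

components_billions = {
    "SD3.5 Large (MMDiT)": 8.0,
    "T5-XXL text encoder": 4.7,
    "OpenCLIP bigG": 0.69,
    "CLIP-L": 0.12,
    "ControlNet (canny)": 1.0,  # rough guess
    "VAE": 0.08,
}

total_gib = 0.0
for name, billions in components_billions.items():
    gib = billions * 1e9 * BYTES_PER_PARAM / 2**30
    total_gib += gib
    print(f"{name:24s} ~{gib:5.1f} GiB")

print(f"{'Total (weights only)':24s} ~{total_gib:5.1f} GiB vs. a 24 GiB card")
```

If the weights alone approach or exceed the card's capacity, the run only succeeds when something is offloaded or freed between stages, so a small extra resident allocation (for example, a different ControlNet checkpoint or a larger workspace on an older architecture like the P40) can tip it into this failure. Checking torch.cuda.mem_get_info() just before sampling would confirm how much free VRAM is actually left.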