Is it possible to run it with 24GB VRAM?
When I try it on my legacy GPU with 24GB VRAM, it fails with the error: RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
Yes. It works for me on an NVIDIA 3090 with 24GB VRAM.
Which ControlNet did you use for the testing? It's weird; my GPU is a Tesla P40, which is pretty old.
I have checked all three models that are presented here. Maybe you need to increase the amount of system RAM? I noticed that it also has an effect. It is advisable to have at least 32 GB.
Thanks for the suggestions. My VM has 64GB of memory. When I check the process after all the models are loaded, it already consumes almost all of the 24GB of VRAM; see the output below.
#>python sd3_infer.py --model models/sd3.5_large.safetensors --controlnet_ckpt models/sd3.5_large_controlnet_canny.safetensors --controlnet_cond_image inputs/cannysmall.png --prompt "An adorable fluffy pastel creature"
Loading tokenizers...
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Loading Google T5-v1-XXL...
Skipping key 'shared.weight' in safetensors file as 'shared' does not exist in python model
Loading OpenAI CLIP L...
Loading OpenCLIP bigG...
Loading SD3 model sd3.5_large.safetensors...
Skipping key 'context_embedder.bias' in safetensors file as 'context_embedder' does not exist in python model
Skipping key 'context_embedder.weight' in safetensors file as 'context_embedder' does not exist in python model
Loading VAE model...
Models loaded.
0%| | 0/60 [00:00<?, ?it/s]
0%| | 0/1 [00:10<?, ?it/s]
Traceback (most recent call last):
File "/home/ubuntu/sd3.5/sd3_infer.py", line 654, in
fire.Fire(main)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/fire/core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/fire/core.py", line 468, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/sd3.5/sd3_infer.py", line 636, in main
inferencer.gen_image(
File "/home/ubuntu/sd3.5/sd3_infer.py", line 483, in gen_image
sampled_latent = self.do_sampling(
File "/home/ubuntu/sd3.5/sd3_infer.py", line 376, in do_sampling
latent = sample_fn(
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
return func(*args, **kwargs)
File "/home/ubuntu/sd3.5/sd3_impls.py", line 343, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/sd3.5/sd3_impls.py", line 206, in forward
batched = self.model.apply_model(
File "/home/ubuntu/sd3.5/sd3_impls.py", line 169, in apply_model
controlnet_hidden_states = self.control_model(
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/sd3.5/dit_embedder.py", line 82, in forward
c = self.t_embedder(timestep, dtype=x.dtype)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/sd3.5/mmditx.py", line 186, in forward
t_emb = self.mlp(t_freq)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward
input = module(input)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sd35/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 125, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
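CUBLAS_STATUS_ALLOC_FAILED here means cuBLAS could not allocate its workspace at the first matmul, i.e. the card was already essentially full before sampling started. A back-of-envelope estimate (the parameter counts below are rough assumptions, not measurements from the repo, and assume fp16 weights) suggests that keeping every component resident at once is tight or impossible on a 24GB card, which is consistent with the process showing ~24GB used right after model loading:

```python
# Rough VRAM estimate for fp16 weights only. Activations, the CUDA
# context, and cuBLAS workspaces come on top of this, so the real
# footprint is higher. All parameter counts are approximate guesses.
BYTES_PER_PARAM = 2  # fp16

components_billions = {
    "SD3.5 Large (MMDiT)": 8.0,
    "T5-XXL text encoder": 4.7,
    "OpenCLIP bigG": 0.69,
    "CLIP-L": 0.12,
    "ControlNet (canny)": 1.0,  # rough guess
    "VAE": 0.08,
}

total_gib = 0.0
for name, billions in components_billions.items():
    gib = billions * 1e9 * BYTES_PER_PARAM / 2**30
    total_gib += gib
    print(f"{name:24s} ~{gib:5.1f} GiB")

print(f"{'Total (weights only)':24s} ~{total_gib:5.1f} GiB vs. a 24 GiB card")
```

If the weights alone approach or exceed the card's capacity, the run only succeeds when something is offloaded or freed between stages, so a small extra resident allocation (for example, a different ControlNet checkpoint or a larger workspace on an older architecture like the P40) can tip it into this failure. Checking torch.cuda.mem_get_info() just before sampling would confirm how much free VRAM is actually left.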