Multi-GPU inference: RuntimeError: Expected all tensors to be on the same device

#4
by Butzermoggel - opened

Getting: "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:1! (when checking argument for argument tensors in method wrapper_CUDA_cat)" when trying to run the model on multiple GPUs.

device_map = "balanced"
cpu_offload is disabled
libraries are up to date

Anyone got a solution to this?
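
For context (an illustrative aside, not part of the original post): the exception comes from torch.cat being handed tensors that live on different GPUs, which a minimal two-GPU snippet reproduces directly. Device indices here are chosen for illustration only:

import torch

# Requires at least two visible GPUs.
a = torch.zeros(2, device="cuda:0")
b = torch.zeros(2, device="cuda:1")
torch.cat([a, b])  # RuntimeError: Expected all tensors to be on the same device

With device_map="balanced", the pipeline's components are sharded across GPUs, so an intermediate tensor that crosses a shard boundary without being moved first can trigger exactly this error.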

Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University org

Can you post the complete code? That would make it easier for me to verify.

Of course, thank you for responding :)
code:

import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

prompt = "A frog is sitting on a leaf in the rain. The rain slowly stops."
image = load_image(image="./input.jpg")
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)

#pipe.enable_sequential_cpu_offload()
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

video = pipe(
    prompt=prompt,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    generator=torch.Generator(),
).frames[0]

export_to_video(video, "./output.mp4", fps=8)
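
A hedged debugging suggestion (editor's note, not from the thread): it can help to see how device_map="balanced" actually sharded the pipeline's components across the GPUs. Pipelines loaded with a device map record the assignment on hf_device_map:

# Sketch: print the component-to-GPU assignment chosen by device_map="balanced".
print(pipe.hf_device_map)
# e.g. {'text_encoder': 0, 'transformer': 1, 'vae': 2}  (illustrative output)

If two components that exchange tensors landed on different devices, that is the boundary where the cuda:1 / cuda:2 mismatch is occurring.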

Are you also using multi-GPU elsewhere, in another workflow?

I had this issue and narrowed it down: every time I generate an image using multi-GPU, I then have to restart ComfyUI so it goes back to a single instance running on the 'first' GPU... then it runs fine again.

HTH
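
Editor's note, expanding the single-GPU idea above into a hedged sketch: if one card has enough memory once sequential CPU offload is enabled (the call is already present but commented out in the posted code), pinning the process to a single device sidesteps cross-device tensors entirely. The device index and memory headroom are assumptions here:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # set before torch touches CUDA

import torch
from diffusers import CogVideoXImageToVideoPipeline

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,  # no device_map: everything targets the one visible GPU
)
pipe.enable_sequential_cpu_offload()  # stream submodules to the GPU on demand
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

Since only one GPU is visible, every tensor ends up on the same device and the torch.cat mismatch cannot occur, at the cost of offloading overhead.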
