kohya-ss sd-scripts

#2
by ABDALLALSWAITI - opened

Hi, I'm using https://github.com/kohya-ss/sd-scripts to train a LoRA with your https://huggingface.co/Kijai/flux-fp8 float8_e4m3fn version, and it works perfectly. Unfortunately, this version does not work and fails with this error message:

Traceback (most recent call last):
  File "/home/abdallah/Desktop/webui/sd-scripts/flux_train_network.py", line 519, in <module>
    trainer.train(args)
  File "/home/abdallah/Desktop/webui/sd-scripts/train_network.py", line 354, in train
    model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator)
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abdallah/Desktop/webui/sd-scripts/flux_train_network.py", line 82, in load_target_model
    model = self.prepare_split_model(model, weight_dtype, accelerator)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abdallah/Desktop/webui/sd-scripts/flux_train_network.py", line 127, in prepare_split_model
    flux_upper.to(accelerator.device, dtype=target_dtype)
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1174, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 805, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1167, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
Traceback (most recent call last):
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
    simple_launcher(args)
  File "/home/abdallah/Desktop/webui/sd-scripts/.venv/lib/python3.12/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/abdallah/Desktop/webui/sd-scripts/.venv/bin/python', 'flux_train_network.py', '--pretrained_model_name_or_path', '/home/abdallah/Desktop/webui/stable-diffusion-webui/models/Stable-diffusion/flux-dev2pro-fp8_e4m3fn_comfy.safetensors'

Any idea how to fix this, given that this model is made for LoRA training?

Same. I had the same problem with the Flux fp8 from XLabs, but it works fine with the Kijai version (https://huggingface.co/Kijai/flux-fp8).
This problem is happening on this model too.
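
The "Cannot copy out of meta tensor; no data!" error in the traceback means some parameters were still on the meta device (shape only, no values) when `.to()` was called. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the failing checkpoint stores its tensors under different names or dtypes than the loader expects, so some weights are never filled in. A minimal sketch for comparing the working Kijai file with a failing one is to read only the safetensors header (an 8-byte little-endian length followed by a JSON table of tensor names, dtypes and shapes). The helper names `read_safetensors_header` and `summarize` are hypothetical and are not part of sd-scripts.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return {tensor_name: info} from a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(header, prefix_depth=1):
    """Count tensors per dtype and per leading name component(s)."""
    dtypes = Counter(info["dtype"] for info in header.values())
    prefixes = Counter(".".join(name.split(".")[:prefix_depth]) for name in header)
    return dtypes, prefixes


# Usage (paths are placeholders):
# good = read_safetensors_header("working_checkpoint.safetensors")
# bad = read_safetensors_header("failing_checkpoint.safetensors")
# print(summarize(good))
# print(summarize(bad))
# print(sorted(set(good) - set(bad))[:20])  # names present only in the working file
```

If the two files differ in key prefixes or in which tensors are present, that would be worth reporting to the sd-scripts maintainers together with the traceback; if the headers match, the cause lies elsewhere.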
