Finetuning error: `config_class` attribute that is not consistent with the config class you passed

by anothercoder2

Hello,
I am trying to fine-tune minicpm-guidance.

From the MiniCPM repo I updated finetune_lora.sh from
MODEL="openbmb/MiniCPM-Llama3-V-2_5"
to
MODEL="RhapsodyAI/minicpm-guidance"

and finetune.py to
class ModelArguments:
    # model_name_or_path: Optional[str] = field(default="openbmb/MiniCPM-V-2")
    model_name_or_path: Optional[str] = field(default="RhapsodyAI/minicpm-guidance")
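
For clarity, the full edit in context looks roughly like this (any other fields in the dataclass omitted; I left the rest of the stock finetune.py unchanged):

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelArguments:
    model_name_or_path: Optional[str] = field(default="RhapsodyAI/minicpm-guidance")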

I am getting this error when I run finetune_lora.sh:

[2024-07-21 23:53:36,707] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-07-21 23:53:36,707] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[rank0]: Traceback (most recent call last):
[rank0]: File "/workspace/minicpm/finetune/finetune.py", line 282, in
[rank0]: train()
[rank0]: File "/workspace/minicpm/finetune/finetune.py", line 185, in train
[rank0]: model = AutoModel.from_pretrained(
[rank0]: File "/workspace/pypacks/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
[rank0]: cls.register(config.__class__, model_class, exist_ok=True)
[rank0]: File "/workspace/pypacks/transformers/models/auto/auto_factory.py", line 584, in register
[rank0]: raise ValueError(
[rank0]: ValueError: The model class you are passing has a config_class attribute that is not consistent with the config class you passed (model has None and you passed <class 'transformers_modules.RhapsodyAI.minicpm-guidance.854aaf385ec56b8e738161b4aedb3eed5d28f9bc.configuration_minicpmv.MiniCPMVConfig'>. Fix one of those so they match!

Can you please help, or at least provide your finetuning files on GitHub?

Thank you!

Rhapsody org

It looks like something is going wrong when loading the model, which is strange. Have you tried using this model directly?

For inference:
Downloading the model directly from GitHub works.
Trying to load the model from HF for inference gives the same error.

I downloaded the model from GitHub and tried to use it for training.
I updated the MiniCPM finetune_lora.sh (from the MiniCPM GitHub repo)
from
MODEL="openbmb/MiniCPM-Llama3-V-2_5"
to
MODEL="/my/local/directory"

and updated finetune.py
from
model_name_or_path: Optional[str] = field(default="openbmb/MiniCPM-V-2")
to
model_name_or_path: Optional[str] = field(default="/my/local/directory")

On starting ./finetune_lora.sh I now get this error:

[2024-07-24 00:13:27,370] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-07-24 00:13:27,370] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
Currently using LoRA for fine-tuning the MiniCPM-V model.
[rank0]: Traceback (most recent call last):
[rank0]: File "/workspace/minicpm/finetune/finetune.py", line 283, in
[rank0]: train()
[rank0]: File "/workspace/minicpm/finetune/finetune.py", line 231, in train
[rank0]: model.enable_input_require_grads()
[rank0]: File "/workspace/pypacks/transformers/modeling_utils.py", line 1686, in enable_input_require_grads
[rank0]: self._require_grads_hook = self.get_input_embeddings().register_forward_hook(make_inputs_require_grads)
[rank0]: File "/workspace/pypacks/transformers/modeling_utils.py", line 1705, in get_input_embeddings
[rank0]: raise NotImplementedError
[rank0]: NotImplementedError
E0724 00:13:46.323000 139822901047296 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: 1) local_rank: 0 (pid: 3170) of binary: /usr/bin/python
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in
sys.exit(main())
File "/workspace/pypacks/torch/distributed/elastic/multiprocessing/errors/init.py", line 347, in wrapper
return f(*args, **kwargs)
File "/workspace/pypacks/torch/distributed/run.py", line 879, in main
run(args)
File "/workspace/pypacks/torch/distributed/run.py", line 870, in run
elastic_launch(
File "/workspace/pypacks/torch/distributed/launcher/api.py", line 132, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/workspace/pypacks/torch/distributed/launcher/api.py", line 263, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

finetune.py FAILED
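
From the traceback, model.enable_input_require_grads() fails because the custom model class does not implement get_input_embeddings(). Would adding accessors along these lines to the repo's modeling file be the right fix? This is only a sketch, and it assumes the wrapper exposes its inner language model as self.llm, which I have not verified:

# sketch only: the embedding accessors that enable_input_require_grads() expects;
# self.llm is an assumption about how the wrapper names its inner language model
def get_input_embeddings(self):
    return self.llm.get_input_embeddings()

def set_input_embeddings(self, value):
    self.llm.set_input_embeddings(value)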

@Cuiunbo can you please help?

Rhapsody org

[rank0]: File "/workspace/minicpm/finetune/finetune.py", line 185, in train
[rank0]: model = AutoModel.from_pretrained(

Hi! Sorry for the late reply.
I tried the line where you reported the error and could not reproduce it. Can you run the example code provided on Hugging Face on its own and make sure it works?
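
Something along these lines should be enough as a standalone check (just a sketch of the usual remote-code loading pattern; adjust dtype/device and the generation call to whatever the model card shows):

import torch
from transformers import AutoModel, AutoTokenizer

path = "RhapsodyAI/minicpm-guidance"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(path, trust_remote_code=True,
                                  torch_dtype=torch.bfloat16)

# if this load succeeds, the config_class error is specific to the finetuning setup
print(type(model).__name__)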
