Doubts about using a GPTQ-for-LLaMa model with AutoGPTQ

#12
by coyude - opened

Thank you very much, TheBloke, for your work. I have previously used many of the GPTQ models you created. Now I want to try quantizing my own GPTQ model. Currently I am quantizing with GPTQ-for-LLaMa, but the resulting model only works with GPTQ-for-LLaMa: when I load it with AutoGPTQ, it generates incoherent responses. Is there a way to make the model compatible with both GPTQ-for-LLaMa and AutoGPTQ? Thank you very much! 😊

AutoGPTQ should support models made with GPTQ-for-LLaMa. Did you create a quantize_config.json file, or manually pass an appropriately configured BaseQuantizeConfig() in your Python code?

Gibberish output usually occurs when there is a mismatch on the desc_act/--act-order setting, e.g. if you made the model with --act-order in GPTQ-for-LLaMa but then didn't set "desc_act": true in quantize_config.json for AutoGPTQ, or vice versa.

Thank you very much! 😁👍 The problem has been resolved.

coyude changed discussion status to closed
