Please tell me about GPTQ quantisation.
Thanks for your great work. I would like to ask about your GPTQ quantisation process. I quantised a model with AutoGPTQ, but afterwards I can't load it with AutoModelForCausalLM.from_pretrained(); it can only be loaded with AutoGPTQForCausalLM.from_quantized(). Can you tell me what you modified?
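For reference, this is roughly what I'm seeing; the quantised model directory path here is just a placeholder:

```python
from transformers import AutoModelForCausalLM
from auto_gptq import AutoGPTQForCausalLM

quantized_dir = "/content/llama-1B-gptq"  # placeholder path to my AutoGPTQ output

# Loading through plain Transformers fails for me after quantising with AutoGPTQ:
try:
    model = AutoModelForCausalLM.from_pretrained(quantized_dir, device_map="auto")
except Exception as e:
    print(f"Transformers load failed: {e}")

# Loading through AutoGPTQ works:
model = AutoGPTQForCausalLM.from_quantized(quantized_dir, device="cuda:0", use_safetensors=True)
```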
Probably just a typo (correct me if I'm wrong), but HF Transformers does not support GPTQ, since HF has bitsandbytes 4-bit integrated instead?
Hugging Face Transformers has supported GPTQ for a while now, at least six weeks. All my GPTQ examples now use Transformers directly. Transformers uses AutoGPTQ for the kernels, so AutoGPTQ is still a required install.
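For example, loading a Transformers-compatible GPTQ model looks roughly like this; the repo ID is just an example, and depending on your versions you may also need optimum installed alongside auto-gptq:

```python
# pip install transformers optimum auto-gptq   (exact requirements depend on your versions)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Llama-2-7B-GPTQ"  # example GPTQ repo; any Transformers-compatible GPTQ model works

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers reads the quantization_config from config.json and uses the AutoGPTQ kernels.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```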
@mjw98
Your issue is likely the model name. By default, AutoGPTQ saves the model with a name like gptq-4bit-128g.safetensors. This name cannot be loaded by Transformers; Transformers requires the model file to be called model.safetensors.
When making your GPTQ model with AutoGPTQ, pass model_basename="model" and it will work.
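In plain AutoGPTQ code the idea looks roughly like this; depending on your AutoGPTQ version, the saved filename comes from the model_file_base_name field of the quantize config (the same key you see in quantize_config.json). The paths and the tiny calibration example below are placeholders only:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_dir = "/content/llama-1B"   # unquantised model (placeholder path)
output_dir = "/content/llama-1B-gptq"  # where the quantised model is written (placeholder path)

quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
    # In the AutoGPTQ versions I've used, this field controls the saved filename,
    # so the output becomes model.safetensors instead of gptq_model-4bit-128g.safetensors.
    model_file_base_name="model",
)

tokenizer = AutoTokenizer.from_pretrained(pretrained_dir, use_fast=True)

# A single toy calibration example just to keep the sketch self-contained;
# use real calibration data (e.g. samples from c4) for an actual quantisation.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")]

model = AutoGPTQForCausalLM.from_pretrained(pretrained_dir, quantize_config)
model.quantize(examples)

model.save_quantized(output_dir, use_safetensors=True)
tokenizer.save_pretrained(output_dir)
```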
Check out my simple AutoGPTQ wrapper script: https://github.com/TheBlokeAI/AIScripts/blob/main/quant_autogptq.py - it will set the basename to model automatically, so the output will be compatible with Transformers.
Thank you very much for your suggestion, but I still don't understand where to change model_basename. I tried the AutoGPTQ wrapper script you provided, but the output is still gptq-4bit-128g.safetensors. I quantised with the following command: python quant_autogptq.py "/content/llama-1B" "777" "c4", and got the result shown below.
My guess is to rename the safetensors file and change "model_file_base_name": "gptq_model-4bit-128g" to "model_file_base_name": "model" in quantize_config.json. I hope to get your guidance.
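In case manually renaming is a valid workaround, this is roughly what I have in mind (the directory path is a placeholder for my output folder):

```python
import glob
import json
import os

quantized_dir = "/content/llama-1B-gptq"  # placeholder for my AutoGPTQ output directory

# Rename the existing safetensors file to the name Transformers expects.
(existing,) = glob.glob(os.path.join(quantized_dir, "*.safetensors"))
os.rename(existing, os.path.join(quantized_dir, "model.safetensors"))

# Point model_file_base_name in quantize_config.json at the new name.
config_path = os.path.join(quantized_dir, "quantize_config.json")
with open(config_path) as f:
    quantize_config = json.load(f)
quantize_config["model_file_base_name"] = "model"
with open(config_path, "w") as f:
    json.dump(quantize_config, f, indent=2)
```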