AutoGPTQ produces garbled output; please provide a merged version of the model

#7
by Amadeusystem - opened

In several mainstream interfaces, such as text-generation-webui, AutoGPTQ cannot load all of the model files, which leads to completely garbled output.

AutoGPTQ prints the following messages:
INFO:Loading Qwen_Qwen-14B-Chat-Int4...
WARNING:More than one .safetensors model has been found. The last one will be selected. It could be wrong.
INFO:The AutoGPTQ params are: {'model_basename': 'model-00005-of-00005', 'device': 'cuda:0', 'use_triton': False, 'inject_fused_attention': True, 'inject_fused_mlp': True, 'use_safetensors': True, 'trust_remote_code': True, 'max_memory': None, 'quantize_config': None, 'use_cuda_fp16': True, 'disable_exllama': False}
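
For clarity, those parameters correspond roughly to a `from_quantized` call like the sketch below (the actual text-generation-webui code differs, and the local path is assumed). The failure mode is visible in the warning: with a sharded checkpoint, the loader picks a single shard, `model-00005-of-00005`, as `model_basename`, so the weights load incompletely and the output degenerates into garbage.

```python
from auto_gptq import AutoGPTQForCausalLM

# Sketch of the call the webui effectively makes, using the params from the log above.
# "models/Qwen_Qwen-14B-Chat-Int4" is an assumed local path.
# model_basename points at only one shard of a sharded checkpoint, which is the problem.
model = AutoGPTQForCausalLM.from_quantized(
    "models/Qwen_Qwen-14B-Chat-Int4",
    model_basename="model-00005-of-00005",
    device="cuda:0",
    use_triton=False,
    inject_fused_attention=True,
    inject_fused_mlp=True,
    use_safetensors=True,
    trust_remote_code=True,
    disable_exllama=False,
)
```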

Please load the model with the AutoModelForCausalLM.from_pretrained method as described in the instructions, and avoid loading it with AutoGPTQ's from_quantized method.
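
For reference, a minimal sketch of the recommended loading path. The model id follows the naming in the log, and the chat() helper comes from the model's remote code (hence trust_remote_code=True):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-14B-Chat-Int4"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# from_pretrained resolves all of the sharded .safetensors files itself,
# avoiding the shard-selection warning shown in the log above.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
).eval()

response, history = model.chat(tokenizer, "Hello", history=None)
print(response)
```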

jklj077 changed discussion status to closed
