Error in the generation code of the quantized model Dorna-Llama3-8B-Instruct-Quantized4Bit

#7
by smasadifar - opened

Hello friends,
I want to use the quantized model in my project, but the example code on the model card at the link below raises an error.
https://huggingface.co/amirMohammadi/Dorna-Llama3-8B-Instruct-Quantized4Bit

(screenshot attached: error.JPG)

```
ValueError                                Traceback (most recent call last)
<cell line: 8>()
      6
      7 tokenizer = AutoTokenizer.from_pretrained(model_path)
----> 8 model = AutoModelForCausalLM.from_pretrained(
      9     model_path,
     10     torch_dtype=torch.bfloat16,

[4 frames]

/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics, tied_params_map)
    356 if value is not None:
    357     if old_value.shape != value.shape:
--> 358         raise ValueError(
    359             f'Trying to set a tensor of shape {value.shape} in "{tensor_name}" (which has shape {old_value.shape}), this look incorrect.'
    360         )

ValueError: Trying to set a tensor of shape torch.Size([29360128, 1]) in "weight" (which has shape torch.Size([4096, 14336])), this look incorrect.
```

Please help me solve this error.
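A hedged sketch of a possible fix, not an official answer: the shapes in the traceback suggest the checkpoint stores bitsandbytes 4-bit packed weights (the saved tensor has exactly half as many elements as the bf16 weight it should fill, i.e. two 4-bit values per stored byte), so loading it with `torch_dtype=torch.bfloat16` alone mismatches. The sketch below checks that arithmetic and then loads with a `BitsAndBytesConfig` instead; the quant type (`nf4`) and whether an explicit config is even needed are assumptions about this repo. The actual download is wrapped in a function so the shape check runs on its own.

```python
# The traceback's shapes hint at the cause: the checkpoint tensor has
# exactly half as many elements as the bf16 weight it should fill,
# the signature of bitsandbytes 4-bit packing (two 4-bit values per byte).
packed_elems = 29360128 * 1   # shape [29360128, 1] from the error
full_elems = 4096 * 14336     # shape [4096, 14336] of the bf16 weight
assert packed_elems * 2 == full_elems

model_path = "amirMohammadi/Dorna-Llama3-8B-Instruct-Quantized4Bit"

def load_quantized(path: str):
    """Load the pre-quantized checkpoint as 4-bit weights instead of bf16.

    Imports are kept inside the function so the shape check above runs
    even where torch/transformers are not installed.
    """
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Assumption: the repo stores bitsandbytes NF4 weights. If its
    # config.json already embeds a quantization_config, a recent
    # transformers/accelerate/bitsandbytes stack may load the model
    # without this explicit config.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(
        path,
        quantization_config=bnb_config,  # replaces torch_dtype=torch.bfloat16
        device_map="auto",
    )
    return tokenizer, model
```

It may also be worth upgrading the stack first (`pip install -U transformers accelerate bitsandbytes`), since older versions could not deserialize pre-quantized 4-bit checkpoints and fail with exactly this kind of shape error.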
