Error in the generation code of Dorna-Llama3-8B-Instruct-Quantized4Bit

#1
by smasadifar - opened

Hello friends,
I want to use the quantized model for my project, but the code on the model description page linked below raises an error.
https://huggingface.co/amirMohammadi/Dorna-Llama3-8B-Instruct-Quantized4Bit

[Attached screenshot: error.JPG]

Please help me solve the error.


Hi,

This repository is not managed by us, so the main author would be responsible for fixing it, and there could be various reasons for the error. That said, we took a look at the repository and could reproduce your error. It appears that config.json is missing a configuration entry: the file has no quantization_config, which is required when the safetensors files have been quantized with bitsandbytes.
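If you want to confirm this on your side, a minimal sketch like the one below checks whether a local copy of config.json contains a quantization_config entry. It assumes you have already downloaded config.json from the repository; the function name `has_quantization_config` and the file path are only illustrative.

```python
import json
from pathlib import Path


def has_quantization_config(config_path):
    """Return True if the config file has a quantization_config section.

    config_path is assumed to point at a local copy of the model's
    config.json (hypothetical path, adjust to your download location).
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # Pre-quantized bitsandbytes checkpoints are expected to carry this
    # key as a JSON object; a missing or empty value counts as absent.
    section = config.get("quantization_config")
    return isinstance(section, dict) and len(section) > 0


# Example usage (path is an assumption):
# print(has_quantization_config("config.json"))
```

If this returns False for the quantized repository, that matches what we saw when reproducing the error.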

You may consider using the Dorna GGUF formats instead. Alternatively, a simple solution is to use bitsandbytes to load the model in 4 bits.
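For the bitsandbytes route, here is a rough sketch of 4-bit loading with transformers. It assumes that `transformers`, `torch`, `accelerate`, and `bitsandbytes` are installed and that a CUDA GPU is available. The NF4 settings are common defaults rather than anything specific to this model, and the helper names `bnb_4bit_settings` and `load_model_4bit` are made up for illustration. Pass the id of a non-quantized checkpoint as `model_id`; the id in the usage comment is an assumption, so please check it against the model page.

```python
def bnb_4bit_settings():
    """Plain keyword arguments for BitsAndBytesConfig (assumed defaults)."""
    return {
        "load_in_4bit": True,               # quantize weights to 4 bits at load time
        "bnb_4bit_quant_type": "nf4",       # NormalFloat4 quantization
        "bnb_4bit_use_double_quant": True,  # also quantize the quantization constants
    }


def load_model_4bit(model_id):
    """Load tokenizer and model, quantizing to 4 bits while loading."""
    # Heavy imports are kept inside the function so the settings above
    # can be inspected without a GPU or the libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,
        **bnb_4bit_settings(),
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # requires accelerate
    )
    return tokenizer, model


# Example usage (model id is an assumption, verify it first):
# tokenizer, model = load_model_4bit("PartAI/Dorna-Llama3-8B-Instruct")
```

Because the quantization settings are passed explicitly at load time, this path does not depend on a quantization_config entry in config.json.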
