The Q8_0 Quant seems to be broken!

#1
by redrix - opened

Greetings! Once again, thank you for quantizing my models.
I just tried running the Q8_0 GGUF Quant via OobaBooga. The Q6_K Quant runs just fine, without any errors. The Q8_0 Quant, on the other hand, returned the following error:

21:45:02-103604 ERROR    Failed to load the model.                                                                   
Traceback (most recent call last):
  File "/home/redrix/Applications/text-generation-webui/modules/ui_model_menu.py", line 222, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/redrix/Applications/text-generation-webui/modules/models.py", line 93, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/redrix/Applications/text-generation-webui/modules/models.py", line 278, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/redrix/Applications/text-generation-webui/modules/llamacpp_model.py", line 111, in from_pretrained
    result.model = Llama(**params)
                   ^^^^^^^^^^^^^^^
  File "/home/redrix/Applications/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda_tensorcores/llama.py", line 369, in __init__
    internals.LlamaModel(
  File "/home/redrix/Applications/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda_tensorcores/_internals.py", line 56, in __init__
    raise ValueError(f"Failed to load model from file: {path_model}")
ValueError: Failed to load model from file: models/GodSlayer-12B-ABYSS.Q8_0.gguf

Exception ignored in: <function LlamaCppModel.__del__ at 0x7fb68bc0eca0>
Traceback (most recent call last):
  File "/home/redrix/Applications/text-generation-webui/modules/llamacpp_model.py", line 62, in __del__
    del self.model
        ^^^^^^^^^^
AttributeError: 'LlamaCppModel' object has no attribute 'model'

Have you verified that you downloaded the full file? The file should have a SHA-256 of 675cfffa46a9e2ff5dd95981fcb505b8fde1bed5ff9670c3d6afd823e77b410f
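
For reference, a quick local check is a few lines of Python with the standard-library hashlib (the file path below is taken from the traceback above; adjust it to wherever your copy landed):

    import hashlib

    EXPECTED = "675cfffa46a9e2ff5dd95981fcb505b8fde1bed5ff9670c3d6afd823e77b410f"

    sha = hashlib.sha256()
    # Hash the file in 1 MiB chunks so a multi-gigabyte GGUF never has to fit in RAM.
    with open("models/GodSlayer-12B-ABYSS.Q8_0.gguf", "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)

    print("OK" if sha.hexdigest() == EXPECTED else "MISMATCH", sha.hexdigest())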

Indeed, the checksums are identical. I've already downloaded the file twice, and both copies had the exact same checksum.

I've downloaded the Q8_0 and it works just fine with llama-cli, so this is either some local/usage problem (out of memory?) or an issue with OobaBooga. Try with llama.cpp and see if that works.
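
If you want to take Ooba out of the loop entirely, here's a minimal sketch using the llama-cpp-python bindings (the same library Ooba calls into; the path and parameters are illustrative, not what Ooba actually passes):

    from llama_cpp import Llama  # pip install llama-cpp-python

    # Load the same GGUF directly, bypassing the webui entirely.
    llm = Llama(
        model_path="models/GodSlayer-12B-ABYSS.Q8_0.gguf",
        n_ctx=2048,      # small context window to keep memory use down
        n_gpu_layers=0,  # CPU-only load, to rule out running out of VRAM
    )

    # A tiny completion to confirm the model actually generates.
    print(llm("Hello", max_tokens=8)["choices"][0]["text"])

If this loads fine, the problem is in Ooba's loader or its bundled llama_cpp_cuda_tensorcores build rather than the quant itself.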

mradermacher changed discussion status to closed

Most likely it's something with Ooba, or perhaps Linux is just messing with it. My hardware can run your other Q8_0 Quants just fine.
Thanks though!
