Error on load
On Oobabooga, ExLlamav2_HF, default sliders and use_fast ticked:
Traceback (most recent call last):
  File "A:\LLaMa\text-generation-webui\modules\ui_model_menu.py", line 210, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "A:\LLaMa\text-generation-webui\modules\models.py", line 85, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "A:\LLaMa\text-generation-webui\modules\models.py", line 363, in ExLlamav2_HF_loader
    return Exllamav2HF.from_pretrained(model_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "A:\LLaMa\text-generation-webui\modules\exllamav2_hf.py", line 162, in from_pretrained
    config.prepare()
  File "A:\LLaMa\text-generation-webui\installer_files\env\Lib\site-packages\exllamav2\config.py", line 111, in prepare
    with safe_open(st_file, framework = "pt", device = "cpu") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
Your other models (like SynthIA) work fine. Cheers!
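For what it's worth, a MetadataIncompleteBuffer error from safetensors usually means the file is shorter than its own header says it should be, which points at a truncated file rather than a bad quant. Here is a minimal stdlib-only sketch that checks for truncation, assuming the documented safetensors layout (an 8-byte little-endian header length, then the JSON header, then the raw tensor data); the function name is my own:

```python
import json
import struct
from pathlib import Path

def check_safetensors(path):
    """Report whether a .safetensors file looks truncated.

    The format begins with an 8-byte little-endian integer N, followed
    by an N-byte JSON header, followed by the raw tensor data.
    """
    data = Path(path).read_bytes()
    if len(data) < 8:
        return "truncated: missing 8-byte header length"
    (n,) = struct.unpack("<Q", data[:8])
    if len(data) < 8 + n:
        return f"truncated: header claims {n} bytes, file is too short"
    header = json.loads(data[8:8 + n])
    # Each tensor entry records [start, end] byte offsets into the data
    # section; the largest end offset is the minimum data size required.
    expected_end = max(
        (v["data_offsets"][1] for k, v in header.items() if k != "__metadata__"),
        default=0,
    )
    actual = len(data) - 8 - n
    if actual < expected_end:
        return f"truncated: expected {expected_end} data bytes, found {actual}"
    return "ok"
```

Running this over each shard of a freshly downloaded model should tell you whether the download or the upload is incomplete without having to load it into the webui.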
Hmm, you're right, there's something wrong with these models. I tested a few and got tensor errors. I'll need to take these down and requant when I get a chance.
Edit: Sorry, I take that back. I had to recompile exllamav2 for my ooba to work; I can load this model just fine locally. Please try using the exllamav2 loader and not the _HF version.
Try installing the latest exllamav2 wheel referenced in this response.
Commit dea90c7 states "Bump exllamav2 to 0.0.8". I'm assuming that commit ships the updated exllamav2 wheels, so I shouldn't need to recompile. Same error.
Works for me on Linux. Two things to try:
- Do a fresh installation of ooba: move your installer_files directory to installer_files.old and rerun the start_windows.bat file.
- Corrupted model? Might need to download the model again.
On Windows 10.
- Reinstalled Ooba on your recommendation, using git clone and then running start_windows.bat to refresh all dependencies. Same error.
- Have downloaded the model twice and confirmed both copies threw the same error before ruling out file corruption and starting this thread. Also, Opus 2.4bpw faces the same issue, but your other models that I've tried work fine.
I just re-downloaded the model from Huggingface to make sure it wasn't a corrupted upload. It works fine on my Linux box. It's possibly a bug with Windows; Turboderp mentions fixing some Windows-specific bug with the exllamav2 loader on Discord here:
You may have to wait for an updated Exllamav2 pip package or try under Linux or WSL2 to see if Windows is in fact the issue.
Can't find the developer in the Ooba Discord, but I've followed up with a GitHub issue: https://github.com/turboderp/exllamav2/issues/173
Thanks for your help.
Solved: https://github.com/turboderp/exllamav2/issues/173#issuecomment-1826462410
This turned out to be a problem with Ooba's downloader: the UI reports that a model download is "done" when it isn't. Re-running the download completes the missing parts of the model. That's a separate issue which I'll escalate to Oobabooga, but this model at least works now.
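For anyone hitting the same thing: a quick way to confirm an incomplete download is to compare on-disk file sizes against the sizes listed on the model page. A minimal sketch, where the expected_sizes mapping is an assumption you'd fill in yourself (e.g. from the file listing on the Hugging Face repo page, or programmatically via huggingface_hub's HfApi().model_info(repo, files_metadata=True)):

```python
import os

def find_incomplete_files(expected_sizes, local_dir):
    """Return files whose on-disk size differs from the expected size.

    expected_sizes: mapping of relative filename -> expected byte size
    (hypothetical input, taken from the repo's file listing).
    Returns {filename: (actual_bytes, expected_bytes)} for mismatches;
    missing files are reported with an actual size of 0.
    """
    problems = {}
    for name, expected in expected_sizes.items():
        path = os.path.join(local_dir, name)
        actual = os.path.getsize(path) if os.path.exists(path) else 0
        if actual != expected:
            problems[name] = (actual, expected)
    return problems
```

An empty result means every listed file is present at its full size; anything else is a shard the downloader silently left short.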