Not working on gptneox.cpp
I've tried to use the ggml-pythia-160m-deduped-q4_3.bin, but no luck so far. Did you use a specific tool for it? I've tried with both main-gptneox and main-oasst but cannot get it to work:

```
main: seed = 1682903906
gptneox.cpp: loading model from ../../ggml-pythia-160m-deduped-q4_3.bin
error loading model: unexpectedly reached end of file
gptneox_init_from_file: failed to load model
main: error: failed to load model '../../ggml-pythia-160m-deduped-q4_3.bin'
```
Thanks for informing me about this.
llama.cpp's support for the q4_3 format was removed shortly after my conversions. This may also be the case for gptneox.cpp.
If by tool you mean what I used to run the AI, I used KoboldCpp.
I'll see if I can re-convert them and upload them using a supported format like q5_0, q5_1 and/or q8_0 instead. I'm new to this, haha.
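In the meantime, if you want to check what format a given .bin actually claims to be, the first few header fields are enough to tell. Here's a minimal Python sketch, assuming the classic single-file GGML layout (a uint32 magic followed by int32 hyperparameters, one of which is the ftype); the exact field order depends on the converter, so treat the offsets as a guess and eyeball the output:

```python
import struct
import sys

# Quick header dump for a single-file GGML model.
# Assumption (not taken from gptneox.cpp itself): the file starts with
# a uint32 magic (0x67676d6c, "ggml") followed by a handful of int32
# hyperparameters, one of which is the ftype. In the old ggml enum,
# q4_0=2, q4_1=3, q4_2=5, q4_3=6, q8_0=7, q5_0=8, q5_1=9, so a 6 in
# the header would point at the removed q4_3 format.
with open(sys.argv[1], "rb") as f:
    (magic,) = struct.unpack("<I", f.read(4))
    tag = "ggml" if magic == 0x67676D6C else "unexpected"
    print(f"magic: 0x{magic:08x} ({tag})")
    for i in range(8):
        chunk = f.read(4)
        if len(chunk) < 4:
            break  # truncated file; this alone can explain an EOF error
        print(f"header int {i}: {struct.unpack('<i', chunk)[0]}")
```

Saved as, say, dump_header.py, you'd run it with `python dump_header.py ../../ggml-pythia-160m-deduped-q4_3.bin`; if my assumption about the layout holds, the q4_3 file should show a 6 among those ints.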
Alright, I tested out gptneox.cpp and Pythia 160M Deduped, and I got the same "unexpectedly reached end of file" error as you.
I tested:
- the 16-bit ggml prior to quantization (I even re-converted it from the original Transformers checkpoints to see if anything changed, but the new conversion gave me a file identical to the old; see the hash check after this list)
- this repo's q4_3 model
- the q5_0, q5_1, and q8_0 I've converted and uploaded
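A byte-for-byte comparison is enough to verify that two conversions really match; something like this minimal, model-agnostic sketch works:

```python
import hashlib
import sys

def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks so large model files don't have to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# e.g. python same_file.py old-conversion.bin new-conversion.bin
a, b = sha256_of(sys.argv[1]), sha256_of(sys.argv[2])
print(a)
print(b)
print("identical" if a == b else "different")
```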
For now, I consider these models unsupported by gptneox.cpp, unfortunately. They work just fine under KoboldCpp.
I understand. Thanks for the feedback. The main idea for me is to use a "main" binary just like with all the other models out there. I'll keep an eye on KoboldCpp's library.