Not working on gptneox.cpp
I've tried to use the ggml-pythia-160m-deduped-q4_3.bin, but no luck so far. Did you use a specific tool for it? I've tried with both main-gptneox and main-oasst but cannot get it to work:

```
main: seed = 1682903906
gptneox.cpp: loading model from ../../ggml-pythia-160m-deduped-q4_3.bin
error loading model: unexpectedly reached end of file
gptneox_init_from_file: failed to load model
main: error: failed to load model '../../ggml-pythia-160m-deduped-q4_3.bin'
```
Thanks for informing me about this.
llama.cpp's support for the q4_3 format was removed shortly after my conversions. This may also be the case for gptneox.cpp.
If by tool you mean what I used to run the AI, I used KoboldCpp.
I'll see if I can re-convert them and upload them using a supported format like q5_0, q5_1 and/or q8_0 instead. I'm new to this, haha.
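In the meantime, if you want to check what format a given .bin actually claims to be, the first few header fields are enough to tell. Here's a minimal Python sketch, assuming the classic single-file GGML layout (a uint32 magic followed by int32 hyperparameters, one of which is the ftype); the exact field order depends on the converter, so treat the offsets as a guess and eyeball the output:

```python
import struct
import sys

# Quick header dump for a single-file GGML model.
# Assumption (not taken from gptneox.cpp itself): the file starts with
# a uint32 magic (0x67676d6c, "ggml") followed by a handful of int32
# hyperparameters, one of which is the ftype. In the old ggml enum,
# q4_0=2, q4_1=3, q4_2=5, q4_3=6, q8_0=7, q5_0=8, q5_1=9, so a 6 in
# the header would point at the removed q4_3 format.
with open(sys.argv[1], "rb") as f:
    (magic,) = struct.unpack("<I", f.read(4))
    tag = "ggml" if magic == 0x67676D6C else "unexpected"
    print(f"magic: 0x{magic:08x} ({tag})")
    for i in range(8):
        chunk = f.read(4)
        if len(chunk) < 4:
            break  # truncated file; this alone can explain an EOF error
        print(f"header int {i}: {struct.unpack('<i', chunk)[0]}")
```

Saved as, say, dump_header.py, you'd run it with `python dump_header.py ../../ggml-pythia-160m-deduped-q4_3.bin`; if my assumption about the layout holds, the q4_3 file should show a 6 among those ints.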
Alright, I tested out gptneox.cpp and Pythia 160M Deduped, and I got the same "unexpectedly reached end of file" error as you.
I tested:
- the 16-bit ggml prior to quantization (I even re-converted it from the original Transformers checkpoints to see if anything changed, but the new conversion gave me a file identical to the old; see the hash check after this list)
- this repo's q4_3 model
- the q5_0, q5_1, and q8_0 I've converted and uploaded
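A byte-for-byte comparison is enough to verify that two conversions really match; something like this minimal, model-agnostic sketch works:

```python
import hashlib
import sys

def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks so large model files don't have to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# e.g. python same_file.py old-conversion.bin new-conversion.bin
a, b = sha256_of(sys.argv[1]), sha256_of(sys.argv[2])
print(a)
print(b)
print("identical" if a == b else "different")
```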
For now, I consider these models unsupported by gptneox.cpp, unfortunately. They work just fine under KoboldCpp.
I understand. Thanks for the feedback. The main idea for me is to use a "main" binary just like with all the other models out there. I'll keep an eye on KoboldCpp's library.