failspy/Llama-3-70B-Instruct-abliterated-v3-GGUF

May 22

I love your abliterated models, but I would like to discourage splitting GGUFs into small parts unless necessary. If a weight file is above the Hugging Face upload limit, then splitting is useful; otherwise it makes managing and downloading the files more difficult for no added benefit.

For instance, a shell script that lists the models in a directory so you can choose one at load time will show an extra 4 entries for each GGUF split into 5 parts. Of course this can be rectified by recombining them, which is not difficult, but if you multiply that work by hundreds or thousands of downloads, there is a solid argument that a lot of aggregate time is wasted undoing an unneeded step.
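As a rough illustration of that kind of listing script, here is a minimal sketch that shows each split model only once. It assumes the shards follow the `-00001-of-0000N.gguf` naming seen in this repo; `list_models` is a hypothetical helper name, not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list GGUF models in a directory, one entry per model.
# Assumes shards are named "<name>-0000K-of-0000N.gguf".
list_models() {
  local dir="${1:-.}" f base
  for f in "$dir"/*.gguf; do
    [ -e "$f" ] || continue   # the directory has no .gguf files
    base="$(basename "$f")"
    case "$base" in
      # The first shard stands in for the whole split model.
      *-00001-of-[0-9][0-9][0-9][0-9][0-9].gguf) echo "$base" ;;
      # Any other shard is skipped so it does not clutter the listing.
      *-[0-9][0-9][0-9][0-9][0-9]-of-[0-9][0-9][0-9][0-9][0-9].gguf) ;;
      # Ordinary single-file model.
      *) echo "$base" ;;
    esac
  done
}
```

Calling `list_models ~/models` would then print one line per model, whether or not it is split. It works around the clutter but still leaves every downloader to handle the shards themselves.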

Thanks so much for your contributions to this field and community and I look forward to seeing more insightful and practical additions wherever you decide to spend your efforts.

failspy

Owner May 24

Ack! Funnily enough it's exactly my homebrew shell script that got me into this mess. I 120% agree with you it shouldn't be this painful.

I'll reupload them soon, stitched back together. Apologies to everyone having to fight with it.

Vlad100

May 24

I tried merging them into one file (Q6) and got an error when trying to load it into KoboldCpp.

Jobaar

May 24

edited May 24

> I tried merging them into one file (Q6) and got an error when trying to load it into KoboldCpp.

~/llm/llama.cpp/gguf-split --merge Llama-3-70B-Instruct-abliterated-v3_q6-00001-of-00007.gguf Llama-3-70B-Instruct-abliterated-v3-q6.gguf
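To repeat that merge for every split model in a directory, something like the following sketch might help. It uses only the `gguf-split --merge <first shard> <output>` form shown above; the `GGUF_SPLIT` variable, the `merge_all` name, and the output naming (the shard suffix simply stripped) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# GGUF_SPLIT is a hypothetical variable pointing at the gguf-split binary;
# the default path is taken from the command above, adjust it for your setup.
GGUF_SPLIT="${GGUF_SPLIT:-$HOME/llm/llama.cpp/gguf-split}"

# Hypothetical helper: merge every split GGUF found in a directory.
merge_all() {
  local dir="${1:-.}" first out
  for first in "$dir"/*-00001-of-[0-9][0-9][0-9][0-9][0-9].gguf; do
    [ -e "$first" ] || continue   # no split models in this directory
    # Strip the "-00001-of-0000N" shard suffix to name the merged file.
    out="${first%-00001-of-[0-9][0-9][0-9][0-9][0-9].gguf}.gguf"
    [ -e "$out" ] && continue     # already merged, leave it alone
    "$GGUF_SPLIT" --merge "$first" "$out" || return 1
  done
}
```

Run it as `merge_all /path/to/models`. The original shards are left in place, so check that the merged file loads before deleting them.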

failspy

Owner May 24

Good to know, thanks.

clessvna

May 27

llama.cpp already supports loading from split files.


Splitting ggufs