software for generating quants

#1
by sloshywings - opened

Hi,

I see you were able to generate more quants than are available through llama.cpp.
Can you link the GitHub repo for the tools you use to make GGUF quants?

Which of the quants we generated is not available in llama.cpp? They all should be, since we use a barely modified private llama.cpp fork to generate them. It would be pointless to generate a quant that official llama.cpp does not support, as nobody could run it.

I see, I also just found this, thought I would share:

You can use the --quantize option of ollama create to produce these (additional) quantizations, including the K-means quants:

ollama create --quantize q4_K_M mymodel

https://github.com/ollama/ollama/blob/main/docs/import.md

supported quants:
q4_0
q4_1
q5_0
q5_1
q8_0
K-means quantizations:
q3_K_S
q3_K_M
q3_K_L
q4_K_S
q4_K_M
q5_K_S
q5_K_M
q6_K

Can you explain more clearly what you mean? All of the quants we generate are available in llama.cpp, and all of the quants you list as "additional" are available in llama.cpp as well.
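For anyone comparing the two lists, here is a minimal sketch of how the names above line up. It assumes that the llama.cpp quantize tool is built as llama-quantize, that it takes an input file, an output file, and a type name, and that it spells the types in upper case (Q4_K_M rather than q4_K_M). The file names are hypothetical placeholders, and the script only prints the commands; it does not run them.

```python
import shlex

# Quant names exactly as listed in the post above (Ollama's spelling).
OLLAMA_QUANTS = [
    "q4_0", "q4_1", "q5_0", "q5_1", "q8_0",
    "q3_K_S", "q3_K_M", "q3_K_L",
    "q4_K_S", "q4_K_M", "q5_K_S", "q5_K_M", "q6_K",
]


def llama_cpp_type(ollama_name: str) -> str:
    """Return the upper-case spelling, e.g. q4_K_M -> Q4_K_M."""
    return ollama_name.upper()


def quantize_command(src: str, dst: str, ollama_name: str) -> list[str]:
    """Build (but do not run) a llama-quantize command line.

    Assumption: the binary is called `llama-quantize` and takes
    <input.gguf> <output.gguf> <type>. Check your build's --help.
    """
    return ["llama-quantize", src, dst, llama_cpp_type(ollama_name)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in OLLAMA_QUANTS:
        out = f"model-{llama_cpp_type(name)}.gguf"
        print(shlex.join(quantize_command("model-f16.gguf", out, name)))
```

Every name in the list maps to a type that llama.cpp itself can produce, so nothing in it is specific to Ollama.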

OK, I was just confused, sorry about that. I wasn't seeing the K-means quants in llama.cpp, but I have now found the method parameters. Thanks.

sloshywings changed discussion status to closed

makes sense, good luck :)
