Clarify what is what, please.

#1
by sphiratrioth666 - opened

If I understand correctly, all your files are quantized this way? You're saying that f16 Q5 and Q6 are better than Q8_0. You provide files called f16, then just q5 and q6 and then q8_0, q8_p. So what is what?

Are the plain q5 and q6 files as smart as the f16 file? Please, just clarify what is what since I cannot follow your naming pattern when you start differentiating in that particular method you're using. It might be a great improvement over other quantizations but still - what is what?

Thx for clarification and keep up the good work!

Owner

> If I understand correctly, all your files are quantized this way? You're saying that f16 Q5 and Q6 are better than Q8_0. You provide files called f16, then just q5 and q6 and then q8_0, q8_p. So what is what?

q8_0 is quantized with f16 for the output and embedding tensors and q8_0 for all the others.
q8_p is quantized using the --pure flag of the quantization program.

> Are the plain q5 and q6 files as smart as the f16 file? Please, just clarify what is what since I cannot follow your naming pattern when you start differentiating in that particular method you're using. It might be a great improvement over other quantizations but still - what is what?

You are right, the naming is confusing because I changed it over time and I'm too lazy to change them all back.

> Thx for clarification and keep up the good work!

Everything is explained here: https://huggingface.co/RobertSinclair
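
A minimal sketch of how the two Q8 variants described above might be produced, assuming the "quantization program" is llama.cpp's `llama-quantize` (the thread does not name it). The helper `build_quantize_cmd` and the file names are hypothetical; the flags `--output-tensor-type`, `--token-embedding-type` and `--pure` are the llama.cpp options that appear to match the description, not a confirmed record of the owner's exact commands. The function only builds the argument list and does not run anything.

```python
from typing import List


def build_quantize_cmd(src: str, dst: str, variant: str) -> List[str]:
    """Build a llama-quantize argument list for one of the two Q8 variants.

    Hypothetical helper for illustration; `variant` uses the file suffixes
    from the thread ("q8_0" or "q8_p").
    """
    cmd = ["llama-quantize"]
    if variant == "q8_0":
        # Assumption: keep output and token-embedding tensors in f16,
        # quantize every other tensor to Q8_0.
        cmd += ["--output-tensor-type", "f16",
                "--token-embedding-type", "f16"]
    elif variant == "q8_p":
        # Assumption: --pure quantizes all tensors to the same type (Q8_0 here).
        cmd += ["--pure"]
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    # Positional arguments: input file, output file, target quantization type.
    cmd += [src, dst, "Q8_0"]
    return cmd


if __name__ == "__main__":
    for variant in ("q8_0", "q8_p"):
        out = f"model.{variant}.gguf"  # hypothetical output name
        print(" ".join(build_quantize_cmd("model.f16.gguf", out, variant)))
```

Under these assumptions, the only difference between the two files is whether the output and embedding tensors stay in f16 (`q8_0`) or are quantized like everything else (`q8_p`).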

Ok! Thx. Great job, again - and thank you.
