Please make i1 quants of my latest 72b model
I really appreciate your work. I just released my best model yet. I would love it if you could make some i1 quants of it.
Sure! It's queued, would be a shame if we didn't have quants for it :)
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Rombo-LLM-V3.0-Qwen-72b-GGUF for quants to appear.
@rombodawg hi! I would like to know why we would use i1 quants at all... like, the general consensus, I think, is that Q4 is pretty good, and anything below is kinda not worth it anymore, where just getting a smaller model at Q4 makes more sense performance-wise.
I have heard of those 1.58-bit quants, which apparently perform surprisingly well, but I'm assuming those are not the same...
Would you mind explaining it to me? I am very curious :)
i1-IQ4_XS, for example, is better quality than Q4_K_M at ~20% smaller size. And i1-IQ1_S quants are ~1.58 bpw and perform "surprisingly well" (but much worse than Q4_K).
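To put bpw in perspective for a 72b model, here is a back-of-the-envelope sketch (the bpw figures are nominal values, and it ignores GGUF metadata overhead, so real file sizes differ a bit):

```python
# Back-of-the-envelope: file size in GB from bits-per-weight (bpw).
# Ignores GGUF metadata and per-tensor overhead, so real files differ a bit.
PARAMS = 72e9  # a 72B-parameter model

for name, bpw in [("i1-IQ4_XS", 4.25), ("i1-IQ1_S", 1.58)]:
    gb = PARAMS * bpw / 8 / 1e9  # bits -> bytes -> GB
    print(f"{name} (~{bpw} bpw): ~{gb:.0f} GB")
```

So the ~1.58 bpw quant of a 72b model fits in roughly 14 GB, versus roughly 38 GB at ~4.25 bpw.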
@mradermacher Oooh, so an i1 quant is not just a "one-bit quant" but some interesting in-between? That is super cool! :o
Is that some new fancy quant method I completely missed? Or has it just always been around?
(also, please @me, otherwise I don't get the notification)
@Smorty100 No, it's just our naming convention for imatrix quants.
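In case you're curious how they're made: llama.cpp first computes an "importance matrix" over some calibration text, then uses it to weight the rounding during quantization. A minimal sketch of that pipeline, assuming the llama-imatrix and llama-quantize tools from a current llama.cpp build are on PATH (all file names here are placeholders):

```python
# Minimal sketch of producing an imatrix ("i1") quant with llama.cpp's tools.
import subprocess

MODEL_F16 = "model-f16.gguf"    # full-precision GGUF conversion of the model
CALIB_TXT = "calibration.txt"   # representative text used to collect statistics
IMATRIX = "imatrix.dat"

# Step 1: run the model over the calibration text and record which weights
# contribute most to the activations (the importance matrix).
subprocess.run(
    ["llama-imatrix", "-m", MODEL_F16, "-f", CALIB_TXT, "-o", IMATRIX],
    check=True,
)

# Step 2: quantize, letting the importance matrix steer precision toward the
# weights that matter most.
subprocess.run(
    ["llama-quantize", "--imatrix", IMATRIX, MODEL_F16,
     "model.i1-IQ4_XS.gguf", "IQ4_XS"],
    check=True,
)
```

The same imatrix gets reused for every quant type (IQ1_S, IQ4_XS, Q4_K_M, ...), which is why the whole i1 family of a model appears together.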
@mradermacher
This model looks like a banger, can we get i1 quants of it please? It's an uncensored R1 distill 70b, no China censorship.
https://huggingface.co/perplexity-ai/r1-1776-distill-llama-70b
Queued, on our newest experimental quant node, too. Keep your fingers crossed that it works.
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#r1-1776-distill-llama-70b-GGUF for quants to appear.