Dynamic bnb-4bit

by iqdddd - opened 5 days ago

Discussion

iqdddd

5 days ago

If I may ask, is there a plan to publish code for quantization in Dynamic 4-bit, or to perhaps offer it for sale?

shimmyshimmer

Unsloth AI org 4 days ago

Each model requires different quantization. We open-sourced the dynamic quants for our R1-GGUF repo here: https://github.com/unslothai/llama.cpp

TobDeBer

4 days ago

@shimmyshimmer If I may ask, why did you (unsloth) name it "dynamic quant"? What's dynamic about it?
What I can see is that the quants are customized per layer and tensor type.

Also, I didn't find any explicit mention of superweight handling. Is that done implicitly in the imatrix computation?
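
To make "customized per layer and tensor type" concrete, here is a minimal sketch of what such a selection rule could look like. It is purely illustrative: the rule table, the layer threshold, the fallback types, and the function names are all hypothetical and are not taken from the Unsloth fork. It only assumes GGUF-style tensor names such as `blk.<layer>.<tensor>.weight` and uses existing llama.cpp quant type names as labels.

```python
import re

# Hypothetical rules: (tensor-name pattern, layer predicate, quant type label).
# The first matching rule wins. None of this reflects Unsloth's actual choices.
RULES = [
    (r"ffn_down", lambda layer: layer < 3, "Q6_K"),  # keep early layers at higher precision
    (r"ffn_down", lambda layer: True, "Q2_K"),       # compress the rest aggressively
    (r"attn_", lambda layer: True, "Q4_K"),          # attention tensors in the middle
]
DEFAULT_BLOCK_TYPE = "Q4_K"  # assumed fallback for other per-layer tensors
NON_BLOCK_TYPE = "Q6_K"      # assumed choice for embeddings, output head, etc.


def layer_index(tensor_name):
    """Return the layer number from a name like 'blk.12.attn_q.weight', else None."""
    match = re.search(r"blk\.(\d+)\.", tensor_name)
    return int(match.group(1)) if match else None


def pick_quant(tensor_name):
    """Choose a quant type label from the tensor's layer and tensor type."""
    layer = layer_index(tensor_name)
    if layer is None:
        return NON_BLOCK_TYPE
    for pattern, layer_ok, quant_type in RULES:
        if re.search(pattern, tensor_name) and layer_ok(layer):
            return quant_type
    return DEFAULT_BLOCK_TYPE


for name in ["token_embd.weight", "blk.0.ffn_down.weight",
             "blk.10.ffn_down.weight", "blk.10.attn_q.weight"]:
    print(name, "->", pick_quant(name))
```

Nothing in a table like this changes at run time. The choice is fixed per tensor when the file is quantized, which is why the naming question above is a fair one.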
