New exciting quant method

#3
by Yhyu13 - opened

Hi,

@TheBloke this method looks very fast at producing quantized models (claimed to be 50x faster than GPTQ quantization for Llama-2-70B), and it requires NO calibration data. It seems worth a look.

PS
https://mobiusml.github.io/hqq_blog/
https://github.com/oobabooga/text-generation-webui/pull/4888

Mobius Labs GmbH org

Thanks a lot @Yhyu13! We are going to publish a new 2-bit quantized Mixtral model very soon, and it is much better than this one!
