Possible to convert to GGUF?
#1 by jackboot · opened
You think it would run on llama.cpp?
GGUF is not just a file format: using it implies running the model in llama.cpp, a framework which almost certainly does not support this model's architecture.
Did you try GPTQ as well? Or the only option is to load in 4 bit with bnb?
GPTQ is also very model-specific. I do not know of any out-of-the-box quantization solutions apart from BnB.
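For anyone landing here: a minimal sketch of the BnB route mentioned above, using the standard `transformers` + `bitsandbytes` 4-bit path. The model id is a placeholder (replace it with this repo's name), and `trust_remote_code=True` is assumed to be needed since the architecture is custom.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Placeholder id -- substitute the actual repo name of this model.
model_id = "org/model-name"

# NF4 4-bit quantization config; compute in bfloat16 for quality.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # custom architectures ship their own modeling code
)
```

This quantizes on the fly at load time, so it needs the full-precision weights downloaded first; unlike GGUF or GPTQ there is no pre-quantized artifact to share.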