Spammed exclamation marks
Strange, I'm not sure if this was on my side or the quant, but when I tried the Q4_K_M it just spammed "!" when responding. That didn't happen when I used https://huggingface.co/mradermacher/Llama-3.1-70B-ArliAI-RPMax-v1.1-GGUF though.
Yea, I think the GGUF quants are broken, so you should try the GPTQ or full weights. Or try it on our API.
I will reupload with fixed quants
Did you find what was wrong with the quants?
Did you get any error messages about NaN or similar when running llama-quantize?
I was just about to download the HF repo to train the control vectors on, but it might be worth holding off if there is a problem with GGUF conversion, or else they won't be usable.
I usually did my quants on my Windows machine with the pre-built .exe and it worked for my smaller models. But for this 70B my Windows machine didn't have enough RAM lol, so I used my Linux training machine and idk if I did it right tbh. There were no errors though. The GGUF files on the repo now are the ones made by mradermacher.
I'm just running the control vector training (using HF transformers) and there's no sign of any problems like zero-valued tensors, so I'm not sure why people are having problems converting this to GGUF and exl2 (somebody found a tensor in layer 15 had zero error IIRC).
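For anyone who wants to sanity-check the weights themselves before converting, here is a rough sketch of the kind of scan I mean: flag any tensor that contains NaN/Inf or is entirely zero. To be clear, this is not what llama-quantize or exl2 do internally. The function name `find_suspect_tensors` and the toy tensor names are made up for illustration, and it assumes you have already flattened each tensor into a plain list of floats (in practice you would load them from the HF checkpoint with whatever loader you normally use).

```python
import math
from typing import Dict, Iterable, List


def find_suspect_tensors(tensors: Dict[str, Iterable[float]]) -> Dict[str, List[str]]:
    """Return {tensor_name: [problems]} for tensors that look broken.

    Hypothetical helper: `tensors` maps a tensor name to its flattened values.
    A tensor is flagged if it contains NaN, contains Inf, or is all zeros.
    """
    report: Dict[str, List[str]] = {}
    for name, values in tensors.items():
        flat = [float(v) for v in values]
        problems: List[str] = []
        if any(math.isnan(v) for v in flat):
            problems.append("nan")
        if any(math.isinf(v) for v in flat):
            problems.append("inf")
        # An empty tensor is not reported as all-zero.
        if flat and all(v == 0.0 for v in flat):
            problems.append("all_zero")
        if problems:
            report[name] = problems
    return report


if __name__ == "__main__":
    # Toy data only; the names below are invented, not real checkpoint keys.
    toy = {
        "layer_a.weight": [0.1, -0.2, 0.3],
        "layer_b.weight": [0.0, 0.0, 0.0],
        "layer_c.weight": [1.0, float("nan")],
    }
    for name, problems in find_suspect_tensors(toy).items():
        print(name, problems)
```

If a scan like this comes back clean on the full weights, the problem is more likely in the conversion/quantization step than in the checkpoint itself.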