Thanks
#2
by
Ixel1
- opened
Just wanting to say thanks for quantising this so that I can fit it on an RTX 3090 GPU. So far, of all the models I've tried, this one performs best for my use case (analysing whether a user message contains certain words, or variations of those words intended to evade detection). It works nicely.
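That kind of evasion check can be sketched with a small regex helper; this is a hypothetical illustration (the substitution map, function names, and separator rule are my assumptions, not the actual detection logic):

```python
import re

# Illustrative leetspeak-style substitutions; extend as needed.
SUBSTITUTIONS = {"a": "[a@4]", "e": "[e3]", "i": "[i1!]", "o": "[o0]", "s": "[s5$]"}

def variation_pattern(word: str) -> re.Pattern:
    """Build a regex matching the word with common character swaps
    and up to two filler characters (e.g. '.', '-') between letters."""
    parts = [SUBSTITUTIONS.get(ch, re.escape(ch)) for ch in word.lower()]
    return re.compile(r"[\W_]{0,2}".join(parts), re.IGNORECASE)

def contains_variation(message: str, banned: list[str]) -> bool:
    """True if the message contains any banned word or an obfuscated variant."""
    return any(variation_pattern(w).search(message) for w in banned)
```

For example, `contains_variation("you sp4m-mer", ["spammer"])` matches even though the word is obfuscated, while a plain substring check would miss it.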
Ixel1
changed discussion status to
closed
@Ixel1, happy to hear it, though I should mention that this model is not the very latest. It was quantized with the wikitext parquet file many people use for calibration. I might do 70B 1.2b (2.55bpw and 2.3bpw) later; recent changes broke quantization/calibration on Windows, so I might need to switch OS if I want to do more :)