Congrats on this awesome model!
#2 opened by Nexesenex
I use the ggml version (q4_0) quantized by Mindrage, and I'm amazed!
It's surprisingly coherent for a 13B in terms of writing quality; I almost feel like I'm testing a 33B model, but with the speed of a smaller one. Bravo!
To be fully honest, I wish this model were trained with the 4096-token context length that bluemoonrp uses (it usually runs well at a 2816-3072 context size in KoboldCPP, for example); that would make it even more awesome!
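For reference, this is roughly how I raise the context size when launching it; a minimal sketch, assuming KoboldCPP's `--contextsize` flag and a hypothetical model filename:

```bash
# Launch KoboldCPP with an extended context window (model path is hypothetical).
# --contextsize sets the maximum context length the server will allocate.
python koboldcpp.py model-q4_0.bin --contextsize 3072
```

Note that pushing past the context length the model was trained on (2048 for most LLaMA-based 13B models) tends to degrade coherence, which is why a 4096-trained variant would help.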
This is a great model indeed! Very fast, and the quality is on par with larger models. Would love to see that 4096 ctx length if possible.