metadata
license: apache-2.0
A test quantization of OpenHermes-2.5-Mistral-7B by teknium using importance matrices computed on Ukrainian text, hopefully decreasing the coherence hit after quantization in Ukrainian at the cost of some performance in other languages.
Importance matrix was computed in roughly 20 minutes with a Ryzen 5 3550H and GTX 1650 with 8 layers offloaded.
Will be updated with perplexity testing later, probably? 😭 Haven't done proper tests quite yet, feels better than old quants when chatting in Ukrainian, hopefully I get around to actually benching it somehow