A test quantization of OpenHermes-2.5-Mistral-7B by teknium using importance matrices computed on Ukrainian text, hopefully decreasing the coherence hit after quantization in Ukrainian at the cost of some performance in other languages.
The importance matrix was computed in roughly 20 minutes on a Ryzen 5 3550H and a GTX 1650 with 8 layers offloaded, at a context size of 512.
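For reference, the computation looks roughly like the sketch below using llama.cpp's imatrix tool. The binary and file names are placeholders (they depend on your llama.cpp build and local paths), not the exact command used here.

```bash
# Sketch only: compute an importance matrix with llama.cpp.
# Model, calibration, and output file names are assumed placeholders.
./llama-imatrix \
  -m openhermes-2.5-mistral-7b.f16.gguf \
  -f calibration.txt \
  -o imatrix.dat \
  -ngl 8 \
  -c 512
```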
The calibration data is just a mix of my personal GPT chats, random words, and random Wikipedia articles, totaling about 15k-ish tokens. Definitely not optimal, but it's in the repo for anyone to tinker with, along with the computed imatrix.
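If you want to redo the quantization with your own calibration mix, applying an imatrix is just an extra flag on llama.cpp's quantize tool. A minimal sketch, with placeholder file names and an arbitrary Q4_K_M target:

```bash
# Sketch only: quantize while weighting by the computed importance matrix.
# File names and the Q4_K_M quant type are assumed, not the exact ones used.
./llama-quantize \
  --imatrix imatrix.dat \
  openhermes-2.5-mistral-7b.f16.gguf \
  openhermes-2.5-mistral-7b.Q4_K_M.gguf \
  Q4_K_M
```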
Will be updated with perplexity testing later, probably? 😭 I haven't done proper tests quite yet; it feels better than the old quants when chatting in Ukrainian, and hopefully I get around to actually benching it somehow.
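When that happens, it'll probably be something along the lines of llama.cpp's perplexity tool on a Ukrainian text file; a sketch with placeholder file names:

```bash
# Sketch only: measure perplexity of a quantized model on a test corpus.
# The model and test file names are assumed placeholders.
./llama-perplexity \
  -m openhermes-2.5-mistral-7b.Q4_K_M.gguf \
  -f ukrainian-test.txt \
  -ngl 8
```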