A test quantization of OpenHermes-2.5-Mistral-7B by teknium using importance matrices computed on Ukrainian text, hopefully decreasing the coherence hit after quantization in Ukrainian at the cost of some performance in other languages.
The importance matrix was computed in roughly 20 minutes on a Ryzen 5 3550H and a GTX 1650 with 8 layers offloaded, at a context size of 512.
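For reference, the computation looks roughly like the sketch below using llama.cpp's imatrix tool. The binary and file names are placeholders (they depend on your llama.cpp build and local paths), not the exact command used here.

```bash
# Sketch only: compute an importance matrix with llama.cpp.
# Model, calibration, and output file names are assumed placeholders.
./llama-imatrix \
  -m openhermes-2.5-mistral-7b.f16.gguf \
  -f calibration.txt \
  -o imatrix.dat \
  -ngl 8 \
  -c 512
```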
The calibration data is just a mix of my personal GPT chats, random words, and random Wikipedia articles, totaling about 15k-ish tokens. Definitely not optimal, but it's in the repo for anyone to tinker with, along with the computed imatrix.
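If you want to redo the quantization with your own calibration mix, applying an imatrix is just an extra flag on llama.cpp's quantize tool. A minimal sketch, with placeholder file names and an arbitrary Q4_K_M target:

```bash
# Sketch only: quantize while weighting by the computed importance matrix.
# File names and the Q4_K_M quant type are assumed, not the exact ones used.
./llama-quantize \
  --imatrix imatrix.dat \
  openhermes-2.5-mistral-7b.f16.gguf \
  openhermes-2.5-mistral-7b.Q4_K_M.gguf \
  Q4_K_M
```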
Will be updated with perplexity testing later, probably? 😭 I haven't done proper tests quite yet; it feels better than the old quants when chatting in Ukrainian, and hopefully I get around to actually benching it somehow.
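When that happens, it'll probably be something along the lines of llama.cpp's perplexity tool on a Ukrainian text file; a sketch with placeholder file names:

```bash
# Sketch only: measure perplexity of a quantized model on a test corpus.
# The model and test file names are assumed placeholders.
./llama-perplexity \
  -m openhermes-2.5-mistral-7b.Q4_K_M.gguf \
  -f ukrainian-test.txt \
  -ngl 8
```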