pszemraj committed
Commit ce70db7 · verified · 1 Parent(s): b5b4be0

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -35,7 +35,10 @@ Below are the weights/file names in this repo:
  | flan-ul2-q4k.gguf | q4k | 10.9 |
  | flan-ul2-q6k.gguf | q6k | 16 |
 
- From initial testing, it appears that q2k is too low precision and produces poor/incoherent output. The `q3k` and higher are coherent.
+ From initial testing:
+
+ - it appears that q2k is too low precision and produces poor/incoherent output. The `q3k` and higher are coherent.
+ - Interestingly, there is no noticeable increase in computation time (_again, on CPU_) when using higher precision quants. I get the same tok/sec for q3k and q6k +/- 0.02
 
  ## setup
 