Update README.md
README.md
CHANGED
@@ -2,4 +2,27 @@
 language:
 - en
 ---
-## To request a quant, open a new discussion in the Community tab (preferably with the URL in the title)
+## To request a quant, open a new discussion in the Community tab (preferably with the URL in the title)
+
+## Mini-FAQ
+
+- What does the "-i1" mean in "-i1-GGUF"?
+
+- Why are you doing this?
+
+- You have amazing hardware!?!?!
+
+- Why don't you use gguf-split?
+
+TL;DR: I don't have the hardware/resources for that.
+
+Long answer: gguf-split requires a full copy for every quant.
+Contrary to what many people think, my hardware is rather outdated and not very fast. The extra processing that gguf-split requires
+either runs out of space on my systems with fast disks, or takes a very long time and a lot of I/O bandwidth on the slower
+disks, all of which already run at their limits. Supporting gguf-split would mean
+
+While this is the blocking reason, I also find it less than ideal that yet another incompatible file format was created that
+requires special tools to manage, instead of supporting the tens of thousands of existing quants, of which the vast majority
+could just be mmapped together into memory from split files. That doesn't keep me from supporting it, but it would have
+been nice to look at the existing reality and/or consult the community before throwing yet another hard-to-support format out
+there without thinking.