**Description:**

- The motivation behind these quantizations was that latestissue's quants were missing the 0.1B and 0.4B models. The rest of the models can be found here: [latestissue/rwkv-4-world-ggml-quantized](https://huggingface.co/latestissue/rwkv-4-world-ggml-quantized)

# RAM USAGE

Model | Starting RAM usage (KoboldCpp)
:--:|:--:
RWKV-4-World-0.1B.q4_0.bin | 289.3 MiB
RWKV-4-World-0.1B.q4_1.bin | 294.7 MiB
RWKV-4-World-0.1B.q5_0.bin | 300.2 MiB
RWKV-4-World-0.1B.q5_1.bin | 305.7 MiB
RWKV-4-World-0.1B.q8_0.bin | 333.1 MiB
RWKV-4-World-0.1B.f16.bin | 415.3 MiB
RWKV-4-World-0.4B.q4_0.bin | 484.1 MiB
RWKV-4-World-0.4B.q4_1.bin | 503.7 MiB
RWKV-4-World-0.4B.q5_0.bin | 523.1 MiB
RWKV-4-World-0.4B.q5_1.bin | 542.7 MiB
RWKV-4-World-0.4B.q8_0.bin | 640.2 MiB
RWKV-4-World-0.4B.f16.bin | 932.7 MiB
RWKV-4-World-1.5B.q4_0.bin | 1.2 GiB
RWKV-4-World-1.5B.q4_1.bin | 1.3 GiB
RWKV-4-World-1.5B.q5_0.bin | 1.4 GiB
RWKV-4-World-1.5B.q5_1.bin | 1.5 GiB
RWKV-4-World-1.5B.q8_0.bin | 1.9 GiB
RWKV-4-World-1.5B.f16.bin | 3.0 GiB
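The spread in the table tracks the bits-per-weight of each ggml quantization format. As a rough sanity check, the sketch below estimates weight-storage size from the standard ggml block layouts (32 weights per block). This covers weights only; the measured RAM figures additionally include the RWKV state, KoboldCpp's runtime buffers, and file headers. The 100M parameter count used here is a placeholder for illustration, not the models' exact size.

```python
# Bytes per 32-weight block for the ggml formats used in these files.
# f16 is unquantized (2 bytes per weight); the quantized formats store
# packed low-bit weights plus per-block fp16 scale (and min, for *_1).
BLOCK = 32
BYTES_PER_BLOCK = {
    "q4_0": 18,  # 32x 4-bit quants + fp16 scale
    "q4_1": 20,  # 32x 4-bit quants + fp16 scale + fp16 min
    "q5_0": 22,  # 32x 5-bit quants + high bits + fp16 scale
    "q5_1": 24,  # 32x 5-bit quants + high bits + fp16 scale + fp16 min
    "q8_0": 34,  # 32x 8-bit quants + fp16 scale
    "f16": 64,   # 32x fp16 weights
}

def estimated_weight_mib(n_params: float, fmt: str) -> float:
    """Weight-only size estimate in MiB for a given parameter count."""
    return n_params / BLOCK * BYTES_PER_BLOCK[fmt] / 2**20

# Placeholder parameter count, roughly the scale of the smallest model.
for fmt in BYTES_PER_BLOCK:
    print(f"{fmt}: ~{estimated_weight_mib(100e6, fmt):.0f} MiB")
```

The ordering this produces (q4_0 < q4_1 < q5_0 < q5_1 < q8_0 < f16) matches the ordering of the measured RAM figures in the table.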
**Notes:**

- rwkv.cpp [[0df970a]](https://github.com/saharNooby/rwkv.cpp/tree/0df970a6adddd4b938795f92e660766d1e2c1c1f) was used for conversion and quantization: the models were first converted to f16 ggml files, then quantized.
- KoboldCpp [[bc841ec]](https://github.com/LostRuins/koboldcpp/tree/bc841ec30232036a1e231c0b057689abc3aa00cf) was used to test the models.
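The two-step pipeline from the notes above (PyTorch checkpoint → f16 ggml → quantized ggml) can be sketched with rwkv.cpp's conversion scripts. This is an illustrative sketch, not the exact commands used: the checkpoint filename is hypothetical, and script arguments may differ slightly at the pinned commit.

```shell
# Get rwkv.cpp, which ships the conversion and quantization scripts.
git clone --recursive https://github.com/saharNooby/rwkv.cpp
cd rwkv.cpp

# 1. Convert the PyTorch checkpoint (.pth) to an f16 ggml file.
python rwkv/convert_pytorch_to_ggml.py \
    RWKV-4-World-0.1B.pth RWKV-4-World-0.1B.f16.bin FP16

# 2. Quantize the f16 file once per target format (Q4_0, Q4_1, Q5_0, Q5_1, Q8_0).
python rwkv/quantize.py \
    RWKV-4-World-0.1B.f16.bin RWKV-4-World-0.1B.q5_1.bin Q5_1
```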

The original models can be found [here](https://huggingface.co/BlinkDL/rwkv-4-world), and the original model card can be found below.