Update README.md
README.md CHANGED
@@ -7,21 +7,22 @@ base_model: CohereForAI/c4ai-command-r-plus
# Command R+ GGUF

## Description
-This repository contains GGUF weights
+This repository contains GGUF weights for `llama.cpp`. Support for them was added in release [`b2636`](https://github.com/ggerganov/llama.cpp/releases/tag/b2636).

-##
-1.
-```
-git clone https://github.com/Carolinabanana/llama.cpp.git llama.cpp-fork
-cd llama.cpp-fork
-git reset --hard 8b6577bd631fec33eeadb4b9dfc5a07ed2118148
-```
-2. Build it using `make`
-3. Use it in the same way as the regular `llama.cpp`. If you're unsure of how to start, you can use the following command as a starting point:
+## Quickstart
+1. Ensure that you have release [`b2636`](https://github.com/ggerganov/llama.cpp/releases/tag/b2636) or newer.
+2. Start with the command below:
```bash
./main -p "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>" --color -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf
```

+## Perplexity on `wikitext-2-raw` [WIP]
+| Test | PPL Value | Standard Deviation |
+|----------|-----------|--------------------|
+| Q2_K | 5.7178 | +/- 0.03418 |
+| Q3_K_L | 4.6214 | +/- 0.02629 |
+| Q4_K_M | 4.4625 | +/- 0.02522 |
+
## Merging Weights
After commit `8a28d12`, weights are split with `gguf-split`, which means that you don't have to merge weights. Simply pass the first split, as in the example above, and `llama.cpp` will automatically load all splits. If, for some reason, you want to merge splits, you can use the following command:
```bash
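# The hunk ends at the opening fence above, so the merge command itself is not
# shown in this diff. A minimal sketch of such an invocation, assuming the
# `gguf-split` tool that is built alongside `llama.cpp` and its `--merge` mode;
# the output path is illustrative, not taken from the README.
# Pass the first split as input; the remaining splits are located automatically.
./gguf-split --merge \
  /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf \
  /path/to/command-r-plus-Q3_K_L-merged.gguf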
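# The perplexity table above is marked [WIP] and the measurement command is not
# part of the diff. A minimal sketch of how such figures are typically obtained,
# assuming the stock `perplexity` example bundled with `llama.cpp` and a local
# copy of the wikitext-2-raw test set; paths and the -ngl value are placeholders.
./perplexity \
  -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf \
  -f /path/to/wikitext-2-raw/wiki.test.raw \
  -ngl 99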