Quantized version of this: https://huggingface.co/ausboss/llama-30b-supercot

GPTQ quantization using https://github.com/0cc4m/GPTQ-for-LLaMa for compatibility with 0cc4m's fork of KoboldAI
Command used to quantize:
```python llama.py c:\llama-30b-supercot c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors 4bit-128g.safetensors```
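The `--wbits 4 --groupsize 128` flags in the command above store weights as 4-bit integers, with each group of 128 weights sharing one scale and zero-point. A minimal NumPy sketch of that group-wise storage scheme, for illustration only — it rounds naively and omits GPTQ's error-compensation solver, and all function names here are hypothetical:

```python
import numpy as np

def quantize_groupwise(w, wbits=4, groupsize=128):
    """Round-to-nearest group-wise quantization (not the full GPTQ solver)."""
    qmax = 2 ** wbits - 1                      # 15 = top code for 4 bits
    groups = w.reshape(-1, groupsize)          # one row per group of 128 weights
    lo = groups.min(axis=1, keepdims=True)
    hi = groups.max(axis=1, keepdims=True)
    scale = (hi - lo) / qmax                   # per-group step size
    zero = np.round(-lo / scale)               # per-group zero-point
    q = np.clip(np.round(groups / scale) + zero, 0, qmax)
    return q.astype(np.uint8), scale, zero

def dequantize(q, scale, zero):
    """Recover approximate float weights from codes + per-group metadata."""
    return (q.astype(np.float32) - zero) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)   # stand-in for one weight row
q, scale, zero = quantize_groupwise(w)
w_hat = dequantize(q, scale, zero).reshape(-1)
print("codes fit in 4 bits:", int(q.max()) <= 15)
print("max abs error:", float(np.abs(w - w_hat).max()))
```

The real checkpoint additionally packs the 4-bit codes into larger integers before writing the `.safetensors` file; the sketch only shows why a groupsize-128 model carries one scale/zero pair per 128 weights as extra metadata.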
Evaluation & Score (Lower is better):
* WikiText2: 4.51
* PTB: 17.46
* C4: 6.37
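The scores above are perplexities on the three held-out datasets: the exponential of the mean per-token negative log-likelihood, so lower means the quantized model assigns higher probability to the text. A toy illustration of the computation (hypothetical names, not the actual evaluation harness):

```python
import numpy as np

def perplexity(logits, targets):
    """exp of the mean negative log-likelihood of the target tokens."""
    shifted = logits - logits.max(axis=-1, keepdims=True)   # stable log-softmax
    logprobs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    nll = -logprobs[np.arange(len(targets)), targets]       # NLL per position
    return float(np.exp(nll.mean()))

# Sanity check: a model that is uniform over a 50-token vocabulary
# has perplexity exactly equal to the vocabulary size.
uniform_logits = np.zeros((8, 50))
targets = np.arange(8)
print(round(perplexity(uniform_logits, targets), 6))   # → 50.0
```

Since every per-token log-probability is at most 0, perplexity is always at least 1; 4.51 on WikiText2 means the model is, on average, about as uncertain as a uniform choice among ~4.5 tokens.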
Non-groupsize version is here: https://huggingface.co/tsumeone/llama-30b-supercot-4bit-cuda