add quality stats

README.md +33 -2

@@ -25,6 +25,38 @@ Quantized with [OmniQuant](https://github.com/OpenGVLab/OmniQuant).
 ## Evaluation
 ## Examples
 ### Imports and Model Loading
@@ -143,8 +175,7 @@ Quantized with [OmniQuant](https://github.com/OpenGVLab/OmniQuant).
 ```python
 model_path = "compressa-ai/Saiga-Llama-3-8B-OmniQuant"
-model = load_model(model_path)
-model.cuda()
 tokenizer = AutoTokenizer.from_pretrained(
     model_path, use_fast=False, trust_remote_code=True
 )

 ## Evaluation
+### Perplexity, PPL (↓)
+|               | wiki  |
+| ------------- | ----- |
+| FP            | 7.862 |
+| **Quantized** | 8.615 |
+### Accuracy on English Benchmarks, % (↑)
+|               | piqa | arc_easy | arc_challenge | boolq | hellaswag | winogrande | mmlu_humanities | mmlu_social_sciences | mmlu_stem |
+| ------------- | ---- | -------- | ------------- | ----- | --------- | ---------- | --------------- | -------------------- | --------- |
+| FP            | 78.5 | 82.2     | 50.4          | 82.7  | 58.1      | 72.4       | 65.5            | 72.6                 | 53.8      |
+| **Quantized** | 78.5 | 80.8     | 47.6          | 81.7  | 56.9      | 71.2       | 62.3            | 68.9                 | 49.7      |
+### Accuracy on Russian Benchmarks, % (↑)
+|               | danetqa | terra | rwsd | muserc | rucos | lidirus | parus | rcb  | russe | rucola |
+| ------------- | ------- | ----- | ---- | ------ | ----- | ------- | ----- | ---- | ----- | ------ |
+| FP            | 74.9    | 52.1  | 51.5 | 55.9   | 58.1  | 59.5    | 69.0  | 34.1 | 38.8  | 67.5   |
+| **Quantized** | 65.4    | 50.5  | 49.5 | 60.7   | 53.7  | 50.9    | 71.0  | 33.6 | 40.8  | 67.5   |
+### Summary
+|               | Average Quality Difference, Eng, % (↑) | Average Quality Difference, Rus, % (↑) | Occupied Memory, % (↓) |
+| ------------- | -------------------------------------- | -------------------------------------- | ---------------------- |
+| FP            | 0                                      | 0                                      | 100                    |
+| **Quantized** | -2.4                                   | -1.8                                   | 35.7                   |
 ## Examples
 ### Imports and Model Loading
 ```python
 model_path = "compressa-ai/Saiga-Llama-3-8B-OmniQuant"
+model = load_model(model_path).cuda()
 tokenizer = AutoTokenizer.from_pretrained(
     model_path, use_fast=False, trust_remote_code=True
 )