Vasily Alexeev committed
Commit c27917d • 1 parent: e1b1ab8

add quality stats

Files changed (1): README.md (+33, −2)
README.md CHANGED

@@ -25,6 +25,38 @@ Quantized with [OmniQuant](https://github.com/OpenGVLab/OmniQuant).
 
 ## Evaluation
 
+### PPL (↓)
+
+|               | wiki  |
+| ------------- | ----- |
+| FP            | 7.862 |
+| **Quantized** | 8.615 |
+
+
+### Accuracy on English Benchmarks, % (↑)
+
+|               | piqa | arc_easy | arc_challenge | boolq | hellaswag | winogrande | mmlu_humanities | mmlu_social_sciences | mmlu_stem |
+| ------------- | ---- | -------- | ------------- | ----- | --------- | ---------- | --------------- | -------------------- | --------- |
+| FP            | 78.5 | 82.2     | 50.4          | 82.7  | 58.1      | 72.4       | 65.5            | 72.6                 | 53.8      |
+| **Quantized** | 78.5 | 80.8     | 47.6          | 81.7  | 56.9      | 71.2       | 62.3            | 68.9                 | 49.7      |
+
+
+### Accuracy on Russian Benchmarks, % (↑)
+
+|               | danetqa | terra | rwsd | muserc | rucos | lidirus | parus | rcb  | russe | rucola |
+| ------------- | ------- | ----- | ---- | ------ | ----- | ------- | ----- | ---- | ----- | ------ |
+| FP            | 74.9    | 52.1  | 51.5 | 55.9   | 58.1  | 59.5    | 69.0  | 34.1 | 38.8  | 67.5   |
+| **Quantized** | 65.4    | 50.5  | 49.5 | 60.7   | 53.7  | 50.9    | 71.0  | 33.6 | 40.8  | 67.5   |
+
+
+### Summary
+
+|               | Average Quality Difference, Eng, % (↑) | Average Quality Difference, Rus, % (↑) | Occupied Memory, % (↓) |
+| ------------- | -------------------------------------- | -------------------------------------- | ---------------------- |
+| FP            | 0                                      | 0                                      | 100                    |
+| **Quantized** | -2.4                                   | -1.8                                   | 35.7                   |
+
+
 ## Examples
 
 ### Imports and Model Loading

@@ -143,8 +175,7 @@ Quantized with [OmniQuant](https://github.com/OpenGVLab/OmniQuant).
 ```python
 model_path = "compressa-ai/Saiga-Llama-3-8B-OmniQuant"
 
-model = load_model(model_path)
-model.cuda()
+model = load_model(model_path).cuda()
 tokenizer = AutoTokenizer.from_pretrained(
     model_path, use_fast=False, trust_remote_code=True
 )
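The Summary rows added by this commit aggregate the per-benchmark deltas from the tables above. As a quick sanity check (a sketch, not necessarily the card's exact script), taking the plain mean of (Quantized − FP) over the rounded table values reproduces the Russian figure (−1.8) and gives roughly −2.1 for English, so the published −2.4 was presumably computed from unrounded results:

```python
# Rounded accuracies copied from the benchmark tables in this commit.
fp_eng = [78.5, 82.2, 50.4, 82.7, 58.1, 72.4, 65.5, 72.6, 53.8]
q_eng  = [78.5, 80.8, 47.6, 81.7, 56.9, 71.2, 62.3, 68.9, 49.7]

fp_rus = [74.9, 52.1, 51.5, 55.9, 58.1, 59.5, 69.0, 34.1, 38.8, 67.5]
q_rus  = [65.4, 50.5, 49.5, 60.7, 53.7, 50.9, 71.0, 33.6, 40.8, 67.5]

def avg_diff(quantized, full_precision):
    """Mean of per-benchmark (quantized - FP) differences, in percentage points."""
    return sum(q - f for q, f in zip(quantized, full_precision)) / len(full_precision)

print(round(avg_diff(q_eng, fp_eng), 1))  # -2.1 from rounded values (card lists -2.4)
print(round(avg_diff(q_rus, fp_rus), 1))  # -1.8, matching the Summary table
```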