This repository contains an xMADified version of [`mistralai/Mistral-Small-Instruct-2409`](https://huggingface.co/mistralai/Mistral-Small-Instruct-2409).
1. **Memory-efficiency:** The full-precision model is around 44 GB, while this xMADified model is only 12 GB, making it feasible to run on a 16 GB GPU.
2. **Accuracy:** This xMADified model preserves the quality of the full-precision model. The table below reports zero-shot accuracy on popular benchmarks for this xMADified model, the [GPTQ](https://github.com/AutoGPTQ/AutoGPTQ)-quantized model, and the full-precision model. The GPTQ model fails on the difficult **MMLU** task, while the xMADified model offers significantly higher accuracy.
| Model | Size | MMLU | Arc Challenge | Arc Easy | LAMBADA | WinoGrande | PIQA |
|---|---|---|---|---|---|---|---|
| mistralai/Mistral-Small-Instruct-2409 | 44.5 GB | 69.48 | 58.79 | 84.72 | 79.06 | 79.08 | 82.43 |
| GPTQ Mistral-Small-Instruct-2409 | 12.2 GB | 49.45 | 56.14 | 80.64 | 75.10 | 77.74 | 77.48 |
| xMADified Mistral-Small-Instruct-2409 (this model) | 12.2 GB | **68.59** | **57.51** | **82.83** | **77.74** | **79.56** | **81.34** |
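The sizes in the table are consistent with 4-bit weight quantization of a roughly 22B-parameter model. A minimal back-of-the-envelope sketch; the ~22.2B parameter count and the ~4.4 effective bits per weight are assumptions for illustration, not figures stated in this card:

```python
# Rough weight-storage estimate for weight-only quantization.
# Assumptions (not from this model card): ~22.2e9 parameters,
# 16-bit (2-byte) weights in the full-precision checkpoint, and
# ~4.4 effective bits per weight after 4-bit quantization
# (group-wise scales and zero-points add overhead beyond the raw 4 bits).

GB = 1e9  # decimal gigabytes, matching the units in the table


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-storage size in GB."""
    return n_params * bits_per_weight / 8 / GB


full = weight_memory_gb(22.2e9, 16)    # close to the 44.5 GB above
quant = weight_memory_gb(22.2e9, 4.4)  # close to the 12.2 GB above

print(f"full-precision ~ {full:.1f} GB, 4-bit ~ {quant:.1f} GB")
```

The extra ~0.4 bits per weight in the quantized estimate stands in for the per-group quantization metadata (scales and zero-points) that 4-bit checkpoints typically store alongside the packed weights.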
# How to Run Model