Oscar Wu committed
Commit 97f1188
Parent(s): 78dc976
Updated README

README.md CHANGED
@@ -14,13 +14,14 @@ This repository contains [`mistralai/Mistral-Small-Instruct-2409`](https://huggi
1. **Memory-efficiency:** The full-precision model is around 44 GB, while this xMADified model is only 12 GB, making it feasible to run on a 16 GB GPU.

2. **Accuracy:** This xMADified model preserves the quality of the full-precision model. The table below presents the zero-shot accuracy of this xMADified model, the [GPTQ](https://github.com/AutoGPTQ/AutoGPTQ)-quantized model, and the full-precision model on popular benchmarks. The GPTQ model fails on the difficult **MMLU** task, while the xMADified model offers significantly higher accuracy.

-| Model | Size | MMLU | Arc Challenge | Arc Easy | LAMBADA | WinoGrande | PIQA |
-|---|---|---|---|---|---|---|---|
-| mistralai/Mistral-Small-Instruct-2409 | 44.5 GB | 69.48 | 58.79 | 84.72 | 79.06 | 79.08 | 82.43 |
-| GPTQ Mistral-Small-Instruct-2409 | 12.2 GB | 49.45 | 56.14 | 80.64 | 75.1 | 77.74 | 77.48 |
-| xMADified Mistral-Small-Instruct-2409 (this model) | 12.2 GB | **68.59** | **57.51** | **82.83** | **77.74** | **79.56** | **81.34** |
-
+| Model | Size | MMLU | Arc Challenge | Arc Easy | LAMBADA | WinoGrande | PIQA |
+| -------------------------------------------------- | ------- | --------- | ------------- | --------- | --------- | ---------- | --------- |
+| xMADified Mistral-Small-Instruct-2409 (this model) | 12.2 GB | **68.59** | **57.51** | **82.83** | **77.74** | **79.56** | **81.34** |
+| mistralai/Mistral-Small-Instruct-2409 | 44.5 GB | 69.48 | 58.79 | 84.72 | 79.06 | 79.08 | 82.43 |
+| GPTQ Mistral-Small-Instruct-2409 | 12.2 GB | 49.45 | 56.14 | 80.64 | 75.1 | 77.74 | 77.48 |
+
+3. **Fine-tuning:** These models can be fine-tuned on the same reduced (12 GB) hardware in just three clicks. Watch our product demo [here](https://www.youtube.com/watch?v=S0wX32kT90s&list=TLGGL9fvmJ-d4xsxODEwMjAyNA).
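The zero-shot numbers above are the kind reported by EleutherAI's lm-evaluation-harness. As a rough guide only (none of this is part of the commit), here is a minimal sketch of such an evaluation via the harness's Python API; `<xmadified-repo-id>` is a placeholder, and mapping the table's LAMBADA column to the `lambada_openai` task is an assumption:

```python
# Hedged sketch: zero-shot evaluation with lm-evaluation-harness
# (pip install lm-eval). "<xmadified-repo-id>" is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=<xmadified-repo-id>,device_map=auto",
    tasks=["mmlu", "arc_challenge", "arc_easy",
           "lambada_openai", "winogrande", "piqa"],
    num_fewshot=0,  # zero-shot, matching the table
)

# Print per-task metrics as reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```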
# How to Run Model
@@ -28,8 +29,7 @@ Loading the model checkpoint of this xMADified model requires less than 12 GiB o
**Package prerequisites**: Run the following commands to install the required packages.
```bash
-pip install
-pip install -q --no-build-isolation auto-gptq
+pip install torch==2.4.0 transformers accelerate optimum && pip install -vvv --no-build-isolation "git+https://github.com/PanQiWei/[email protected]"
```
**Sample Inference Code**
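The sample inference code itself is unchanged by this commit, so the diff elides it. For orientation, a minimal sketch of the usual `transformers` loading path for a GPTQ-style checkpoint; the repo id and prompt are placeholders, not content from the README:

```python
# Hedged sketch: inference with a GPTQ-quantized checkpoint via transformers.
# Assumes the packages above are installed; "<xmadified-repo-id>" is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<xmadified-repo-id>"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# With optimum + auto-gptq installed, the quantized weights load directly;
# device_map="auto" places them on the available GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
print(f"Checkpoint footprint: {model.get_memory_footprint() / 1e9:.1f} GB")  # ~12 GB expected

messages = [{"role": "user", "content": "Explain GPTQ quantization in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Generation needs headroom beyond the checkpoint itself for activations and KV cache, which is consistent with the 16 GB GPU target stated above.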
@@ -70,4 +70,4 @@ Here's a sample output of the model, using the code above:
# Contact Us

-For additional xMADified models, access to fine-tuning, and general questions, please contact us at [email protected] and join our waiting list.
+For additional xMADified models, access to fine-tuning, and general questions, please contact us at [email protected] and join our waiting list.