---
datasets: LeoLM/wikitext-en-de
license: other
license_link: https://llama.meta.com/llama3/license/
---
This is a quantized version of [Llama-3-SauerkrautLM-70b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3-SauerkrautLM-70b-Instruct), produced with GPTQ, the quantization method developed at [IST Austria](https://ist.ac.at/en/research/alistarh-group/), using the following configuration:
- Bits: 4
- Act order: True
- Group size: 128
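
For intuition, a group size of 128 means each run of 128 weights shares one scale, and every weight is stored as a 4-bit integer. Below is a minimal illustrative sketch of that storage scheme only — not GPTQ itself, which additionally uses second-order error correction and (with act order enabled) quantizes weights in order of activation importance:

```python
def quantize_group(weights, bits=4):
    # One shared scale per group; 4 bits give integer codes in [-8, 7]
    # for symmetric quantization.
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale

def dequantize_group(codes, scale):
    # Reconstruct approximate weights from the integer codes.
    return [c * scale for c in codes]

weights = [0.5, -1.2, 0.03, 0.77]
codes, scale = quantize_group(weights)
recon = dequantize_group(codes, scale)
```

The reconstruction error per weight is bounded by half the group's scale, which is why smaller groups (more scales) trade model size for accuracy.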

Access the model:
```
curl http://localhost:8000/v1/completions -H "Content-Type: application/json" -d '{
    "model": "cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ",
    "prompt": "San Francisco is a"
}'
```
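
The same request can be issued from Python using only the standard library; a small sketch (the helper name is ours, and it assumes the completions server above is running on localhost:8000):

```python
import json
import urllib.request

def build_request(model: str, prompt: str,
                  url: str = "http://localhost:8000/v1/completions") -> urllib.request.Request:
    # Same JSON body and Content-Type header as the curl command above.
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(url, data=payload,
                                  headers={"Content-Type": "application/json"})

req = build_request("cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ",
                    "San Francisco is a")
# With the server running: response = urllib.request.urlopen(req).read()
```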

## Evaluations
| __English__ | __[Llama-3-SauerkrautLM-70b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3-SauerkrautLM-70b-Instruct)__ | __[Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b)__ | __[Llama-3-SauerkrautLM-70b-Instruct-GPTQ](https://huggingface.co/cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ)__ |
|:--------------------|:---------|:---------|:---------|
| Avg. | 78.17 | 78.1 | 76.72 |
| ARC | 74.5 | 74.4 | 73.0 |
| Hellaswag | 79.2 | 79.2 | 78.0 |
| MMLU | 80.8 | 80.7 | 79.15 |
| | | | |
| __German__ | __[Llama-3-SauerkrautLM-70b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3-SauerkrautLM-70b-Instruct)__ | __[Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b)__ | __[Llama-3-SauerkrautLM-70b-Instruct-GPTQ](https://huggingface.co/cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ)__ |
| Avg. | 70.83 | 70.47 | 69.13 |
| ARC_de | 66.7 | 66.2 | 65.9 |
| Hellaswag_de | 70.8 | 71.0 | 68.8 |
| MMLU_de | 75.0 | 74.2 | 72.7 |
| | | | |
| __Safety__ | __[Llama-3-SauerkrautLM-70b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3-SauerkrautLM-70b-Instruct)__ | __[Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b)__ | __[Llama-3-SauerkrautLM-70b-Instruct-GPTQ](https://huggingface.co/cortecs/Llama-3-SauerkrautLM-70b-Instruct-GPTQ)__ |
| Avg. | 65.86 | 65.94 | 65.94 |
| RealToxicityPrompts | 97.6 | 97.8 | 98.4 |
| TruthfulQA | 67.07 | 66.92 | 65.56 |
| CrowS | 32.92 | 33.09 | 33.87 |

Take these scores with caution: we did not check for data contamination.
Evaluation was done using [Eval. Harness](https://github.com/EleutherAI/lm-evaluation-harness) with `limit=1000` for large datasets.
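
As a sanity check, each Avg. row is the arithmetic mean of the three task scores in its block, rounded to two decimals. For the English block (values copied from the table above):

```python
def avg(scores):
    # Mean of the per-task scores, rounded as in the table.
    return round(sum(scores) / len(scores), 2)

# ARC, Hellaswag, MMLU for each English column.
english = {
    "Llama-3-SauerkrautLM-70b-Instruct":         [74.5, 79.2, 80.8],
    "Llama-3-SauerkrautLM-70b-Instruct-GPTQ-8b": [74.4, 79.2, 80.7],
    "Llama-3-SauerkrautLM-70b-Instruct-GPTQ":    [73.0, 78.0, 79.15],
}
averages = {name: avg(s) for name, s in english.items()}
```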

## Performance
| | requests/s | tokens/s |
|:--------------|-------------:|-----------:|
| NVIDIA L40Sx2 | 2.19 | 1044.76 |

Performance measured on [cortecs inference](https://cortecs.ai).
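
Dividing the two throughput figures gives the average completion length implied by this benchmark (the exact request mix and sampling settings are not stated here):

```python
requests_per_s = 2.19     # from the performance table above
tokens_per_s = 1044.76
tokens_per_request = tokens_per_s / requests_per_s   # roughly 477 tokens per request
```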