# Llamacpp Quantizations of Meta-Llama-3.1-8B
Using llama.cpp release b3472 for quantization.
Original model: https://huggingface.co/meta-llama/Meta-Llama-3.1-8B
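For context, GGUF quantizations like these are generally produced by converting the original Hugging Face checkpoint to GGUF and then quantizing it with llama.cpp's `llama-quantize` tool. The sketch below is illustrative only (paths and output names are assumptions, not the exact commands used for this repository):

```
# Convert the original HF checkpoint to a BF16 GGUF file
python convert_hf_to_gguf.py ./Meta-Llama-3.1-8B --outtype bf16 \
    --outfile Meta-Llama-3.1-8B-BF16.gguf

# Quantize the BF16 GGUF down to a smaller type, e.g. Q4_K_M
./llama-quantize Meta-Llama-3.1-8B-BF16.gguf Meta-Llama-3.1-8B-Q4_K_M.gguf Q4_K_M
```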
Download a file (not the whole branch) from below:
Filename | Quant type | File Size | Perplexity (wikitext-2-raw-v1, test split) |
---|---|---|---|
Meta-Llama-3.1-8B-BF16.gguf | BF16 | 16.10GB | 6.4006 +/- 0.03938 |
Meta-Llama-3.1-8B-FP16.gguf | FP16 | 16.10GB | 6.4016 +/- 0.03939 |
Meta-Llama-3.1-8B-Q8_0.gguf | Q8_0 | 8.54GB | 6.4070 +/- 0.03941 |
Meta-Llama-3.1-8B-Q6_K.gguf | Q6_K | 6.60GB | 6.4231 +/- 0.03957 |
Meta-Llama-3.1-8B-Q5_K_M.gguf | Q5_K_M | 5.73GB | 6.4623 +/- 0.03987 |
Meta-Llama-3.1-8B-Q5_K_S.gguf | Q5_K_S | 5.60GB | 6.5161 +/- 0.04028 |
Meta-Llama-3.1-8B-Q4_K_M.gguf | Q4_K_M | 4.92GB | 6.5837 +/- 0.04068 |
Meta-Llama-3.1-8B-Q4_K_S.gguf | Q4_K_S | 4.69GB | 6.6751 +/- 0.04125 |
Meta-Llama-3.1-8B-Q3_K_L.gguf | Q3_K_L | 4.32GB | 6.9458 +/- 0.04329 |
Meta-Llama-3.1-8B-Q3_K_M.gguf | Q3_K_M | 4.02GB | 7.0488 +/- 0.04384 |
Meta-Llama-3.1-8B-Q3_K_S.gguf | Q3_K_S | 3.66GB | 7.8823 +/- 0.04920 |
Meta-Llama-3.1-8B-Q2_K.gguf | Q2_K | 3.18GB | 9.7262 +/- 0.06393 |
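The perplexity column can be reproduced with llama.cpp's `llama-perplexity` tool on the wikitext-2-raw-v1 test set. A minimal sketch, assuming the dataset file has been obtained separately (default context settings; not necessarily the exact invocation used for this table):

```
# Perplexity on the wikitext-2-raw-v1 test split
./llama-perplexity -m Meta-Llama-3.1-8B-Q4_K_M.gguf -f wikitext-2-raw/wiki.test.raw
```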
## Benchmark Results
Results were computed on the following benchmarks:
Benchmark | Quant type | Score |
---|---|---|
WinoGrande (0-shot) | Q8_0 | 74.1121 +/- 1.2311 |
WinoGrande (0-shot) | Q4_K_M | 73.1650 +/- 1.2453 |
WinoGrande (0-shot) | Q3_K_M | 72.7703 +/- 1.2511 |
WinoGrande (0-shot) | Q3_K_S | 72.3757 +/- 1.2567 |
WinoGrande (0-shot) | Q2_K | 68.4294 +/- 1.3063 |
HellaSwag (0-shot) | Q8_0 | 79.41645091 |
HellaSwag (0-shot) | Q4_K_M | 79.05795658 |
HellaSwag (0-shot) | Q3_K_M | 79.41645091 |
HellaSwag (0-shot) | Q3_K_S | 76.93686517 |
HellaSwag (0-shot) | Q2_K | 72.16689902 |
MMLU (0-shot) | Q8_0 | 39.4703 +/- 1.2427 |
MMLU (0-shot) | Q4_K_M | 39.5349 +/- 1.2431 |
MMLU (0-shot) | Q3_K_M | 38.8889 +/- 1.2394 |
MMLU (0-shot) | Q3_K_S | 37.2739 +/- 1.2294 |
MMLU (0-shot) | Q2_K | 35.4651 +/- 1.2163 |
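Scores like these are typically obtained with llama.cpp's `llama-perplexity` binary, which implements HellaSwag, WinoGrande, and multiple-choice (MMLU-style) evaluations behind dedicated flags. A hedged sketch follows; the dataset file names are illustrative assumptions, not necessarily the files used for the table above:

```
# HellaSwag (0-shot) accuracy
./llama-perplexity -m Meta-Llama-3.1-8B-Q8_0.gguf --hellaswag -f hellaswag_val_full.txt

# WinoGrande (0-shot) accuracy
./llama-perplexity -m Meta-Llama-3.1-8B-Q8_0.gguf --winogrande -f winogrande-debiased-eval.csv

# MMLU run as a multiple-choice task
./llama-perplexity -m Meta-Llama-3.1-8B-Q8_0.gguf --multiple-choice -f mmlu-validation.bin
```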
## Downloading using huggingface-cli
First, make sure you have huggingface-cli installed:
`pip install -U "huggingface_hub[cli]"`
Then, you can target the specific file you want:
huggingface-cli download fedric95/Meta-Llama-3.1-8B-GGUF --include "Meta-Llama-3.1-8B-Q4_K_M.gguf" --local-dir ./
If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
huggingface-cli download fedric95/Meta-Llama-3.1-8B-GGUF --include "Meta-Llama-3.1-8B-Q8_0.gguf/*" --local-dir Meta-Llama-3.1-8B-Q8_0
You can either specify a new local-dir (Meta-Llama-3.1-8B-Q8_0) or download them all in place (./).
## Reproducibility
https://github.com/ggerganov/llama.cpp/issues/8650#issuecomment-2261497976