hwarnecke committed
Commit c4cf124
1 Parent(s): 2dd11ce

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -4,3 +4,11 @@ license: apache-2.0
  # hwarnecke/SauerkrautLM-Nemo-12b-Instruct-Q6_K-GGUF
  This model was converted to GGUF format from [`VAGOsolutions/Llama-3.1-SauerkrautLM-70b-Instruct`](https://huggingface.co/VAGOsolutions/Llama-3.1-SauerkrautLM-70b-Instruct) using llama.cpp.
  Refer to the [original model card](https://huggingface.co/VAGOsolutions/Llama-3.1-SauerkrautLM-70b-Instruct) for more details on the model.
+
+ Since Hugging Face only supports files up to 50 GB, the Q6_K quant is split into two files.
+ You will probably need to merge them again before you can use the model; llama.cpp ships a tool for this (see the example below).
+ Run
+ ```shell
+ ./llama-gguf-split -h
+ ```
+ to learn more about the tool once you have installed llama.cpp.
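For reference, a merge invocation typically looks like the sketch below. The split part names here are assumptions, so substitute the actual file names from this repo; `--merge` takes the first split file followed by the desired output path.

```shell
# Minimal sketch of merging the split Q6_K parts back into a single GGUF.
# The file names below are placeholders; use the actual part names from this repo.
./llama-gguf-split --merge \
  sauerkrautlm-nemo-12b-instruct-q6_k-00001-of-00002.gguf \
  sauerkrautlm-nemo-12b-instruct-q6_k.gguf
```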