quants upload

Llama-3.1-Minitron-4B-Width-Base Q4_0_4_4 quant

Files changed (3)

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Llama-3.1-Minitron-4B-Width-Base-Q4_0_4_4.gguf filter=lfs diff=lfs merge=lfs -text

Llama-3.1-Minitron-4B-Width-Base-Q4_0_4_4.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:6acecb42163a5ead82ca9494c86e494ed902cfdefe4b0a8b61fce85e7e643782
+size 2648521376
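
The three lines above are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size of the GGUF file, not the file itself. A minimal sketch of checking a downloaded copy against such a pointer, assuming the file is already on local disk; the helper names `parse_lfs_pointer` and `verify_against_pointer` are hypothetical and not part of any Git LFS tooling:

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict.

    A leading '+' (diff marker) on each line is tolerated and stripped.
    """
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def verify_against_pointer(path, pointer):
    """Return True if the file at `path` matches the pointer's size and oid."""
    expected_size = int(pointer["size"])
    algo, _, expected_hex = pointer["oid"].partition(":")

    # Cheap check first: a size mismatch means the hash cannot match.
    if os.path.getsize(path) != expected_size:
        return False

    # Hash in 1 MiB chunks so multi-gigabyte files are not loaded into memory.
    digest = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```

With the pointer text from this diff parsed into a dict, `verify_against_pointer("Llama-3.1-Minitron-4B-Width-Base-Q4_0_4_4.gguf", pointer)` would report whether a local download (the path here is an assumption) is complete and uncorrupted.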

README.md ADDED

+---
+base_model: nvidia/Llama-3.1-Minitron-4B-Width-Base
+license: other
+license_name: nvidia-open-model-license
+license_link: >-
+  https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
+inference: false
+---
+# Llama-3.1-Minitron-4B-Width-Base
+GGUF Q4_0_4_4 quant of https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base