metadata
base_model:
- 152334H/miqu-1-70b-sf
- lizpreciatior/lzlv_70b_fp16_hf
language:
- en
library_name: transformers
quantized_by: mradermacher
tags:
- mergekit
- merge
About
Static quants of https://huggingface.co/wolfram/miquliz-120b-v2.0
Weighted/imatrix quants are available at https://huggingface.co/mradermacher/miquliz-120b-v2.0-i1-GGUF
While other static and imatrix quants are available already, I wanted a wider selection of quants available for this model.
Usage
If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.
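As a rough illustration of what concatenating multi-part files involves, here is a minimal Python sketch. It assumes the parts are plain byte-wise splits of a single GGUF file that only need to be joined in order; the function name `concatenate_parts` and the file names in the comments are hypothetical, so substitute the names of the files you actually downloaded.

```python
import shutil
from pathlib import Path


def concatenate_parts(part_paths, output_path, chunk_size=16 * 1024 * 1024):
    """Join split files byte-for-byte, in the order given, into output_path.

    Assumption: the parts are simple sequential splits of one file, so
    writing them back to back reproduces the original file.
    """
    output_path = Path(output_path)
    with output_path.open("wb") as out:
        for part in part_paths:
            # Stream each part in chunks so large files never sit fully in RAM.
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out, chunk_size)
    return output_path


# Hypothetical usage; the file names below are placeholders:
# concatenate_parts(
#     ["model.Q4_K_S.gguf.part1of2", "model.Q4_K_S.gguf.part2of2"],
#     "model.Q4_K_S.gguf",
# )
```

The order of the parts matters: pass them in ascending part number. Check that the joined file's size equals the sum of the part sizes before deleting the parts.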
Provided Quants
(Sorted by size, not necessarily by quality. IQ quants are often preferable to similarly sized non-IQ quants.)
Link | Type | Size (GB) | Notes |
---|---|---|---|
GGUF | Q2_K | 44.5 | |
PART 1 PART 2 | Q3_K_XS | 49.2 | |
PART 1 PART 2 | Q3_K_S | 52.1 | |
PART 1 PART 2 | Q3_K_M | 58.1 | lower quality |
PART 1 PART 2 | Q3_K_L | 63.3 | |
PART 1 PART 2 | Q4_K_S | 68.6 | fast, medium quality |
PART 1 PART 2 | IQ4_NL | 68.7 | fast, slightly worse than Q4_K_S |
PART 1 PART 2 | Q4_K_M | 72.5 | fast, medium quality |
PART 1 PART 2 | Q5_K_S | 83.1 | |
PART 1 PART 2 | Q5_K_M | 85.3 | |
PART 1 PART 2 PART 3 | Q6_K | 99.0 | very good quality |
PART 1 PART 2 PART 3 | Q8_0 | 128.1 | fast, best quality |
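Since the table is ordered by size, one way to narrow the choice is to filter it by how much space you have. The sketch below copies the types and sizes from the table and returns the largest quant that fits a given budget. The function name `largest_quant_within` and the `headroom_gb` parameter are hypothetical, and the sketch assumes the file size is only a lower bound on what is needed to load the model, so the right headroom for context and runtime overhead is left for you to set.

```python
# (type, size in GB), copied from the table above.
QUANTS = [
    ("Q2_K", 44.5),
    ("Q3_K_XS", 49.2),
    ("Q3_K_S", 52.1),
    ("Q3_K_M", 58.1),
    ("Q3_K_L", 63.3),
    ("Q4_K_S", 68.6),
    ("IQ4_NL", 68.7),
    ("Q4_K_M", 72.5),
    ("Q5_K_S", 83.1),
    ("Q5_K_M", 85.3),
    ("Q6_K", 99.0),
    ("Q8_0", 128.1),
]


def largest_quant_within(budget_gb, headroom_gb=0.0):
    """Return the largest listed (type, size) that fits the budget, or None.

    headroom_gb is a hypothetical allowance for anything beyond the file
    itself; it is an assumption of this sketch, not a measured value.
    """
    fitting = [q for q in QUANTS if q[1] + headroom_gb <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[1])
```

Largest is not always best: as noted above, size order is not quality order, so compare the Notes column for neighbouring entries (for example Q4_K_S and IQ4_NL) before settling on one.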
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):