---
exported_from: sophosympatheia/Midnight-Rose-103B-v2.0.3
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About

Weighted (imatrix) quants of https://huggingface.co/sophosympatheia/Midnight-Rose-103B-v2.0.3

The weights were calculated with an experimental (potentially crappy) iterative method (thus the "i1"), using 270k semi-random English tokens.

<!-- provided-files -->
Static quants are available at https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-GGUF

## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
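
For the multi-part quants listed below, a minimal sketch of the concatenation step in Python might look like the following. It assumes every part has already been downloaded in full and is passed in the correct order; `concat_parts` is a hypothetical helper name used only for illustration, not part of any GGUF tooling.

```python
import shutil
from pathlib import Path


def concat_parts(parts, dest):
    """Join downloaded part files, in the order given, into one file at dest."""
    dest = Path(dest)
    with dest.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in 16 MiB chunks so a large part never sits fully in RAM.
                shutil.copyfileobj(src, out, length=16 * 1024 * 1024)
    return dest
```

For example, `concat_parts(["Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-aa", "Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-ab"], "Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf")` would produce a single `.gguf` file. The parts can be deleted once the joined file loads correctly.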

## Provided Quants

(Sorted by size, not necessarily by quality. IQ quants are often preferable to similar-sized non-IQ quants.)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ1_S.gguf) | i1-IQ1_S | 22.1 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 27.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ2_XS.gguf) | i1-IQ2_XS | 30.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ2_M.gguf) | i1-IQ2_M | 35.1 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q2_K.gguf) | i1-Q2_K | 38.3 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 40.6 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 42.4 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ3_XS.gguf) | i1-IQ3_XS | 42.6 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_S.gguf) | i1-Q3_K_S | 44.9 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ3_S.gguf) | i1-IQ3_S | 45.0 | beats Q3_K* |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_M.gguf.split-ab) | i1-Q3_K_M | 50.0 | IQ3_S probably better |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_L.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_L.gguf.split-ab) | i1-Q3_K_L | 54.5 | IQ3_M probably better |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ4_XS.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ4_XS.gguf.part2of2) | i1-IQ4_XS | 55.5 |  |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_0.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_0.gguf.part2of2) | i1-Q4_0 | 58.8 |  |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_S.gguf.split-ab) | i1-Q4_K_S | 59.0 | optimal size/speed/quality |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-ab) | i1-Q4_K_M | 62.3 | fast, recommended |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q5_K_S.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q5_K_S.gguf.part2of2) | i1-Q5_K_S | 71.4 |  |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q5_K_M.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q5_K_M.gguf.part2of2) | i1-Q5_K_M | 73.3 |  |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 85.1 | practically like static Q6_K |


Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
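
Since the table above is sorted by size, one rough way to shortlist a quant is to take the largest file that fits a given budget. The sketch below does only that: the sizes are copied from the table, `largest_fitting` is a hypothetical helper name, and the budget is compared against file size alone, so it ignores both the Notes column and the extra memory needed for context at runtime.

```python
# (type, size in GB) pairs copied from the "Provided Quants" table.
QUANTS = [
    ("i1-IQ1_S", 22.1), ("i1-IQ2_XXS", 27.7), ("i1-IQ2_XS", 30.8),
    ("i1-IQ2_M", 35.1), ("i1-Q2_K", 38.3), ("i1-IQ3_XXS", 40.6),
    ("i1-Q3_K_XS", 42.4), ("i1-IQ3_XS", 42.6), ("i1-Q3_K_S", 44.9),
    ("i1-IQ3_S", 45.0), ("i1-Q3_K_M", 50.0), ("i1-Q3_K_L", 54.5),
    ("i1-IQ4_XS", 55.5), ("i1-Q4_0", 58.8), ("i1-Q4_K_S", 59.0),
    ("i1-Q4_K_M", 62.3), ("i1-Q5_K_S", 71.4), ("i1-Q5_K_M", 73.3),
    ("i1-Q6_K", 85.1),
]


def largest_fitting(budget_gb, quants=QUANTS):
    """Return the largest (type, size) with size <= budget_gb, or None."""
    fitting = [q for q in quants if q[1] <= budget_gb]
    return max(fitting, key=lambda q: q[1]) if fitting else None
```

Treat the result as a starting point only: the notes in the table (for example, an IQ quant being probably better than a slightly larger non-IQ one) still apply, and leaving headroom below the actual memory limit is advisable.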

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.

<!-- end -->