---
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About

Weighted/imatrix quants of https://huggingface.co/sophosympatheia/Midnight-Rose-103B-v2.0.3

The weights were calculated using an experimental (potentially crappy) iterative method (thus the "i1"), using 270k semi-random English tokens.

<!-- provided-files -->
Static quants are available at https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-GGUF

## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including on how to concatenate multi-part files.
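
For the multi-part files in this repository (`.split-aa`, `.split-ab`), concatenation is a plain byte-for-byte join in suffix order. The following is a minimal sketch in Python, assuming all parts sit in the same directory; the helper name `join_parts` is hypothetical and not part of any tool.

```python
import glob
import shutil
from pathlib import Path


def join_parts(first_part: str) -> Path:
    """Join <name>.split-aa, <name>.split-ab, ... into a single file <name>.

    Hypothetical helper: assumes every part lives next to the first one.
    """
    first = Path(first_part)
    marker = ".split-"
    if marker not in first.name:
        raise ValueError(f"not a split part: {first.name}")

    # Strip the ".split-aa" style suffix to get the final file name.
    base_name = first.name.rsplit(marker, 1)[0]
    target = first.with_name(base_name)

    # Sorting by name puts the parts in order: aa, ab, ac, ...
    pattern = glob.escape(base_name) + marker + "*"
    parts = sorted(first.parent.glob(pattern))

    with target.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                # Copy in chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out, length=16 * 1024 * 1024)
    return target
```

For example, `join_parts("Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-aa")` would write `Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf` next to the parts. The joined file needs as much free disk space again as the parts themselves.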

## Provided Quants

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ1_S.gguf) | i1-IQ1_S | 22.1 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 27.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ2_XS.gguf) | i1-IQ2_XS | 30.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q2_K.gguf) | i1-Q2_K | 38.3 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 40.6 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 42.4 |  |
| [GGUF](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_S.gguf) | i1-Q3_K_S | 44.9 | IQ3_XS probably better |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_M.gguf.split-ab) | i1-Q3_K_M | 50.0 | IQ3_S probably better |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_L.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q3_K_L.gguf.split-ab) | i1-Q3_K_L | 54.5 | IQ3_M probably better |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_S.gguf.split-ab) | i1-Q4_K_S | 59.0 | optimal size/speed/quality |
| [PART 1](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/Midnight-Rose-103B-v2.0.3-i1-GGUF/resolve/main/Midnight-Rose-103B-v2.0.3.i1-Q4_K_M.gguf.split-ab) | i1-Q4_K_M | 62.3 | fast, recommended |
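
As a rough aid for choosing from the table above, here is a small sketch that returns the largest quant whose file size fits a given memory budget. The sizes are copied from the table; the function name `largest_fitting` and the 4 GB default headroom for context and runtime overhead are assumptions for illustration only, not measured values. Since the table is sorted by size rather than quality, the largest fit is only a starting point.

```python
from typing import Optional, Tuple

# (type, file size in GB), copied from the table above.
QUANTS = [
    ("i1-IQ1_S", 22.1),
    ("i1-IQ2_XXS", 27.7),
    ("i1-IQ2_XS", 30.8),
    ("i1-Q2_K", 38.3),
    ("i1-IQ3_XXS", 40.6),
    ("i1-Q3_K_XS", 42.4),
    ("i1-Q3_K_S", 44.9),
    ("i1-Q3_K_M", 50.0),
    ("i1-Q3_K_L", 54.5),
    ("i1-Q4_K_S", 59.0),
    ("i1-Q4_K_M", 62.3),
]


def largest_fitting(budget_gb: float,
                    headroom_gb: float = 4.0) -> Optional[Tuple[str, float]]:
    """Return the largest quant that fits in budget_gb minus headroom_gb.

    headroom_gb is an assumed allowance for context and runtime overhead;
    real requirements depend on context length and the software used.
    """
    usable = budget_gb - headroom_gb
    fitting = [q for q in QUANTS if q[1] <= usable]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[1])
```

With a 48 GB budget and the default headroom, this picks `i1-Q3_K_XS` (42.4 GB), because `i1-Q3_K_S` at 44.9 GB exceeds the 44 GB left after headroom.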


Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.

<!-- end -->