---
base_model: [deepseek-ai/DeepSeek-V2-Chat-0628]
---

#### πŸš€ Custom quantizations of DeepSeek-V2-Chat-0628, currently the #7 model globally on LMSYS Arena Hard, supercharged for CPU inference! πŸ–₯️

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6379683a81c1783a4a2ddba8/rbdug3j6BaeTSmKLDIp39.png)

### 🧠 This IQ4XM version combines GGML type IQ4_XS (4-bit) with Q8_0 for blazing-fast performance and minimal quality loss, leveraging the int8 optimizations available on most newer server CPUs.
### πŸ› οΈ While it required some custom code wizardry, it's fully compatible with standard llama.cpp from GitHub, or just search for "nisten" in LM Studio.

>[!TIP]
>πŸ”₯ The following 4-bit version is my personal go-to, delivering jaw-dropping performance on ARM cores.
>
>πŸ“ No need to concatenate the files - just point llama-cli at the first one and watch the magic happen!
>
>πŸ’» Ready to dive in? Here's your command-line spell for interactive mode (prompt.txt is optional, but recommended for maximum sorcery):
>```bash
>./llama-cli --temp 0.4 -m deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf -c 32000 -co -cnv -i -f prompt.txt
>```
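
What goes in prompt.txt is entirely up to you. As a purely illustrative example, seeding it with a short system prompt works well:

```bash
# Illustrative only - the contents of prompt.txt are an assumption;
# use whatever system prompt suits your use case.
cat > prompt.txt <<'EOF'
You are a helpful assistant. Answer concisely and show your work for code and math.
EOF
```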

```text
deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf
deepseek_0628_cpu_optimized_iq4xm-00002-of-00004.gguf
deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf
deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf
```
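
Prefer an HTTP endpoint? Here's a minimal sketch using llama.cpp's `llama-server`, which also loads the remaining shards automatically when pointed at the first file (tune the context size to your RAM):

```bash
# Serve an OpenAI-compatible API from the same split files.
./llama-server \
  -m deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf \
  -c 32000 --host 0.0.0.0 --port 8080
```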

>[!TIP]
>### πŸš„ Want to download faster than a caffeinated, thirsty llama? Here's how:
>
>🐧 On Linux: `sudo apt install -y aria2`
>
>🍎 On Mac: `brew install aria2`

```bash
# πŸš€ For the BEST PERFORMANCE / SPEED QUANT THAT I PERSONALLY USE
aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00001-of-00004.gguf

aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00002-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00002-of-00004.gguf

aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00003-of-00004.gguf

aria2c -x 8 -o deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek_0628_cpu_optimized_iq4xm-00004-of-00004.gguf
```
```bash
# πŸ‹οΈ For the nearly lossless Q8_0 version
aria2c -x 8 -o deepseek-0628-q8_0-00001-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00001-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00002-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00002-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00003-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00003-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00004-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00004-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00005-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00005-of-00006.gguf

aria2c -x 8 -o deepseek-0628-q8_0-00006-of-00006.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-q8_0-00006-of-00006.gguf
```
```bash
# 🧠 For the full-brain BF16 version
aria2c -x 8 -o deepseek-0628-bf16-00001-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00001-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00002-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00002-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00003-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00003-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00004-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00004-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00005-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00005-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00006-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00006-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00007-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00007-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00008-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00008-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00009-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00009-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00010-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00010-of-00011.gguf

aria2c -x 8 -o deepseek-0628-bf16-00011-of-00011.gguf \
  https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/deepseek-0628-bf16-00011-of-00011.gguf
```
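
If you'd rather not copy-paste a wall of commands, here's a compact loop equivalent for the IQ4XM set, assuming the same file naming (adjust the basename and part count for the Q8_0 or BF16 sets):

```bash
# Download all four IQ4XM parts; printf %05d reproduces the zero-padded index.
for i in 1 2 3 4; do
  f=$(printf 'deepseek_0628_cpu_optimized_iq4xm-%05d-of-00004.gguf' "$i")
  aria2c -x 8 -o "$f" "https://huggingface.co/nisten/deepseek-0628-gguf/resolve/main/$f"
done
```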

πŸ“œ Use of the DeepSeek-V2-Chat-0628 model is subject to the [DeepSeek Model License](https://github.com/deepseek-ai/DeepSeek-V2/blob/main/LICENSE-MODEL). The DeepSeek-V2 series supports commercial use; the license is permissive and only restricts use for military purposes, harming minors, or patent trolling.

### 🌟 Model Information

DeepSeek-V2-Chat-0628 is the latest and greatest in the DeepSeek family. This AI powerhouse has climbed the LMSYS Chatbot Arena Leaderboard faster than a rocket on steroids:

- πŸ† Overall Arena Ranking: #11 global
- πŸ’» Coding Arena Ranking: #3, global
- 🧠 Hard Prompts Arena Ranking: #7 global, better than claude opus even in english only hard-prompts

Want to dive deeper into this model's ocean of awesomeness? Swim over to the [original model card](https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628) and prepare to have your mind blown! 🀯

Now go forth and accelerate πŸš€πŸ’‘

-Nisten