---
base_model: [deepseek-ai/DeepSeek-V2-Chat-0628]
---
#### 🚀 Custom quantizations of DeepSeek-V2-Chat-0628, currently the #7 model globally on the LMSYS Arena Hard leaderboard, supercharged for CPU inference! 🖥️
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6379683a81c1783a4a2ddba8/rbdug3j6BaeTSmKLDIp39.png)
### 🧠 This IQ4XM version combines the GGML IQ4_XS 4-bit quantization type with q8_0 for blazing-fast performance with minimal quality loss, leveraging the int8 optimizations available on most newer server CPUs.
### 🛠️ While it required some custom code wizardry, it's fully compatible with standard llama.cpp from GitHub, or just search for nisten in LM Studio.
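As a minimal sketch of running the quant with a stock llama.cpp build (the `.gguf` filename below is a placeholder; substitute the exact filename from the file list on this page, and note that older llama.cpp builds name the binary `main` rather than `llama-cli`):

```shell
# Interactive chat with the quantized model using the stock llama.cpp CLI.
# NOTE: the .gguf filename is a placeholder -- use the actual file from this repo.
# -c sets the context length; -i enables interactive mode.
./llama-cli -m ./deepseek-v2-chat-0628-iq4xm.gguf -c 4096 -i
```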