Statuo committed on
Commit
46e973e
·
verified ·
1 Parent(s): b3d71a6

Update README.md

Files changed (1)
  1. README.md +20 -2
README.md CHANGED
@@ -6,8 +6,26 @@ library_name: transformers
6
  tags:
7
  - mergekit
8
  - merge
9
-
10
  ---
11
  # merge
12
 
13
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
@@ -41,4 +59,4 @@ models:
41
  merge_method: linear
42
  dtype: float16
43
 
44
- ```
 
6
  tags:
7
  - mergekit
8
  - merge
9
+ license: apache-2.0
10
  ---
11
+
12
+ ![DeepseekerKunou](https://files.catbox.moe/n5ejwr.png)
13
+
14
+ Been a while since I've seen you with a merge of my own, eh? This is released under the Qwen License, which is included in the files of this quant and the main model.
15
+ <br>
16
+ Anyway, with the release of Deepseek R Distill, I gave it a poke and found what I expected: not great for creative writing tasks, but otherwise it seems intelligent. So I decided to take a stab at another merge, much like I did with LemonKunoichiWizard. What you're seeing here is the result of (give or take) four separate merges, and I feel this one is the best of the four. I think it increased the base intelligence of Kunou marginally, which was nice to see. That said, it's a 14b and I could be chugging a placebo, so take that with a grain of salt. Thanks again to Sao10k for the finetune and to the Deepseek team for releasing it under an open license. Hopefully you all enjoy.
17
+ <br>
18
+ I tested using SLERP as well, but the SLERP version was noticeably stupider. The only other thing of note is that it still has that Qwen tendency to occasionally spam the EoS tag in its output.
19
+ <br>
20
+
21
+ <br>
22
+ [This is the EXL2 8bpw version of this model. For the original model, go here](https://huggingface.co/Statuo/Deepseeker-Kunou)
23
+ <br>
24
+ [For the 6bpw version, go here](https://huggingface.co/Statuo/Deepseeker-Kunou-EXL2-6bpw)
25
+ <br>
26
+ [For the 4bpw version, go here](https://huggingface.co/Statuo/Deepseeker-Kunou-EXL2-4bpw)
27
+ <br>
28
+
29
  # merge
30
 
31
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
59
  merge_method: linear
60
  dtype: float16
61
 
62
+ ```
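
The README contrasts `merge_method: linear` with a SLERP attempt. As a rough illustration of how the two differ, here is a minimal sketch on plain Python lists standing in for weight tensors. This is not mergekit's implementation: the function names `linear_merge` and `slerp_merge` and the interpolation factor `t` are hypothetical, and the actual per-model weights used in this merge are not shown in the diff.

```python
import math

def linear_merge(a, b, t):
    """Weighted average of two vectors: (1 - t) * a + t * b, element by element."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def slerp_merge(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two vectors (illustrative only)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        # A zero-length vector has no direction; fall back to linear.
        return linear_merge(a, b, t)
    # Cosine of the angle between the vectors, clamped so acos stays defined.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or anti-parallel) vectors: fall back to linear.
        return linear_merge(a, b, t)
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

if __name__ == "__main__":
    a, b = [1.0, 0.0], [0.0, 1.0]
    print("linear:", linear_merge(a, b, 0.5))
    print("slerp: ", slerp_merge(a, b, 0.5))
```

For two orthogonal unit vectors, the linear midpoint is shorter than either input, while the SLERP midpoint keeps unit length. That difference in how magnitudes are treated is one plausible reason the two methods can produce models that behave differently, though the diff itself gives no measurements.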