MarsupialAI committed 77234f9 (parent: 626925d): Update README.md
# Kitchen Sink 103b
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/QFmPxADHAqMf3Wb_Xt1ry.jpeg)

This model is a rotating-stack merge of three 70b models in a 103b (120 layer) configuration inspired by Venus 103b. The result of this "frankenmerge" is a large model that contains a little bit of everything - including the kitchen sink. RP, chat, storywriting, and instruct are all well supported. It may or may not code well - I lack the expertise to test it in that capacity, but considering the source models, it is unlikely.

Component models for the rotating stack are:
- royallab/Aetheria-L2-70B
- lizpreciatior/lzlv_70b_fp16_hf
- Sao10K/WinterGoddess-1.4x-70B-L2

Components of those models are purported to include: Nous-Hermes-Llama2-70b, Xwin-LM-70B-V0.1, Mythospice-70b, Euryale-1.3-L2-70B, tulu-2-dpo-70b, GOAT-70B-Storytelling, Platypus2-70B-instruct, Lila-70B, SunsetBoulevard, and some private LoRAs.

This model is uncensored and perfectly capable of generating objectionable material. However, it is not an explicitly-NSFW model, and it has never "gone rogue" and tried to insert NSFW content into SFW prompts in my experience.
# Prompt format
Seems to have the strongest affinity for Alpaca prompts, but Vicuna works as well. Considering the variety of components, most formats will probably work to some extent.
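For reference, the standard Alpaca template looks like this, with `{prompt}` standing in for your actual request; Vicuna-style `USER:`/`ASSISTANT:` turns should work similarly:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```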
# WTF is a rotating-stack merge?
Inspired by Undi's experiments with stacked merges, Jeb Carter found that output quality and model initiative could be significantly improved by reversing the model order in the stack and then doing a linear merge between the original and reversed stacks. That is what I did here: I created three passthrough stacked merges using the three source models (rotating the model order in each stack), then did a linear merge of all three stacks. The exact merge configs can be found in the recipe.txt file.
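The recipe.txt file is the authoritative source; purely to illustrate the shape of such a recipe, a rotating stack expressed in mergekit's YAML format might look like the sketch below. The layer ranges, output paths, and equal weights here are assumptions for illustration, not the values from recipe.txt.

```yaml
# Hypothetical passthrough stack A (Aetheria -> lzlv -> WinterGoddess).
# Three 40-layer slices of 80-layer Llama2-70b models give the 120-layer stack;
# the overlap points are illustrative, not the ones actually used for this model.
slices:
  - sources:
      - model: royallab/Aetheria-L2-70B
        layer_range: [0, 40]
  - sources:
      - model: lizpreciatior/lzlv_70b_fp16_hf
        layer_range: [20, 60]
  - sources:
      - model: Sao10K/WinterGoddess-1.4x-70B-L2
        layer_range: [40, 80]
merge_method: passthrough
dtype: float16
```

Stacks B and C would use the same slicing with the model order rotated (lzlv -> WinterGoddess -> Aetheria, and WinterGoddess -> Aetheria -> lzlv). The three 120-layer stacks are then combined with a linear merge, sketched here with assumed equal weights:

```yaml
# Hypothetical equal-weight linear merge of the three rotated stacks.
merge_method: linear
models:
  - model: ./stack-a
    parameters:
      weight: 1.0
  - model: ./stack-b
    parameters:
      weight: 1.0
  - model: ./stack-c
    parameters:
      weight: 1.0
dtype: float16
```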