HiroseKoichi committed 658c387 (parent: 91fb285): Update README.md

Now that I think about it, is this really emergent behavior? It seems pretty obvious in hindsight that a model that's not trying to shove positivity up your ass at every turn would be more willing to generate "offensive" and realistic content.

Note: 2.0 seems to have more repetition than the original. I'll try to fix that in future versions.

# Merging Tips

If I were to write a paper on model merging, it would be called "Model Stock Is All You Need" because it's seriously amazing. I've tried many different merge methods, and I could only get barely passable results after tweaking parameters all day, but Model Stock has consistently produced good models for me. I recently made a discovery that's very obvious in hindsight: model order matters a lot when using Model Stock, and it can make or break a merge. I have found that models at the top of the list integrate more deeply into the merged model, while models at the bottom of the list keep more of their style in the final result. In practice, that means you should put chaotic models and ones that add new capabilities at the top of the list, and the more balanced, coherent ones at the bottom. The config sketch after the next paragraph shows what this ordering looks like.

The secret to absolutely hammering out positivity bias is to use MopeyMule as the base model and put an uncensored model at the top of the list (my favorite is LLAMA-3_8B_Unaligned_Alpha). Of course, if you add models with a strong positivity bias to the merge, it will likely reduce or even nullify the effect.
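
To make the ordering concrete, here is a minimal sketch of a Model Stock config for mergekit that follows both tips above. Only MopeyMule and LLAMA-3_8B_Unaligned_Alpha are mentioned in this README; the Hugging Face repo paths are my best guess, and the placeholder entries are hypothetical stand-ins for whatever else you want to merge, so treat it as a starting point rather than the exact recipe behind this model.

```python
# Sketch of a mergekit Model Stock config following the ordering tips above.
# The repo paths for MopeyMule and Unaligned_Alpha are assumptions, and the
# "your-username/..." entries are hypothetical placeholders.
import subprocess

import yaml  # pip install pyyaml

config = {
    "merge_method": "model_stock",
    # Base model: MopeyMule, to hammer out positivity bias.
    "base_model": "failspy/Llama-3-8B-Instruct-MopeyMule",
    "models": [
        # Top of the list: chaotic / capability-adding models integrate deepest.
        {"model": "SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha"},
        {"model": "your-username/your-chaotic-model"},
        # Bottom of the list: balanced, coherent models keep more of their style.
        {"model": "your-username/your-balanced-model"},
    ],
    "dtype": "bfloat16",
}

with open("merge-config.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# mergekit's YAML entry point; requires mergekit to be installed.
subprocess.run(["mergekit-yaml", "merge-config.yml", "./output-model"], check=True)
```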

# Quantization Formats

**GGUF**
- Static:
  - https://huggingface.co/mradermacher/Llama-3-8B-Stroganoff-2.0-GGUF
- Imatrix:
  - https://huggingface.co/mradermacher/Llama-3-8B-Stroganoff-2.0-i1-GGUF
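
If you'd rather grab one of these quants from a script, here is a minimal sketch using huggingface_hub. The filename is an assumption based on mradermacher's usual naming scheme; check the repo's file list for the exact quant you want.

```python
# Minimal sketch: download a single static GGUF quant.
# The filename is an assumption; verify it against the repo's file list.
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

path = hf_hub_download(
    repo_id="mradermacher/Llama-3-8B-Stroganoff-2.0-GGUF",
    filename="Llama-3-8B-Stroganoff-2.0.Q4_K_M.gguf",
)
print(f"Saved to: {path}")
```
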
# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
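
For reference, a prompt in the llama-3 instruct format linked above looks like the sketch below; the system message is just an example, not something this README prescribes.

```python
# Builds a single-turn prompt in the llama-3 instruct format.
# The example system message is arbitrary, not from this README.
def build_llama3_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_llama3_prompt("You are a helpful assistant.", "Hello!"))
```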