HiroseKoichi committed
Commit 658c387
1 Parent(s): 91fb285

Update README.md

Files changed (1): README.md (+9 -0)

README.md CHANGED
 
  Now that I think about it, is this really emergent behavior? It seems pretty obvious in hindsight that a model that's not trying to shove positivity up your ass at every turn would be more willing to generate "offensive" and realistic content.
 
Note: 2.0 seems to have more repetition than the original. I'll try to fix that in future versions.

# Merging Tips
If I were to write a paper on model merging, it would be called "Model Stock Is All You Need" because it's seriously amazing. I've tried many different merge methods, and I could only get barely passable results after tweaking parameters all day, but Model Stock has consistently produced good models for me. I also recently made a discovery that's obvious in hindsight: model order matters a lot when using Model Stock, and it can make or break a merge. Models at the top of the list integrate more deeply into the final model, while models at the bottom keep more of their style in the result. In practice, that means you should put chaotic models and ones that add new capabilities at the top of the list, and the more balanced, coherent ones at the bottom.
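
For what it's worth, here's a minimal sketch of that ordering as a mergekit YAML config, assuming Model Stock reads the `models` list top to bottom in the same order; every model name below is a hypothetical placeholder, not a tested recipe.

```yaml
# Minimal Model Stock sketch; all model names are hypothetical placeholders.
merge_method: model_stock
base_model: your-org/base-model
models:
  - model: your-org/chaotic-model     # top of the list: integrates most deeply
  - model: your-org/capability-model  # middle: contributes new capabilities
  - model: your-org/coherent-model    # bottom: keeps more of its own style
dtype: bfloat16
```

A config like this would be run with something like `mergekit-yaml config.yml ./output-model`; check the current mergekit docs for Model Stock's exact options.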
 
The secret to absolutely hammering out positivity bias is to use MopeyMule as the base model and put an uncensored model at the top of the list (my favorite is LLAMA-3_8B_Unaligned_Alpha). Of course, adding models with a strong positivity bias to the merge will likely reduce or even nullify the effect.
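
Applied to the sketch above, that recipe would look roughly like this; the repo IDs are my best guesses and should be verified on the Hub, and the bottom slot is again a hypothetical placeholder.

```yaml
# Rough sketch of the anti-positivity recipe; repo IDs are assumptions,
# verify the exact names on the Hugging Face Hub before using.
merge_method: model_stock
base_model: failspy/Llama-3-8B-Instruct-MopeyMule           # MopeyMule as the base
models:
  - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha  # uncensored model at the top
  - model: your-org/balanced-model                          # hypothetical coherent model at the bottom
dtype: bfloat16
```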
 
# Quantization Formats
**GGUF**
- Static: https://huggingface.co/mradermacher/Llama-3-8B-Stroganoff-2.0-GGUF
- Imatrix: https://huggingface.co/mradermacher/Llama-3-8B-Stroganoff-2.0-i1-GGUF

# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)