TheDrummer committed
Commit 7356d0b
1 parent: 6bc9a64

Update README.md

  1. README.md +9 -0
README.md CHANGED
@@ -49,6 +49,15 @@ Dik nodded, but didn't say anything.<br/>
  Moistral has been trained with many long form texts, a nice chunk of which are 8K in length.
  It is capable of going far and long without passing it back to you. This is not your typical chibi RP model.
 
+ ### Parameters
+ If Moistral starts to underperform and spit out incoherent tokens, I've noticed that lowering the sampling parameters makes it coherent again. Here's what worked for me:
+ ```yaml
+ temperature: 0.66
+ repetition_penalty: 1.1
+ top_p: 0.64
+ ```
+ I encourage you to play around with the parameters yourself to see what works for you.
+
  ## What's next?
  Moistral 11B is my first attempt at finetuning a capable model (Sorry, CreamPhi-2).
  It's coherent and creative enough to let me understand the impact of my dataset & training.
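
To get a feel for what the three values added in this commit do, here is a rough, dependency-free Python sketch. It is only an illustration: the function name `filter_distribution`, the toy logits, and the order of operations (repetition penalty, then temperature, then top-p) are assumptions made for this example, not how Moistral or any particular inference backend is confirmed to implement sampling.

```python
import math

# Values copied from the README's suggested settings.
SAMPLING = {"temperature": 0.66, "repetition_penalty": 1.1, "top_p": 0.64}


def filter_distribution(logits, previous_ids, temperature, repetition_penalty, top_p):
    """Hypothetical helper: return {token_id: probability} after all three steps."""
    adjusted = list(logits)
    # Repetition penalty: push down the logits of tokens that were already generated.
    for token_id in set(previous_ids):
        value = adjusted[token_id]
        adjusted[token_id] = value / repetition_penalty if value > 0 else value * repetition_penalty
    # Temperature below 1.0 sharpens the distribution toward the top candidates.
    scaled = [value / temperature for value in adjusted]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    probs = [weight / total for weight in weights]
    # Top-p: keep the smallest set of tokens whose cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break
    # Renormalise so the surviving candidates sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


# Made-up logits for a five-token vocabulary, purely for demonstration.
toy_logits = [2.0, 1.5, 1.0, 0.5, 0.0]
loose = filter_distribution(toy_logits, [], temperature=1.0, repetition_penalty=1.0, top_p=1.0)
tight = filter_distribution(toy_logits, [], **SAMPLING)
print(len(loose), "candidates with neutral settings;", len(tight), "with the README settings")
```

On this toy input, the README settings leave far fewer candidate tokens than the neutral ones, which is one plausible reason lower values pull the output back toward coherence.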