TheDrummer committed 7356d0b (parent: 6bc9a64): Update README.md

README.md
Moistral has been trained on many long-form texts, a nice chunk of which are 8K in length.
It is capable of going far and long without passing it back to you. This is not your typical chibi RP model.

### Parameters
If Moistral starts to underperform and spit out tokens incoherently, I've noticed that lowering the sampling parameters makes it coherent again. Here's what worked for me:
```yaml
temperature: 0.66
repetition_penalty: 1.1
top_p: 0.64
```
I encourage you to play around with the parameters yourself to see what works for you.
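If you're wiring these values up in code instead of a frontend, here's a minimal sketch using Hugging Face `transformers`; the model path is just a placeholder (point it at wherever you keep the weights), and the sampler arguments simply mirror the YAML above.

```python
# Minimal sketch: the sampler values above passed to transformers' generate().
# "path/to/Moistral-11B" is a placeholder, not an official repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/Moistral-11B"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

prompt = "Dik nodded, but didn't say anything."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,            # sampling must be enabled for temperature/top_p to matter
    temperature=0.66,
    top_p=0.64,
    repetition_penalty=1.1,
    max_new_tokens=512,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```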

## What's next?
Moistral 11B is my first attempt at finetuning a capable model (Sorry, CreamPhi-2).
It's coherent and creative enough to let me understand the impact of my dataset & training.