Update README.md
Commit 98dfcdb by brucethemoose
1 parent: 0e28c29
README.md CHANGED
@@ -28,7 +28,7 @@ It might recognize ChatML, or maybe Llama-chat from Airoboros.
 Sometimes the model "spells out" the stop token as `</s>` like Capybara, so you may need to add `</s>` as an additional stopping condition.
 ***
 ## Running
-Being a Yi model, try running a lower temperature with 0.05-0.1 MinP, a little repetition penalty, and no other samplers. Yi tends to run "hot" by default.
+Being a Yi model, try running a lower temperature with 0.05-0.1 MinP, a little repetition penalty, and no other samplers. Yi tends to run "hot" by default, and it really needs MinP to cull the huge vocabulary.
 
 24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2, and performant UIs like [exui](https://github.com/turboderp/exui). I go into more detail in this [post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/)
 
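The added README sentence says MinP is needed "to cull the huge vocabulary". As a rough illustration of what that means, here is a minimal pure-Python sketch of MinP filtering: tokens whose probability is below `min_p` times the top token's probability are dropped before sampling. The function names (`softmax`, `min_p_filter`, `sample_token`), the default values, and the choice to apply temperature before MinP are all assumptions made for this sketch; they are not taken from the README or from any particular backend, and real inference engines may order their samplers differently.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities at the given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.05):
    """Zero out tokens below min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample_token(logits, temperature=0.8, min_p=0.05, rng=random):
    """Sample one token index using temperature followed by MinP.

    The sampler order here is an assumption for illustration only.
    """
    probs = min_p_filter(softmax(logits, temperature), min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With a large vocabulary, the long tail of very unlikely tokens can still carry noticeable total probability mass. Because the MinP cutoff scales with the top token's probability, it removes that tail when the model is confident and keeps more candidates when the distribution is flat.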