starble-dev committed on
Commit
f5108d8
·
verified ·
1 Parent(s): 0a2a391

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -10,7 +10,8 @@ library_name: transformers
10
 
11
  > [!WARNING]
12
  > **Sampling:**<br>
13
- > Mistral-Nemo-12B is very sensitive to the temperature sampler, try values near **0.3** at first or else you will get some weird results. This is mentioned by MistralAI at their [Transformers](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407#transformers) section
 
14
 
15
  **Original Model:**
16
  [BeaverAI/mistral-dory-12b](https://huggingface.co/BeaverAI/mistral-dory-12b)
 
10
 
11
  > [!WARNING]
12
  > **Sampling:**<br>
13
+ > Mistral-Nemo-12B is very sensitive to the temperature sampler, try values near **0.3** at first or else you will get some weird results. This is mentioned by MistralAI at their [Transformers](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407#transformers) section. <br>
14
+ > Flash-Attention also seems to have some weird effects with the model, though this has not been confirmed.
15
 
16
  **Original Model:**
17
  [BeaverAI/mistral-dory-12b](https://huggingface.co/BeaverAI/mistral-dory-12b)
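
To illustrate why the sampling warning in this change recommends starting near **0.3**, here is a minimal sketch of temperature scaling. It assumes the standard temperature-scaled softmax; the logits are made-up illustrative values, and `softmax_with_temperature` is a hypothetical helper, not part of any library. Real backends may apply further sampling steps (top-p, top-k, and so on) on top of this.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for four candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.5, 0.0]

probs_low = softmax_with_temperature(logits, 0.3)      # value suggested in the README
probs_default = softmax_with_temperature(logits, 1.0)  # a common default

print("T=0.3:", [round(p, 3) for p in probs_low])
print("T=1.0:", [round(p, 3) for p in probs_default])
```

At the lower temperature, most of the probability mass moves to the highest-scoring token, so unlikely tokens are sampled far less often. That is one plausible reason a temperature-sensitive model behaves better near 0.3.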