Update README.md
README.md
CHANGED
@@ -10,7 +10,8 @@ library_name: transformers
 
 > [!WARNING]
 > **Sampling:**<br>
-> Mistral-Nemo-12B is very sensitive to the temperature sampler; try values near **0.3** at first, or else you will get some weird results. This is mentioned by MistralAI in their [Transformers](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407#transformers) section
+> Mistral-Nemo-12B is very sensitive to the temperature sampler; try values near **0.3** at first, or else you will get some weird results. This is mentioned by MistralAI in their [Transformers](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407#transformers) section. <br>
+> Flash-Attention seems to have some weird effects with the model as well; however, there is no confirmation of this.
 
 **Original Model:**
 [BeaverAI/mistral-dory-12b](https://huggingface.co/BeaverAI/mistral-dory-12b)
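To illustrate why a temperature near 0.3 changes sampling behaviour so much, here is a minimal sketch of temperature-scaled softmax in plain Python. It is not taken from the README or from MistralAI's code: the function name `softmax_with_temperature` and the logit values are hypothetical, chosen only to show how dividing logits by a smaller temperature concentrates probability on the top token.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for four candidate tokens (illustrative values only).
example_logits = [2.0, 1.0, 0.5, 0.0]

probs_low = softmax_with_temperature(example_logits, 0.3)      # value suggested in the warning
probs_default = softmax_with_temperature(example_logits, 1.0)  # unscaled softmax for comparison

print("temperature=0.3:", [round(p, 3) for p in probs_low])
print("temperature=1.0:", [round(p, 3) for p in probs_default])
```

With the lower temperature, the most likely token receives a larger share of the probability mass and unlikely tokens are sampled less often, which is one plausible reason a low value gives more stable output. In the Transformers library, the corresponding setting is the `temperature` generation parameter, which only takes effect when sampling is enabled with `do_sample=True`.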