Update README.md
README.md
CHANGED
@@ -14,7 +14,8 @@ tags:
 ---
 # MistralThinker Model Card

-Please, read this: https://huggingface.co/Undi95/MistralThinker-v1.1/discussions/1
+Please, read this: https://huggingface.co/Undi95/MistralThinker-v1.1/discussions/1 \
+Prefill required for the Assistant: `<think>\n`

 ## Model Description

@@ -63,7 +64,7 @@ This model is a specialized variant of **Mistral-Small-24B-Base-2501**, adapted

 - **Limitations & Bias:**
 - **Hallucination:** It can generate fictitious information in the thinking process, but still end up with a succesfull reply.
-- **Thinking can be dismissed:** Being a distillation of DeepSeek R1 is essence, this model, even trained on Base, could forget to add `<think
+- **Thinking can be dismissed:** Being a distillation of DeepSeek R1 is essence, this model, even trained on Base, could forget to add `<think>\n` in some scenario.

 ## Ethical Considerations

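The added README line asks for the Assistant turn to be prefilled with `<think>\n`. A minimal sketch of what that could look like on the client side is below. It assumes the `\n` in the README denotes a real newline and that the prompt has already been rendered by whatever chat template you use. The helper names `with_think_prefill` and `restore_think_tag` and the placeholder prompt are hypothetical, not part of the model card or of any library.

```python
# Prefill string quoted in the README change; "\n" is assumed to be a real newline.
THINK_PREFILL = "<think>\n"


def with_think_prefill(rendered_prompt: str) -> str:
    """Append the prefill to a prompt that ends where the Assistant turn begins.

    `rendered_prompt` is assumed to be the full prompt text produced by your
    own chat template (hypothetical here), with nothing generated yet.
    """
    if rendered_prompt.endswith(THINK_PREFILL):
        return rendered_prompt  # already prefilled, avoid doubling the tag
    return rendered_prompt + THINK_PREFILL


def restore_think_tag(completion: str) -> str:
    """Re-attach the opening tag to generated text for downstream parsing.

    With a prefilled prompt, the model's completion starts inside the thinking
    block, so the opening tag is usually missing from the generated text.
    """
    if completion.lstrip().startswith("<think>"):
        return completion
    return THINK_PREFILL + completion


if __name__ == "__main__":
    # Placeholder for a prompt rendered by your chat template (assumption).
    prompt = "<rendered conversation ending at the Assistant turn>"
    print(repr(with_think_prefill(prompt)))
    print(repr(restore_think_tag("some reasoning</think>final answer")))
```

Prefilling on the client side means the opening tag is always present, which sidesteps the "Thinking can be dismissed" limitation listed in the second hunk. Most inference frontends expose an equivalent "start reply with" or assistant-prefix option that can be used instead.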