System Prompt
We have tested the system prompt with temperature of 0.7.
You are a helpful and harmless assistant. You should think step-by-step.
Here are the evaluation results.
Models | AIME24 | MATH500 | GSM8K | GPQA-Diamond | ARC-Challenge | MMLU-Pro | MMLU | LiveCodeBench |
---|---|---|---|---|---|---|---|---|
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 46.67 | 88.20 | - | 57.58 | - | - | - | - |
More evaluation results can be found at https://huggingface.co/FuseAI/FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview
Can you confirm that this model can't even answer this easy question? Or is there something wrong with my config...
"There are three killers in a room. Someone enters the room and kills one of them. Nobody leaves the room. How many killers are left in the room?"
Answer should be 3 obviously.
Also it fails at some of my other test questions, that phi-4 answers.
(using latest llama.cpp build 4518 with built-in template)
Other examples:
Tell me the name of a country whose name ends with 'lia'. Give me the capital city of that country as well.
After thinking for a while it answered: There is no country whose official English name ends with "lia."
So something is not right here...
Can you confirm that this model can't even answer this easy question? Or is there something wrong with my config...
"There are three killers in a room. Someone enters the room and kills one of them. Nobody leaves the room. How many killers are left in the room?"
Answer should be 3 obviously.Also it fails at some of my other test questions, that phi-4 answers.
(using latest llama.cpp build 4518 with built-in template)Other examples:
Tell me the name of a country whose name ends with 'lia'. Give me the capital city of that country as well.
After thinking for a while it answered: There is no country whose official English name ends with "lia."So something is not right here...
There are three killers in a room. Someone enters the room and kills one of them. Nobody leaves the room. How many killers are left in the room?
Okay, so I'm trying to figure out this riddle: There are three killers in a room. Someone enters the room and kills one of them. Nobody leaves the room. How many killers are left in the room?
Hmm, let's break it down step by step. First, the initial setup is three killers in the room. So, at the beginning, we have three individuals who are all killers. Now, someone enters the room. Let me think—this person must be the one who does the killing because the riddle says "someone enters the room and kills one of them." So, this new person is a killer as well, right? Or is there a chance that this person is innocent? The riddle doesn't specify, so maybe I should consider both possibilities.
If the person who enters is innocent, they don't kill anyone, but according to the riddle, they do kill one of them. So, the person entering must be a killer. Therefore, we have four killers in total now, right? But wait, no—let's see. The initial three are killers, and the person entering also kills one, making it four. But wait, the person who enters kills one of the existing killers. So, after the killing, how many killers are left?
Wait, maybe I'm overcomplicating it. Let me think again. Initially, three killers: K1, K2, K3. Someone enters, let's call them K4. K4 is a killer because they killed one of the existing killers. So, K4 is a killer, and now K4 is in the room. But one of the original three killers is killed. So, one of K1, K2, K3 is now dead. Therefore, the number of killers left would be the remaining two original killers plus K4, right?
Wait, no—that doesn't make sense. Because if K4 is a killer, but he's still in the room, so he's alive, correct? The riddle says someone enters and kills one of them, but it doesn't say if the killer leaves. The problem states, "nobody leaves the room." So, the person who entered is still there, and the three original killers minus one who was killed. So, that would be two original killers and the new killer, making it three killers left.
But wait, hold on. If K4 is a killer, and he's still in the room, and he killed one of the original three, that means K4 is alive, and one original killer is dead. So, the number of killers left is the other two original killers plus K4. So, three killers in total.
Wait, but let me think again. If the person who entered is a killer, and they kill one of the existing killers, that person is dead, so they are no longer a killer. The other two original killers are still alive, so they are still killers. Plus, the person who entered is also a killer. So, that's three killers remaining.
But here's another angle: if the person who entered kills one of the existing killers, does that make the person who entered a killer? Or is the person who entered a non-killer? The riddle doesn't specify. So, maybe the person who enters is a non-killer, but they kill someone. But then, how can a non-killer be the one who kills someone? That seems contradictory.
Wait, maybe the riddle is playing with the word "killer." So, initially, there are three killers. Then someone enters, who is not a killer, and kills one of the killers. But that would mean that the non-killer is now a killer because they committed a murder. So, the three original killers minus one (who was killed) plus the new killer (the non-killer who killed someone) would make three killers again.
But this is getting confusing. Let me try to structure it.
Case 1: The person who enters is a killer.
- Original killers: 3
- Person enters and kills one: so one original killer dies, leaving 2
- The new killer is still in the room, so total killers: 2 + 1 = 3
Case 2: The person who enters is not a killer.
- Original killers: 3
- Person enters, kills one: so one original killer dies, leaving 2
- The person who entered is now a killer because they killed someone
- So total killers: 2 + 1 = 3
So regardless of whether the person who entered was originally a killer or not, the result is the same: three killers left in the room.
Wait, but in the first case, the person entering is a killer, so they are already counted as a killer before killing someone. In the second case, the person entering is not a killer but becomes one after killing someone. So, in both cases, the total number of killers remains three.
Therefore, the answer is three.
The answer to the riddle is three killers remaining in the room. Here's the breakdown:
- Initial Setup: There are three killers in the room.
- Someone Enters: A new person enters the room and kills one of the existing killers. This person becomes a killer as well.
- No One Leaves: All individuals remain in the room after the incident.
- Counting Killers:
- Original killers: 3
- One original killer is killed, leaving 2.
- The new person becomes a killer, adding 1.
- Total Killers: 2 (remaining original) + 1 (new killer) = 3.
Thus, there are three killers left in the room. [end of text]
Can you confirm that this model can't even answer this easy question? Or is there something wrong with my config...
"There are three killers in a room. Someone enters the room and kills one of them. Nobody leaves the room. How many killers are left in the room?"
Answer should be 3 obviously.Also it fails at some of my other test questions, that phi-4 answers.
(using latest llama.cpp build 4518 with built-in template)Other examples:
Tell me the name of a country whose name ends with 'lia'. Give me the capital city of that country as well.
After thinking for a while it answered: There is no country whose official English name ends with "lia."So something is not right here...
As for the conutry name, it failed.
I'm using Q4_K_M, so it might be better with the bf16 model.
We have tested the system prompt with temperature of 0.7.
You are a helpful and harmless assistant. You should think step-by-step.
Here are the evaluation results.
Models AIME24 MATH500 GSM8K GPQA-Diamond ARC-Challenge MMLU-Pro MMLU LiveCodeBench deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 46.67 88.20 - 57.58 - - - - More evaluation results can be found at https://huggingface.co/FuseAI/FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview
The evaluation code is modified from SkyThought. In our evaluation, we set the temperature to 0.7 and the max_tokens to 32768. We provide the example to reproduce our results in evaluation.
The system prompt for evaluation is set to:
You are a helpful and harmless assistant. You should think step-by-step.
We are currently attempting to reproduce the results reported in the DeepSeek-R1 paper by experimenting with different system prompts. We will update our findings once we have acquired the original system prompt used in their study.
The updated evaluation results are presented here:
Models | AIME24 | MATH500 | GSM8K | GPQA-Diamond | ARC-Challenge | MMLU-Pro | MMLU | LiveCodeBench |
---|---|---|---|---|---|---|---|---|
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 46.67 | 88.20 | 93.71 | 57.58 | 95.90 | 68.70 | 82.17 | 59.69 |
More evaluation results can be found at https://huggingface.co/FuseAI/FuseO1-DeekSeekR1-QwQ-SkyT1-32B-Preview
After first try it answered wrong "2 killers", tried again, "3 killers". But still i don't see how this is better than phi-4, which doesn't need to think. Maybe there are specific questions where Phi-4 might fail but DeepSeek won’t—I’m not sure. If you got some of those, let me know :)