Pinkstack committed on
Commit
44825f1
·
verified ·
1 Parent(s): 7546939

Update README.md

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -133,8 +133,11 @@ Please check the examples we provided: https://huggingface.co/Pinkstack/SuperTho
133
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/QDHJhI0EVT_L9AHY_g3Br.png)
134
  Beats qwen/qwq at MATH & MuSR (MuSR being a reasoning benchmark)
135
  Evaluation:
136
- ![eval](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/Dk-HD4wrS54r0lYlOF1Bz.png)
137
- Please note, the low IFEVAL results is probably due to it always reasoning, it does have issues with instruction following.
 
 
 
138
 
139
  Unlike previous models we've uploaded, this one is the best one we've published! Answers in two steps: Reasoning -> Final answer like o1 mini and other similar reasoning ai models.
140
  # 🧀 Which quant is right for you? (all tested!)
@@ -145,6 +148,7 @@ Unlike previous models we've uploaded, this one is the best one we've published!
145
  # [Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
146
  Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Pinkstack__SuperThoughts-CoT-14B-16k-o1-QwQ-details)!
147
  Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Pinkstack%2FSuperThoughts-CoT-14B-16k-o1-QwQ&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
 
148
 
149
  | Metric |Value (%)|
150
  |-------------------|--------:|
 
133
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/QDHJhI0EVT_L9AHY_g3Br.png)
134
  Beats qwen/qwq at MATH & MuSR (MuSR being a reasoning benchmark)
135
  Evaluation:
136
+
137
+
138
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/csbdGKzGcDVMPRqMCoH8D.png)
139
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/HR9WtjBhE4h6wrq88FLAf.png)
140
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/GLt4ct4yAVMvYEpoYO5o6.png)
141
 
142
  Unlike previous models we've uploaded, this one is the best one we've published! Answers in two steps: Reasoning -> Final answer like o1 mini and other similar reasoning ai models.
143
  # 🧀 Which quant is right for you? (all tested!)
 
148
  # [Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
149
  Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Pinkstack__SuperThoughts-CoT-14B-16k-o1-QwQ-details)!
150
  Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Pinkstack%2FSuperThoughts-CoT-14B-16k-o1-QwQ&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
151
+ Please note, the low IFEVAL results is probably due to it always reasoning, it does have issues with instruction following.
152
 
153
  | Metric |Value (%)|
154
  |-------------------|--------:|