Undi95 
posted an update 4 days ago
Hi there!

If you want to create your own thinking model or build a better MistralThinker, I just made my entire dataset (generated with DeepSeek R1) and the axolotl config public.

Axolotl config: Undi95/MistralThinker-v1.1

Dataset: Undi95/R1-RP-ShareGPT3

You can also read everything I did in these two Discord screenshots from two days ago; I'm a little too lazy to rewrite it all, kek.

Hope you will use them!

What do you think about building part of the dataset from replies that come after some context?

E.g. 50% of the data has thinking starting from the first reply, and the rest of the dataset looks like this:

User,
Bot (no thinking),
User,
Bot (no thinking),
User (repeated N times),
Then ask R1 to think at that point and train on it, so the model understands long context better.
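
To make the proposed mix concrete, here is a minimal sketch in Python. It assumes the common ShareGPT layout (a "conversations" list of turns with "from" and "value" keys, "human"/"gpt" roles) and that the reasoning is wrapped in <think>...</think> inside each bot reply; the actual schema of the dataset may differ. The function names and the 50% ratio are only illustrative, not something taken from the published config.

```python
import random
import re

# Assumption: reasoning is stored inline as <think>...</think> in the bot reply.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text):
    """Remove any <think>...</think> block from a reply."""
    return THINK_RE.sub("", text)


def keep_thinking_on_last_reply_only(conversation):
    """Return a copy where only the final bot reply keeps its thinking.

    Earlier bot turns become plain context (no thinking), as in the
    User / Bot (no thinking) / ... / User / Bot (thinking) layout.
    """
    turns = conversation["conversations"]
    bot_indices = [i for i, t in enumerate(turns) if t["from"] == "gpt"]
    if not bot_indices:
        return conversation
    last = bot_indices[-1]
    new_turns = []
    for i, turn in enumerate(turns):
        if turn["from"] == "gpt" and i != last:
            # Copy the turn so the input conversation is not mutated.
            turn = {**turn, "value": strip_thinking(turn["value"])}
        new_turns.append(turn)
    return {**conversation, "conversations": new_turns}


def mix_dataset(dataset, full_thinking_ratio=0.5, seed=0):
    """Keep thinking on every reply for roughly `full_thinking_ratio` of the
    conversations, and only on the last reply for the others."""
    rng = random.Random(seed)
    mixed = []
    for conversation in dataset:
        if rng.random() < full_thinking_ratio:
            mixed.append(conversation)
        else:
            mixed.append(keep_thinking_on_last_reply_only(conversation))
    return mixed
```

This only reshapes existing R1 outputs; asking R1 to produce fresh thinking at turn N, as suggested above, would need a separate generation step.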


You could do that, but in that case the bot will not use <think> reliably, because it was not trained to do it on every reply.

What I would ideally want is a model that applies the thinking by itself, without a system prompt or prefilling.

Which Discord is this? I would like to join.