@Undi95 on Hugging Face: "Hi there! If you want to create your own thinking model or do a better…"

Hugging Face

Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Back to feed

Undi95

posted an update 4 days ago

Post

4165

Hi there!

If you want to create your own thinking model or do a better MistralThinker, I just uploaded my entire dataset made on Deepseek R1 and the axolotl config. (well I made them public)

Axolotl config : Undi95/MistralThinker-v1.1

The dataset : Undi95/R1-RP-ShareGPT3

You can also read all I did on those two discord screenshot from two days ago, I'm a little lazy to rewrite all kek.

Hope you will use them!

Ainonake

4 days ago

•

edited 4 days ago

What do you think about doing part of the dataset with replies from some context?

E.g. we have e.g. 50% of data with thinking from first user answer, and some parts of dataset with

User,
Bot (no thinking),
User
Bot (no thinking),
User, N times,
Then ask R1 to think here and train on it. So the model will understand long context better.

Undi95

4 days ago

•

edited 4 days ago

You could do that but in that case the bot will not use <think>because it's not trained on all of the reply to do it.

What I would ideally want is a model that apply the thinking itself without system prompt or prefilling

p-e-r-e-g-r-i-n-e

2 days ago

Which Discord is this? I would like to join.

In this post