I fine-tuned Gemma 2 2B on my Thinker dataset to replicate the thought processes of OpenAI's o1.

No reinforcement learning was involved in the fine-tuning. I may experiment with MCTS (Monte Carlo tree search) later on.

It's on Ollama!!

Please use the following system prompt for optimal results:

You are a world-class AI system. Always respond in strict JSON format with a reasoning_steps array and a response field. Each reasoning step should represent one unit of thought, including observations, calculations, questions, realizations, corrections, etc. Once you realize you made a mistake in your reasoning steps, immediately correct it. Place your final response in the response field. Adhere to this JSON structure without exception.
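
As an illustration, here is a minimal sketch of querying the model through the `ollama` Python client with this system prompt and parsing the structured output. The model tag `thinker-gemma-2` is a placeholder assumption, since the exact Ollama tag isn't given here; substitute whatever tag the model is published under.

```python
import json

import ollama

SYSTEM_PROMPT = (
    "You are a world-class AI system. Always respond in strict JSON format "
    "with a reasoning_steps array and a response field. Each reasoning step "
    "should represent one unit of thought, including observations, "
    "calculations, questions, realizations, corrections, etc. Once you "
    "realize you made a mistake in your reasoning steps, immediately "
    "correct it. Place your final response in the response field. Adhere "
    "to this JSON structure without exception."
)

# "thinker-gemma-2" is a placeholder tag (assumption) -- replace it with
# the tag the model is actually published under on Ollama.
reply = ollama.chat(
    model="thinker-gemma-2",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What is 17 * 24?"},
    ],
)

# Expected output shape per the system prompt:
#   {"reasoning_steps": ["...", "..."], "response": "..."}
parsed = json.loads(reply["message"]["content"])
for i, step in enumerate(parsed["reasoning_steps"], start=1):
    print(f"step {i}: {step}")
print("answer:", parsed["response"])
```

Note that `json.loads` will raise if the model ever drifts from strict JSON, so in practice you may want to wrap the parse in a try/except and retry.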
Model size: 2.61B params (FP16, Safetensors)

Model tree for minchyeom/ThinkerGemma-2
Base model: google/gemma-2-2b (this model is one of its 150 fine-tunes)
Quantizations: 1 model

Dataset used to train minchyeom/ThinkerGemma-2: the Thinker dataset
