
qwen2.5-3b-or1-tensopolis
This model is a reasoning fine-tune of unsloth/Qwen2.5-3B-Instruct. Trained in 1xA100 for about 50 hours. Please refer to the base model and dataset for more information about license, prompt format, etc.
Base model: Qwen/Qwen2.5-3B-Instruct
Dataset: open-r1/OpenR1-Math-220k
This mistral model was trained 2x faster with Unsloth and Huggingface's TRL library.
- Downloads last month
- 34
Inference Providers
NEW
This model is not currently available via any of the supported Inference Providers.