qwen2.5_3B_grpo_v1 / model-00002-of-00002.safetensors

Commit History

Trained with Unsloth
22f7dfd
verified

alan918727 committed on