yfliao/Qwen-2.5-1.5B-Simple-RL

yfliao committed 13 days ago

Commit 5be35f9 · verified · 1 Parent(s): 8b2cf35

Training in progress, epoch 1

Files changed (3)

README.md +4 -5
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -1,17 +1,16 @@
 ---
 base_model: Qwen/Qwen2.5-Math-1.5B
 library_name: transformers
-model_name: Qwen-2.5-1.5B-Simple-RL
 tags:
 - generated_from_trainer
-- trl
-- grpo
 licence: license
 ---
-# Model Card for Qwen-2.5-1.5B-Simple-RL
-This model is a fine-tuned version of [Qwen/Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start

 ---
 base_model: Qwen/Qwen2.5-Math-1.5B
+datasets: DigitalLearningGmbH/MATH-lighteval
 library_name: transformers
 tags:
 - generated_from_trainer
+- open-r1
 licence: license
 ---
+# Model Card for None
+This model is a fine-tuned version of [Qwen/Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B) on the [DigitalLearningGmbH/MATH-lighteval](https://huggingface.co/datasets/DigitalLearningGmbH/MATH-lighteval) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a552b7c22d1518545d432ceb5309a10041adbfb98eab6609ac68f06cb039f854
 size 3554214752

 version https://git-lfs.github.com/spec/v1
+oid sha256:4a01c896242d61fe45bb1afbbff9740bb7e99465651b79fd084b3d7397172f41
 size 3554214752

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7aff461dd40134c7f432c4d6150d3e85c0aa6bb7bf3bf80175b7df8d9c46b7ba
 size 7672

 version https://git-lfs.github.com/spec/v1
+oid sha256:bf2c0994671a500a1bc1dd8bf35aade41f6ddaa6555273ddd8276bd553d63963
 size 7672
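
The `model.safetensors` and `training_args.bin` diffs above change only Git LFS pointer files: the `oid sha256:` line differs while `size` stays the same. As a rough sketch of how such a pointer could be checked against a downloaded file, the snippet below parses the three pointer fields and compares them with raw bytes. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library; the pointer text is copied from the new `training_args.bin` side of the diff.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer text for the new training_args.bin, as shown in the diff.
NEW_TRAINING_ARGS_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bf2c0994671a500a1bc1dd8bf35aade41f6ddaa6555273ddd8276bd553d63963
size 7672
"""

pointer = parse_lfs_pointer(NEW_TRAINING_ARGS_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the file, `matches_pointer(open("training_args.bin", "rb").read(), pointer)` would perform the comparison. For the 3554214752-byte `model.safetensors`, reading the whole file into memory is impractical, so hashing in chunks with `hashlib.sha256().update(...)` would be the more sensible variant.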