riddickz/Qwen2.5-1.5B-Open-R1-GRPO
Text Generation
Transformers
Safetensors
qwen2
Generated from Trainer
trl
grpo
conversational
text-generation-inference
Inference Endpoints
arxiv:2402.03300
Qwen2.5-1.5B-Open-R1-GRPO/train_results.json (branch: main)
Commit History
e2e5b66 (verified): Model save, committed by riddickz about 23 hours ago
d7e6c6f (verified): Model save, committed by riddickz about 23 hours ago
47028fb (verified): Model save, committed by riddickz 1 day ago
6ddfb3e (verified): Model save, committed by riddickz 2 days ago