wang
wangxbx
AI & ML interests
None yet
Recent Activity
upvoted a paper about 15 hours ago
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
upvoted a paper about 15 hours ago
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
upvoted a paper about 15 hours ago
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Organizations
None yet
Models
None public yet
Datasets
None public yet