siyeng feng

siyengfeng

AI & ML interests

None yet

Organizations

None yet

siyengfeng's activity

upvoted 3 papers about 6 hours ago

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Paper • 2411.04983 • Published Nov 7, 2024 • 6

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published 3 days ago • 23

s1: Simple test-time scaling

Paper • 2501.19393 • Published 3 days ago • 44

upvoted 4 papers 3 days ago

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Paper • 2501.18511 • Published 4 days ago • 15

CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation

Paper • 2501.16609 • Published 7 days ago • 5

PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding

Paper • 2501.16411 • Published 7 days ago • 17

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 4 days ago • 39

upvoted a paper 4 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 6 days ago • 88

upvoted a paper 6 days ago

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published 7 days ago • 23

upvoted 3 papers 11 days ago

Autonomy-of-Experts Models

Paper • 2501.13074 • Published 12 days ago • 40

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 13 days ago • 83

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 12 days ago • 284

upvoted 2 papers 12 days ago

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published 13 days ago • 39

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published 15 days ago • 31

upvoted an article 12 days ago

Process Reinforcement through Implicit Rewards

Article • Jan 3 • 20

upvoted 2 papers 12 days ago

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Paper • 2501.10893 • Published 16 days ago • 23

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 14 days ago • 89

upvoted 3 papers 14 days ago

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 33

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 18 days ago • 104

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 79