Motoki Wu

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

liked a model about 15 hours ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

upvoted a paper about 19 hours ago

Expect the Unexpected: FailSafe Long Context QA for Finance

upvoted a paper 1 day ago

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

tokestermw's activity

upvoted a paper about 19 hours ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published 15 days ago • 124

upvoted 2 papers 1 day ago

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Paper • 2502.15027 • Published 5 days ago • 6

SIFT: Grounding LLM Reasoning in Contexts via Stickers

Paper • 2502.14922 • Published 6 days ago • 28

upvoted a collection 1 day ago

Sky-T1-7B

Collection

A series of 7B models trained with different recipes and the corresponding training data. • 8 items • Updated 12 days ago • 5

upvoted a collection 6 days ago

Process Reward Models

Collection

Model and Datasets for Qwen 2.5 Math PRM 7B • 6 items • Updated 7 days ago • 1

upvoted a paper 8 days ago

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published 11 days ago • 30

upvoted a paper 12 days ago

Distillation Scaling Laws

Paper • 2502.08606 • Published 13 days ago • 45

upvoted 2 papers 15 days ago

Agency Is Frame-Dependent

Paper • 2502.04403 • Published 19 days ago • 21

ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning

Paper • 2502.04689 • Published 19 days ago • 7

upvoted an article 15 days ago

Open R1: Update #2

Article • and 6 others • 15 days ago • 185

upvoted a paper 15 days ago

Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models

Paper • 2502.04404 • Published 19 days ago • 22

upvoted 2 papers 18 days ago

Scaling Embedding Layers in Language Models

Paper • 2502.01637 • Published 22 days ago • 22

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Paper • 2502.04128 • Published 19 days ago • 23

upvoted an article 19 days ago

Open-R1: Update #1

Article • and 7 others • 24 days ago • 287

upvoted a paper 20 days ago

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

Paper • 2502.01142 • Published 23 days ago • 23

upvoted an article 20 days ago

DABStep: Data Agent Benchmark for Multi-step Reasoning

Article • 22 days ago • 50

upvoted 2 papers 25 days ago

GuardReasoner: Towards Reasoning-based LLM Safeguards

Paper • 2501.18492 • Published 26 days ago • 81

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 26 days ago • 55

upvoted an article 26 days ago

How to deploy and fine-tune DeepSeek models on AWS

Article • 27 days ago • 46

upvoted a paper 27 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 28 days ago • 107