Kai Zuberbühler

kaizuberbuehler

k-zubi

AI & ML interests

language models, agents, image generation, music generation

Recent Activity

liked a model about 19 hours ago

adlb/Audialab_EDM_Elements

updated a collection 3 days ago

Leaderboards

liked a Space 3 days ago

ServiceNow/browsergym-leaderboard

kaizuberbuehler's activity

upvoted 2 papers 3 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 5 days ago • 58

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 5 days ago • 204

upvoted a paper 4 days ago

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Paper • 2501.12326 • Published 6 days ago • 45

upvoted 3 papers 5 days ago

PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published 10 days ago • 37

From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning

Paper • 2411.03817 • Published Nov 6, 2024 • 1

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Paper • 2406.11896 • Published Jun 14, 2024 • 20

upvoted a collection 6 days ago

DeepSeek-R1

Collection

8 items • Updated 6 days ago • 165

upvoted 2 papers 8 days ago

FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published 10 days ago • 22

Do generative video models learn physical principles from watching videos?

Paper • 2501.09038 • Published 12 days ago • 30

upvoted 5 papers 9 days ago

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published 11 days ago • 35

upvoted a paper 10 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 10 days ago • 65

upvoted 5 papers 11 days ago

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Paper • 2501.05040 • Published 18 days ago • 15

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

Paper • 2501.04003 • Published 19 days ago • 24

Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published 18 days ago • 49

Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains

Paper • 2501.05707 • Published 17 days ago • 19

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published 17 days ago • 38