Daniel Huynh PRO

dhuynh95

Recent Activity

updated a collection about 15 hours ago

cool-papers

upvoted a paper about 15 hours ago

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

dhuynh95's activity

upvoted 2 papers about 15 hours ago

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

Paper • 2502.14302 • Published 6 days ago • 8

VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Paper • 2502.12084 • Published 8 days ago • 27

upvoted 10 papers 4 days ago

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Paper • 2502.08235 • Published 13 days ago • 53

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 5 days ago • 161

Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking

Paper • 2502.09083 • Published 13 days ago • 4

Intuitive physics understanding emerges from self-supervised pretraining on natural videos

Paper • 2502.11831 • Published 8 days ago • 13

Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation

Paper • 2502.08826 • Published 13 days ago • 16

How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

Paper • 2502.11196 • Published 9 days ago • 21

Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?

Paper • 2502.12215 • Published 9 days ago • 15

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 8 days ago • 26

Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering

Paper • 2502.13962 • Published 6 days ago • 27

From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

Paper • 2502.14802 • Published 5 days ago • 10

upvoted 3 papers 10 days ago

upvoted 5 papers 11 days ago

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 14 days ago • 28

Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon

Paper • 2502.07445 • Published 14 days ago • 11

Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models

Paper • 2502.06755 • Published 15 days ago • 7

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 15 days ago • 136

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published 19 days ago • 30