Sun Donghae

NLPBada

https://blog.naver.com/gypsi12

DonghaeSuh

AI & ML interests

NLP

Organizations

None yet

NLPBada's activity

upvoted a paper 12 days ago

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 15 days ago • 103

upvoted a paper 17 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 17 days ago • 271

upvoted a paper 23 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 24 days ago • 249

upvoted 4 papers about 1 month ago

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 73

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 48

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Paper • 2412.15204 • Published Dec 19, 2024 • 33

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 344

updated a dataset 2 months ago

NLPBada/RAG_finance_QA_dataset

Viewer • Updated Nov 20, 2024 • 250 • 47

upvoted a paper 3 months ago

Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12, 2024 • 63

updated a model 3 months ago

NLPBada/base_LoRA_512_512_bs6_step200

Text-to-Image • Updated Nov 13, 2024 • 6

upvoted 3 papers 3 months ago

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published Nov 5, 2024 • 66

CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Paper • 2410.23090 • Published Oct 30, 2024 • 54

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 89

upvoted 2 papers 4 months ago

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

Paper • 2410.12381 • Published Oct 16, 2024 • 43

WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents

Paper • 2410.07484 • Published Oct 9, 2024 • 48

liked a model 4 months ago

openai/whisper-large-v3-turbo

Automatic Speech Recognition • Updated Oct 4, 2024 • 2.89M • 1.87k

upvoted 4 papers 4 months ago

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 145

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 94

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42

Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published Sep 20, 2024 • 68