Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs • arXiv:2503.01743 • Published Mar 2025
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? • arXiv:2502.14502 • Published Feb 2025
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • arXiv:2502.11089 • Published Feb 2025
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • arXiv:2502.08910 • Published Feb 2025
TransMLA: Multi-head Latent Attention Is All You Need • arXiv:2502.07864 • Published Feb 2025
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • arXiv:2502.06703 • Published Feb 2025
The Differences Between Direct Alignment Algorithms are a Blur • arXiv:2502.01237 • Published Feb 3, 2025
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • arXiv:2502.02737 • Published Feb 2025
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models • arXiv:2502.01061 • Published Feb 3, 2025