-
Instruction Following without Instruction Tuning
Paper • 2409.14254 • Published • 29
Baichuan Alignment Technical Report
Paper • 2410.14940 • Published • 50
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
Paper • 2410.16256 • Published • 60
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Paper • 2410.18558 • Published • 19
Collections including paper arxiv:2502.18600
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 26
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 43
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 22
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 147
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 13
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 55
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
-
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Paper • 2503.01743 • Published • 58
Phi-4 Technical Report
Paper • 2412.08905 • Published • 110
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 55
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
Paper • 2503.01688 • Published • 18
-
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Paper • 2502.19361 • Published • 24
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
Paper • 2502.17407 • Published • 24
Small Models Struggle to Learn from Strong Reasoners
Paper • 2502.12143 • Published • 28
Language Models can Self-Improve at State-Value Estimation for Better Search
Paper • 2503.02878 • Published • 7
-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 106
Reasoning Language Models: A Blueprint
Paper • 2501.11223 • Published • 32
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
Paper • 2501.09775 • Published • 29
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 37
-
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
Paper • 2501.09751 • Published • 47
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 37
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 339
s1: Simple test-time scaling
Paper • 2501.19393 • Published • 108