Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 1 day ago • 16
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published 5 days ago • 80
Indiana Jones: There Are Always Some Useful Ancient Relics Paper • 2501.18628 • Published 29 days ago • 1
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published 7 days ago • 23
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study Paper • 2502.02481 • Published 21 days ago • 8
GemmaX2 Collection GemmaX2 language models, including pretrained and instruction-tuned models in two sizes: 2B and 9B. • 7 items • Updated 19 days ago • 18
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? Paper • 2502.12215 • Published 9 days ago • 15
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 11 days ago • 50
Article Open Preference Dataset for Text-to-Image Generation by the 🤗 Community Dec 9, 2024 • 54
Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published 7 days ago • 75
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 8 days ago • 89
Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 7 days ago • 59
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 15 days ago • 136
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs Paper • 2502.10454 • Published 14 days ago • 7
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation Paper • 2502.12148 • Published 8 days ago • 16