DOEI: Dual Optimization of Embedding Information for Attention-Enhanced Class Activation Maps • Paper • 2502.15885 • Published 7 days ago • 1
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge • Paper • 2502.19870 • Published 1 day ago • 3
Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications • Paper • 2502.20311 • Published 1 day ago • 3
MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra • Paper • 2502.16284 • Published 6 days ago • 3
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization • Paper • 2502.19261 • Published 2 days ago • 4
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users • Paper • 2502.19312 • Published 2 days ago • 4
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement • Paper • 2502.16776 • Published 5 days ago • 5
CritiQ: Mining Data Quality Criteria from Human Preferences • Paper • 2502.19279 • Published 2 days ago • 6
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator • Paper • 2502.19204 • Published 2 days ago • 7
VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model • Paper • 2502.18906 • Published 3 days ago • 8
Rank1: Test-Time Compute for Reranking in Information Retrieval • Paper • 2502.18418 • Published 3 days ago • 14
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs • Paper • 2502.19413 • Published 2 days ago • 14
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems • Paper • 2502.19328 • Published 2 days ago • 17
Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation • Paper • 2502.19414 • Published 2 days ago • 16
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? • Paper • 2502.19361 • Published 2 days ago • 19
Language Models' Factuality Depends on the Language of Inquiry • Paper • 2502.17955 • Published 4 days ago • 23
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance • Paper • 2502.18772 • Published 3 days ago • 29