SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval • Paper • 2412.15443 • Published 6 days ago • 4
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World • Paper • 2412.17589 • Published 2 days ago • 8
InternVL2.5-MPO • Collection • Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 1 day ago • 21
DateLogicQA: Benchmarking Temporal Biases in Large Language Models • Paper • 2412.13377 • Published 8 days ago • 2
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • 2412.13663 • Published 7 days ago • 103
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks • Paper • 2412.15204 • Published 6 days ago • 30
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations • Paper • 2412.13171 • Published 8 days ago • 30
Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models • Paper • 2412.12606 • Published 8 days ago • 41
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain • Paper • 2412.13018 • Published 8 days ago • 40
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks • Paper • 2412.14161 • Published 7 days ago • 43