Submitted by akhaliq 179 The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding · 8 authors 3
Submitted by geonp 138 InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU · 4 authors 6
Submitted by Agorium 38 Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation · 4 authors 2
Submitted by jonkahana 31 Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights · 4 authors 2
Submitted by Ray2333 31 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents · 13 authors 2
Submitted by Lp256 31 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models · 11 authors 3
Submitted by akhaliq 29 An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging · 4 authors 4
Submitted by voidism 29 SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models · 9 authors 2
Submitted by Neph0s 26 CoSER: Coordinating LLM-Based Persona Simulation of Established Roles · 12 authors 2
Submitted by CaraJ 26 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency · 14 authors 2
Submitted by ZiyuG 25 Exploring the Potential of Encoder-free Architectures in 3D LMMs · 11 authors 2
Submitted by danf 15 SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models · 4 authors 2
Submitted by xymeow7 12 DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References · 5 authors 2
Submitted by Haon-Chen 12 mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data · 7 authors 2
Submitted by guactastesgood 10 Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges · 3 authors 2
Submitted by BestWishYsh 7 VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer · 7 authors 2
Submitted by enquan2022 6 3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly · 7 authors 2