The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 24 days ago • 182
VEnhancer: Generative Space-Time Enhancement for Video Generation Paper • 2407.07667 • Published Jul 10, 2024 • 15
Still-Moving: Customized Video Generation without Customized Video Data Paper • 2407.08674 • Published Jul 11, 2024 • 13
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging Paper • 2407.07315 • Published Jul 10, 2024 • 7
An accurate detection is not all you need to combat label noise in web-noisy datasets Paper • 2407.05528 • Published Jul 8, 2024 • 4
This&That: Language-Gesture Controlled Video Generation for Robot Planning Paper • 2407.05530 • Published Jul 8, 2024 • 4
CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation Paper • 2407.06188 • Published Jul 8, 2024 • 2
BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark Paper • 2407.07788 • Published Jul 10, 2024 • 2
Scaling Up Personalized Aesthetic Assessment via Task Vector Customization Paper • 2407.07176 • Published Jul 9, 2024 • 6
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects Paper • 2407.08711 • Published Jul 11, 2024 • 9
Generalizable Implicit Motion Modeling for Video Frame Interpolation Paper • 2407.08680 • Published Jul 11, 2024 • 12
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion Paper • 2407.08642 • Published Jul 11, 2024 • 11
Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data Paper • 2407.08726 • Published Jul 11, 2024 • 11
Autoregressive Speech Synthesis without Vector Quantization Paper • 2407.08551 • Published Jul 11, 2024 • 17
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception Paper • 2407.08303 • Published Jul 11, 2024 • 19