Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published Jan 7, 2025 • 23
Visual Representation Learning with Stochastic Frame Prediction Paper • 2406.07398 • Published Jun 11, 2024 • 1
Multi-View Masked World Models for Visual Robotic Manipulation Paper • 2302.02408 • Published Feb 5, 2023
BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark Paper • 2407.07788 • Published Jul 10, 2024 • 2
Continuous Control with Coarse-to-fine Reinforcement Learning Paper • 2407.07787 • Published Jul 10, 2024
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction Paper • 2411.14762 • Published Nov 22, 2024 • 11
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10, 2024 • 50
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control Paper • 2403.04880 • Published Mar 7, 2024 • 6
Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner Paper • 2409.12963 • Published Sep 19, 2024
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences Paper • 2408.14468 • Published Aug 26, 2024 • 36
Deep Multimodal Fusion for Surgical Feedback Classification Paper • 2312.03231 • Published Dec 6, 2023
Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization Paper • 2403.14973 • Published Mar 22, 2024
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources Paper • 2310.07147 • Published Oct 11, 2023 • 1
LLM Inference Unveiled: Survey and Roofline Model Insights Paper • 2402.16363 • Published Feb 26, 2024 • 2
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration Paper • 2311.08562 • Published Nov 14, 2023