HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation Paper • 2502.12148 • Published 8 days ago • 16
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer Paper • 2502.05979 • Published 16 days ago • 7
Latent Radiance Fields with 3D-aware 2D Representations Paper • 2502.09613 • Published 12 days ago • 6
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation Paper • 2502.08690 • Published 13 days ago • 39
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published 14 days ago • 44
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation Paper • 2502.08639 • Published 13 days ago • 36
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion Paper • 2502.08590 • Published 13 days ago • 38
CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing Paper • 2502.03997 • Published 20 days ago • 9
Magic 1-For-1: Generating One Minute Video Clips within One Minute Paper • 2502.07701 • Published 14 days ago • 32
CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers Paper • 2502.06527 • Published 15 days ago • 9
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation Paper • 2502.05179 • Published 18 days ago • 22
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting Paper • 2502.05176 • Published 18 days ago • 30
A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods Paper • 2502.01618 • Published 22 days ago • 9
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Paper • 2502.04320 • Published 19 days ago • 33