Mogo: RQ Hierarchical Causal Transformer for High-Quality 3D Human Motion Generation Paper • 2412.07797 • Published 20 days ago • 11
StyleMaster: Stylize Your Video with Artistic Generation and Translation Paper • 2412.07744 • Published 15 days ago • 19
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations Paper • 2412.08580 • Published 14 days ago • 44
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion Paper • 2412.09593 • Published 13 days ago • 17
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale Paper • 2412.06699 • Published 16 days ago • 11
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters Paper • 2412.00174 • Published 26 days ago • 22
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM Paper • 2411.04954 • Published Nov 7 • 8
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images Paper • 2411.05738 • Published Nov 8 • 14
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Paper • 2411.07199 • Published Nov 11 • 45
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14 • 57
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement Paper • 2411.06558 • Published Nov 10 • 34
Sora Reference Papers Collection A collection of all papers referenced in OpenAI's "Video generation models as world simulators" technical report • openai.com/sora • 30 items • Updated Oct 3 • 52
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding Paper • 2312.04461 • Published Dec 7, 2023 • 58
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models Paper • 2308.06721 • Published Aug 13, 2023 • 29