- High-Quality Image Restoration Following Human Instructions
  Paper • 2401.16468 • Published • 12
- Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding
  Paper • 2401.15708 • Published • 10
- Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support
  Paper • 2401.14688 • Published • 13
- TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts
  Paper • 2401.14828 • Published • 7
Collections
Collections including paper arxiv:2401.15708
- SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding
  Paper • 2401.09340 • Published • 18
- SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
  Paper • 2401.12168 • Published • 25
- Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding
  Paper • 2401.15708 • Published • 10

- VideoBooth: Diffusion-based Video Generation with Image Prompts
  Paper • 2312.00777 • Published • 21
- MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
  Paper • 2312.03641 • Published • 20
- GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
  Paper • 2312.04557 • Published • 12
- DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
  Paper • 2312.04433 • Published • 9

- Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model
  Paper • 2309.03550 • Published • 11
- Memory Augmented Language Models through Mixture of Word Experts
  Paper • 2311.10768 • Published • 16
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 183
- GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
  Paper • 2311.12631 • Published • 13