RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
Abstract
We introduce RealmDreamer, a technique for generating general forward-facing 3D scenes from text descriptions. Our method optimizes a 3D Gaussian Splatting representation to match complex text prompts. We initialize these splats by sampling from state-of-the-art text-to-image generators, lifting the samples into 3D, and computing the occlusion volume. We then optimize this representation across multiple views as a 3D inpainting task with image-conditional diffusion models. To learn correct geometric structure, we incorporate a depth diffusion model conditioned on samples from the inpainting model, which provides rich geometric guidance. Finally, we finetune the model using sharpened samples from image generators. Notably, our technique requires no video or multi-view data and can synthesize a variety of high-quality 3D scenes in different styles, each consisting of multiple objects. Its generality additionally allows 3D synthesis from a single image.
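The pipeline in the abstract amounts to an iterative loop: render the current splats from a novel view, ask an inpainting diffusion model to complete the regions the reference image never saw, and supervise both color and depth against those completions. Below is a minimal, toy PyTorch sketch of that loop, not the authors' implementation: the splat rasterizer and both diffusion models are replaced by stand-in functions, and every name here is a hypothetical placeholder.

```python
import torch

H = W = 64

# Toy stand-in for a 3D Gaussian Splatting scene: a learnable image-like
# tensor plays the role of the splats (a real pipeline would optimize
# Gaussian positions, covariances, opacities, and colors).
scene = torch.randn(3, H, W, requires_grad=True)
opt = torch.optim.Adam([scene], lr=1e-2)

def render(scene, camera):
    """Stand-in rasterizer: a real pipeline would splat 3D Gaussians
    into the given camera view."""
    return scene  # toy: ignores the camera

def inpaint_diffusion(image, mask, prompt):
    """Stand-in for an image-conditional inpainting diffusion sample:
    fills the masked (unseen) pixels. Here: just noise."""
    return torch.where(mask, torch.randn_like(image), image).detach()

def depth_diffusion(image):
    """Stand-in for a depth diffusion model conditioned on the
    inpainted sample. Here: a channel mean as a fake depth map."""
    return image.mean(dim=0, keepdim=True).detach()

for step in range(200):
    camera = None  # a real loop would sample a novel view here
    rendered = render(scene, camera)
    # Occlusion mask: regions hidden from the reference camera (toy: random).
    mask = torch.rand(1, H, W) > 0.7
    # 3D inpainting target for this view.
    target = inpaint_diffusion(rendered, mask, prompt="a cozy living room")
    loss = ((rendered - target) ** 2).mean()
    # Geometric supervision from the depth model on the inpainted sample.
    depth_pred = rendered.mean(dim=0, keepdim=True)  # toy rendered depth
    loss = loss + ((depth_pred - depth_diffusion(target)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The abstract's final finetuning stage with sharpened image-generator samples would follow the same pattern, swapping the inpainting target for a sharpened sample.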
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting (2024)
- Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting (2024)
- VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation (2024)
- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models (2024)
- Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding (2024)