Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask
Abstract
We introduce a novel approach that takes a single semantic mask as input and synthesizes multi-view consistent color images of natural scenes, trained on a collection of single images from the Internet. Prior works on 3D-aware image synthesis either require multi-view supervision or learn category-level priors for specific object classes, neither of which transfers well to natural scenes. Our key idea for solving this challenging problem is to use a semantic field as the intermediate representation, which is easier to reconstruct from an input semantic mask and can then be translated into a radiance field with the assistance of off-the-shelf semantic image synthesis models. Experiments show that our method outperforms baseline methods and produces photorealistic, multi-view consistent videos of a variety of natural scenes.
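The abstract's pipeline (semantic mask → semantic field → rendered semantics at novel views → RGB via a pretrained semantic image synthesis model) can be pictured with a minimal sketch. This is an illustration under assumptions, not the authors' implementation: it assumes a NeRF-style volumetric semantic field, and all names here (`SemanticField`, `render_semantics`, `NUM_CLASSES`, `SAMPLES_PER_RAY`) are hypothetical.

```python
# Minimal sketch, assuming a NeRF-style semantic field; all names are
# hypothetical illustrations, not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 8        # e.g. sky, water, mountain, tree, ... (assumed)
SAMPLES_PER_RAY = 64   # samples along each camera ray (assumed)

class SemanticField(nn.Module):
    """Maps a 3D point to a volume density and per-class semantic logits."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1 + NUM_CLASSES),  # density + class logits
        )

    def forward(self, xyz: torch.Tensor):
        out = self.mlp(xyz)
        density = F.softplus(out[..., :1])  # keep density non-negative
        logits = out[..., 1:]               # semantic class logits
        return density, logits

def render_semantics(field: SemanticField,
                     origins: torch.Tensor,   # (R, 3) ray origins
                     dirs: torch.Tensor,      # (R, 3) unit ray directions
                     near: float = 0.1,
                     far: float = 6.0) -> torch.Tensor:
    """Volume-render per-ray semantic logits, analogous to rendering
    color in NeRF but with class logits in place of RGB."""
    t = torch.linspace(near, far, SAMPLES_PER_RAY)                    # (S,)
    pts = origins[:, None, :] + dirs[:, None, :] * t[None, :, None]   # (R, S, 3)
    density, logits = field(pts)                                      # (R,S,1), (R,S,C)
    delta = (far - near) / SAMPLES_PER_RAY
    alpha = 1.0 - torch.exp(-density.squeeze(-1) * delta)             # (R, S)
    # Accumulated transmittance along each ray.
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[:, :-1]
    weights = alpha * trans                                           # (R, S)
    return (weights[..., None] * logits).sum(dim=1)                   # (R, C)

# At a novel viewpoint, the rendered semantic map (argmax over classes per
# pixel) would then be fed to an off-the-shelf semantic image synthesis
# generator (e.g. a SPADE-like model) to produce the final RGB frame.
```

The key design point the abstract argues for is visible here: the 3D representation stores only densities and class logits, which a single input mask constrains far more easily than full appearance, while photorealistic color is delegated to a 2D generator at render time.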