Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
Abstract
Diffusion models have achieved great success in generating 2D images. However, the quality and generalizability of 3D content generation remain limited. State-of-the-art methods often require large-scale 3D assets for training, which are challenging to collect. In this work, we introduce Kiss3DGen (Keep It Simple and Straightforward in 3D Generation), an efficient framework for generating, editing, and enhancing 3D objects by repurposing a well-trained 2D image diffusion model. Specifically, we fine-tune the diffusion model to generate a "3D Bundle Image", a tiled representation composed of multi-view images and their corresponding normal maps. The normal maps are then used to reconstruct a 3D mesh, and the multi-view images provide the texture mapping, yielding a complete 3D model. This simple method effectively transforms the 3D generation problem into a 2D image generation task, maximizing the use of the knowledge embedded in pretrained diffusion models. Furthermore, we demonstrate that Kiss3DGen is compatible with various diffusion-model techniques, enabling advanced features such as 3D editing and mesh and texture enhancement. Through extensive experiments, we demonstrate the effectiveness of our approach and its ability to produce high-quality 3D models efficiently.
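To make the bundle-image representation concrete, below is a minimal Python sketch of how multi-view renders and their normal maps could be tiled into a single image for the diffusion model to generate. The 2xN grid layout, tile size, and file names are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: tile N RGB views (top row) and their corresponding
# normal maps (bottom row) into one "3D Bundle Image". The layout and
# resolution here are assumptions for illustration only.
from PIL import Image

def make_bundle_image(rgb_views, normal_maps, tile_size=256):
    """Tile multi-view renders (top row) and normal maps (bottom row)."""
    assert len(rgb_views) == len(normal_maps)
    n = len(rgb_views)
    bundle = Image.new("RGB", (n * tile_size, 2 * tile_size))
    for i, (rgb, normal) in enumerate(zip(rgb_views, normal_maps)):
        bundle.paste(rgb.resize((tile_size, tile_size)), (i * tile_size, 0))
        bundle.paste(normal.resize((tile_size, tile_size)), (i * tile_size, tile_size))
    return bundle

# Example (hypothetical file names): four views of an object and their
# normal maps become one 1024x512 image that a fine-tuned 2D diffusion
# model can learn to generate directly.
views = [Image.open(f"view_{i}.png").convert("RGB") for i in range(4)]
normals = [Image.open(f"normal_{i}.png").convert("RGB") for i in range(4)]
make_bundle_image(views, normals).save("bundle.png")
```

At inference time the process runs in reverse: the generated bundle image is split back into its tiles, the normal maps drive mesh reconstruction, and the RGB views supply the texture.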
Community
Project page: https://ltt-o.github.io/Kiss3dgen.github.io/
Hugging Face demo: https://huggingface.co/spaces/LTT/Kiss3DGen
Github repo: https://github.com/EnVision-Research/Kiss3DGen
The following similar papers were recommended by the Semantic Scholar API:
- CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation (2025)
- Instructive3D: Editing Large Reconstruction Models with Text Instructions (2025)
- Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation (2025)
- F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Consistent Gaussian Splatting (2025)
- Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance (2025)
- Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models (2025)
- DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation (2025)