Awesome 3D AIGC Resources

A curated list of papers and open-source resources focused on 3D AIGC, intended to keep pace with the anticipated surge of research in the coming months. If you have any additions or suggestions, feel free to contribute. Additional resources like blog posts, videos, etc. are also welcome.

Survey:

1. Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era [arxiv 2023.05]

Authors: Chenghao Li, Chaoning Zhang, Atish Waghwase, Lik-Hang Lee, Francois Rameau, Yang Yang, Sung-Ho Bae, Choong Seon Hong

Abstract

Generative AI (AIGC, a.k.a. AI generated content) has made remarkable progress in the past few years, among which text-guided content generation is the most practical one since it enables the interaction between human instruction and AIGC. Due to the development in text-to-image as well as 3D modeling technologies (like NeRF), text-to-3D has become a newly emerging yet highly active research field. Our work conducts the first yet comprehensive survey on text-to-3D to help readers interested in this direction quickly catch up with its fast development. First, we introduce 3D data representations, including both Euclidean data and non-Euclidean data. On top of that, we introduce various foundation technologies as well as summarize how recent works combine those foundation technologies to realize satisfactory text-to-3D. Moreover, we summarize how text-to-3D technology is used in various applications, including avatar generation, texture generation, shape transformation, and scene generation.

📄 Paper

2. Deep Generative Models on 3D Representations: A Survey [arxiv 2023.10]

Authors: Zifan Shi, Sida Peng, Yinghao Xu, Andreas Geiger, Yiyi Liao, Yujun Shen

Abstract

Generative models aim to learn the distribution of observed data by generating new instances. With the advent of neural networks, deep generative models, including variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models (DMs), have progressed remarkably in synthesizing 2D images. Recently, researchers started to shift focus from 2D to 3D space, considering that 3D data is more closely aligned with our physical world and holds immense practical potential. However, unlike 2D images, which possess an inherent and efficient representation (i.e., a pixel grid), representing 3D data poses significantly greater challenges. Ideally, a robust 3D representation should be capable of accurately modeling complex shapes and appearances while being highly efficient in handling high-resolution data with high processing speeds and low memory requirements. Regrettably, existing 3D representations, such as point clouds, meshes, and neural fields, often fail to satisfy all of these requirements simultaneously. In this survey, we thoroughly review the ongoing developments of 3D generative models, including methods that employ 2D and 3D supervision. Our analysis centers on generative models, with a particular focus on the representations utilized in this context. We believe our survey will help the community to track the field's evolution and to spark innovative ideas to propel progress towards solving this challenging task.

📄 Paper | 🌐 Project Page

3. A survey of deep learning-based 3D shape generation [Computational Visual Media 2023.05]

Authors: Qun-Ce Xu, Tai-Jiang Mu, Yong-Liang Yang

Abstract

Deep learning has been successfully used for tasks in the 2D image domain. Research on 3D computer vision and deep geometry learning has also attracted attention. Considerable achievements have been made regarding feature extraction and discrimination of 3D shapes. Following recent advances in deep generative models such as generative adversarial networks, effective generation of 3D shapes has become an active research topic. Unlike 2D images with a regular grid structure, 3D shapes have various representations, such as voxels, point clouds, meshes, and implicit functions. For deep learning of 3D shapes, shape representation has to be taken into account as there is no unified representation that can cover all tasks well. Factors such as the representativeness of geometry and topology often largely affect the quality of the generated 3D shapes. In this survey, we comprehensively review works on deep-learning-based 3D shape generation by classifying and discussing them in terms of the underlying shape representation and the architecture of the shape generator. The advantages and disadvantages of each class are further analyzed. We also consider the 3D shape datasets commonly used for shape generation. Finally, we present several potential research directions that hopefully can inspire future works on this topic.

📄 Paper

4. Learning Generative Models of 3D Structures [Computer Graphics Forum 2020.05]

Authors: Siddhartha Chaudhuri, Daniel Ritchie, Jiajun Wu, Kai Xu, Hao Zhang

Abstract

3D models of objects and scenes are critical to many academic disciplines and industrial applications. Of particular interest is the emerging opportunity for 3D graphics to serve artificial intelligence: computer vision systems can benefit from synthetically-generated training data rendered from virtual 3D scenes, and robots can be trained to navigate in and interact with real-world environments by first acquiring skills in simulated ones. One of the most promising ways to achieve this is by learning and applying generative models of 3D content: computer programs that can synthesize new 3D shapes and scenes. To allow users to edit and manipulate the synthesized 3D content to achieve their goals, the generative model should also be structure-aware: it should express 3D shapes and scenes using abstractions that allow manipulation of their high-level structure. This state-of-the-art report surveys historical work and recent progress on learning structure-aware generative models of 3D shapes and scenes. We present fundamental representations of 3D shape and scene geometry and structures, describe prominent methodologies including probabilistic models, deep generative models, program synthesis, and neural networks for structured data, and cover many recent methods for structure-aware synthesis of 3D shapes and indoor scenes.

📄 Paper

5. A Survey on 3D Gaussian Splatting [arxiv 2024.01]

Authors: Guikun Chen, Wenguan Wang

Abstract

3D Gaussian splatting (3D GS) has recently emerged as a transformative technique in the explicit radiance field and computer graphics landscape. This innovative approach, characterized by the utilization of millions of 3D Gaussians, represents a significant departure from the neural radiance field (NeRF) methodologies, which predominantly use implicit, coordinate-based models to map spatial coordinates to pixel values. 3D GS, with its explicit scene representations and differentiable rendering algorithms, not only promises real-time rendering capabilities but also introduces unprecedented levels of control and editability. This positions 3D GS as a potential game-changer for the next generation of 3D reconstruction and representation. In the present paper, we provide the first systematic overview of the recent developments and critical contributions in the domain of 3D GS. We begin with a detailed exploration of the underlying principles and the driving forces behind the advent of 3D GS, setting the stage for understanding its significance. A focal point of our discussion is the practical applicability of 3D GS. By facilitating real-time performance, 3D GS opens up a plethora of applications, ranging from virtual reality to interactive media and beyond. This is complemented by a comparative analysis of leading 3D GS models, evaluated across various benchmark tasks to highlight their performance and practical utility. The survey concludes by identifying current challenges and suggesting potential avenues for future research in this domain. Through this survey, we aim to provide a valuable resource for both newcomers and seasoned researchers, fostering further exploration and advancement in applicable and explicit radiance field representation.

[📄 Paper](https://arxiv.org/abs/2401.03890)

6. A Survey on Deep Generative 3D-aware Image Synthesis [ACM Computing Surveys 2023.11]

Authors: Weihao Xia, Jing-Hao Xue

Abstract

Recent years have seen remarkable progress in deep learning powered visual content creation. This includes deep generative 3D-aware image synthesis, which produces high-fidelity images in a 3D-consistent manner while simultaneously capturing compact surfaces of objects from pure image collections without the need for any 3D supervision, thus bridging the gap between 2D imagery and 3D reality. The field of computer vision has been recently captivated by the task of deep generative 3D-aware image synthesis, with hundreds of papers appearing in top-tier journals and conferences over the past few years (mainly the past two years), but there lacks a comprehensive survey of this remarkable and swift progress. Our survey aims to introduce new researchers to this topic, provide a useful reference for related works, and stimulate future research directions through our discussion section. Apart from the presented papers, we aim to constantly update the latest relevant papers along with corresponding implementations at https://weihaox.github.io/3D-aware-Gen.

📄 Paper | 🌐 Project Page

7. Advances in 3D Generation: A Survey [arxiv 2024.01]

Authors: Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan

Abstract

Generating 3D models lies at the core of computer graphics and has been the focus of decades of research. With the emergence of advanced neural representations and generative models, the field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models. The rapid growth of this field makes it difficult to stay abreast of all recent developments. In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications. Specifically, we introduce the 3D representations that serve as the backbone for 3D generation. Furthermore, we provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by the type of algorithmic paradigms, including feedforward generation, optimization-based generation, procedural generation, and generative novel view synthesis. Lastly, we discuss available datasets, applications, and open challenges. We hope this survey will help readers explore this exciting topic and foster further advancements in the field of 3D content generation.

📄 Paper

Text to 3D Generation:

1. DreamFusion: Text-to-3D using 2D Diffusion [ICLR 2023]

Authors: Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall

Abstract

Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs. Adapting this approach to 3D synthesis would require large-scale datasets of labeled 3D assets and efficient architectures for denoising 3D data, neither of which currently exist. In this work, we circumvent these limitations by using a pretrained 2D text-to-image diffusion model to perform text-to-3D synthesis. We introduce a loss based on probability density distillation that enables the use of a 2D diffusion model as a prior for optimization of a parametric image generator. Using this loss in a DeepDream-like procedure, we optimize a randomly-initialized 3D model (a Neural Radiance Field, or NeRF) via gradient descent such that its 2D renderings from random angles achieve a low loss. The resulting 3D model of the given text can be viewed from any angle, relit by arbitrary illumination, or composited into any 3D environment. Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors.

Table of contents

Survey:

1. Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era [arxiv 2023.05]

2. Deep Generative Models on 3D Representations: A Survey [arxiv 2023.10]

3. A survey of deep learning-based 3D shape generation [Computational Visual Media 2023.05]

4. Learning Generative Models of 3D Structures [Computer Graphics Forum 2020.05]

5. A Survey on 3D Gaussian Splatting [arxiv 2024.01]

6. A Survey on Deep Generative 3D-aware Image Synthesis [ACM Computing Surveys 2023.11]

7. Advances in 3D Generation: A Survey [arxiv 2024.01]

Text to 3D Generation:

1. DreamFusion: Text-to-3D using 2D Diffusion [ICLR 2023]

2. Shapeglot: Learning Language for Shape Differentiation [CVPR 2019]

3. Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings [ACCV 2018]

4. ShapeCrafter: A Recursive Text-Conditioned 3D Shape Generation Model [NeurIPS 2022]

5. Magic3D: High-Resolution Text-to-3D Content Creation [CVPR 2023]

6. Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models [CVPR 2023]

7. CLIP-Mesh: Generating textured meshes from text using pretrained image-text models [SIGGRAPH Asia 2022]

8. Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation [CVPR 2023]

9. Dream Fields: Zero-Shot Text-Guided Object Generation with Dream Fields [CVPR 2022 and AI4CC 2022 (Best Poster)]

10. RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D [arxiv 2023.11]

11. SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity [arxiv 2024.01]

12. Taming Mode Collapse in Score Distillation for Text-to-3D Generation [arxiv 2024.01]

13. DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation [arxiv 2023.09]

14. HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance [arxiv 2023.05]

15. Triplane Meets Gaussian Splatting [arxiv 2023.12]

16. DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model [ICLR 2024]

17. Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model [arxiv 2023.11]

18. Instant3D: Instant Text-to-3D Generation [arxiv 2023.11]

19. Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts [ICLR 2024]

20. DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation [ICLR 2024]

21. TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy TextFields [ICLR 2024]

22. Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation [arxiv 2024.01]

23. Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior [arxiv 2024.01]

24. AToM: Amortized Text-to-Mesh using 2D Diffusion [arxiv 2024.02]

25. EscherNet: A Generative Model for Scalable View Synthesis [arxiv 2024.02]

26. TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts [arxiv 2024.01]

27. BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion [arxiv 2024.01]

28. AToM: Amortized Text-to-Mesh using 2D Diffusion [arxiv 2024.02]

29. IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation [arxiv 2024.02]

30. GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting [arxiv 2024.02]

31. GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation [arxiv 2024.01]

Image to 3D Generation:

1. Zero-1-to-3: Zero-shot One Image to 3D Object [ICCV 2023]

2. Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model [arxiv 2023.10]

3. One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization [arxiv 2023.06]

4. One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion [arxiv 2023.11]

5. TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion [ICLR 2024]

6. Wonder3D: Single Image to 3D using Cross-Domain Diffusion [arxiv 2023.10]

7. LRM: Large Reconstruction Model for Single Image to 3D [ICLR 2024]

8. Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors [arxiv 2023.06]

9. DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior [arxiv 2023.10]

10. GD^2-NeRF: Generative Detail Compensation via GAN and Diffusion for One-shot Generalizable Neural Radiance Fields [arxiv 2024.01]

11. NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views [arxiv 2022.11]

12. Free3D: Consistent Novel View Synthesis without 3D Representation [arxiv 2023.12]

13. AGG: Amortized Generative 3D Gaussians for Single Image to 3D [arxiv 2024.01]

14. What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs [arxiv 2024.01]

15. TOSS: High-quality Text-guided Novel View Synthesis from a Single Image [ICLR 2024]

16. Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior [ICCV 2023]

17. IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images [arxiv 2024.01]

18. Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [arxiv 2024.01]

19. TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts [arxiv 2024.01]

20. 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D [arxiv 2024.01]

21. GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting [arxiv 2024.02]

22. Colorizing Monochromatic Radiance Fields [arxiv 2024.02]

23. DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer [arxiv 2024.02]

Audio to 3D Generation:

1. From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations [arxiv 2024.01]

3D Editing:

1. DreamEditor: Text-Driven 3D Scene Editing with Neural Fields [arxiv 2023.06]

2. IDE-3D: Interactive Disentangled Editing For High-Resolution 3D-aware Portrait Synthesis [arxiv 2022.05]

3. DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images [ICLR 2023]

4. Image Sculpting: Precise Object Editing with 3D Geometry Control [arxiv 2024.01]
