Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.

Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang

arXiv: https://arxiv.org/abs/2312.11535 | Project page

Pipeline. We propose Customize-It-3D, a two-stage framework for high-quality 3D creation from a reference image with a subject-specific diffusion prior. We first cultivate a subject-specific knowledge prior using multi-modal information to effectively constrain the coherence of the 3D object with respect to a particular identity. In the coarse stage, we optimize a NeRF to reconstruct the geometry of the reference image in a shading-aware manner. We then build a point cloud with enhanced texture from the coarse stage and jointly optimize the texture of invisible points and a learnable deferred renderer to generate realistic, view-consistent textures.

Demo of 360° geometry

News

  • [2023/12/22] Code is available on GitHub!
  • [2023/12/20] Paper is available on arXiv!
  • [2023/12/14] Our code and paper will be released soon.

Install

We have only tested on Ubuntu 22 with torch 2.0.1 and CUDA 11.7 on an A100. Make sure git, wget, and Eigen are installed.

conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
apt update && apt upgrade
apt install git wget libeigen3-dev -y
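
If you are setting up from scratch, running the above inside a dedicated conda environment keeps the pinned versions isolated (the environment name and Python version here are our suggestion, not something the repo mandates):

    # hypothetical environment setup; adjust name/version to taste
    conda create -n customize-it-3d python=3.9 -y
    conda activate customize-it-3d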

Install Environment

Install with pip:

    pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
    pip install git+https://github.com/facebookresearch/pytorch3d.git
    pip install git+https://github.com/S-aiueo32/contextual_loss_pytorch.git@4585061   
    pip install ./raymarching
    pip install git+https://github.com/facebookresearch/segment-anything.git

Other dependencies:

    pip install -r requirements.txt 

Download pre-trained models

  • Zero-1-to-3 for the 3D diffusion prior. We use zero123-xl.ckpt by default; the reimplementation is borrowed from the Stable Diffusion repo and lives in nerf/zero123.py.

    cd pretrained/zero123
    wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
    cd ../../
  • MiDaS for depth estimation. We use dpt_beit_large_512.pt; put it in the folder pretrained/midas/.

    mkdir -p pretrained/midas
    cd pretrained/midas
    wget https://github.com/isl-org/MiDaS/releases/download/v3_1/dpt_beit_large_512.pt
    cd ../../
  • Omnidata for normal estimation.

    mkdir -p pretrained/omnidata
    cd pretrained/omnidata
    # assume gdown is installed
    gdown '1wNxVO4vVbDEMEpnAi_jwQObf2MFodcBR&confirm=t' 
    cd ../../
  • SAM to segment the foreground mask of an object.

    cd mask
    wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
    cd ..
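
After all downloads finish, a quick sanity check (a convenience sketch, not part of the repo) confirms each checkpoint landed where the steps above put it:

    # verify the checkpoints exist at the expected paths
    ls -lh pretrained/zero123/zero123-xl.ckpt \
           pretrained/midas/dpt_beit_large_512.pt \
           pretrained/omnidata/ \
           mask/sam_vit_h_4b8939.pth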

Usage

Preprocess

In the ./data directory, we include some preprocessed examples whose multi-modal images have already been extracted. To test your own example, run the preprocessing steps below and mirror the file structure in ./data; preprocessing takes only seconds.
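
For reference, the run command later in this README reads data/horse/rgba/rgba.png, so a preprocessed example looks roughly like the sketch below. The subfolder names for the other modalities are illustrative, not confirmed; check the shipped examples in ./data for the exact layout.

    data/horse/
        rgba/rgba.png    # foreground-segmented reference image
        depth/           # illustrative: estimated depth maps (MiDaS)
        normal/          # illustrative: estimated normal maps (Omnidata)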

Step 1: Extract multi-modal images [Optional]

You can preprocess a single image:

python preprocess_image.py --path /path/to/image 

You can also preprocess images given in a list or a directory:

bash scripts/preprocess_list.sh $GPU_IDX
bash scripts/preprocess_folder.sh $GPU_IDX /path/to/dir
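
For instance, to preprocess every image in a folder on GPU 0 (the directory name is illustrative):

    # GPU index 0, images under data/my_images (hypothetical path)
    bash scripts/preprocess_folder.sh 0 data/my_images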

Step 2: Fine-tune multi-modal DreamBooth

Customize-It-3D uses the default DreamBooth from diffusers. To fine-tune the multi-modal DreamBooth:

bash dreambooth/dreambooth.sh $GPU_IDX $INSTANCE_DIR $OUTPUT_DIR $CLASS_NAME $CLASS_DIR

$INSTANCE_DIR is the path to the directory containing your own image.

$OUTPUT_DIR is the path where the trained model is saved.

$CLASS_NAME is the text prompt describing the class of the generated sample images.

$CLASS_DIR is the path to a folder containing the generated class sample images.

For example:

bash dreambooth/dreambooth.sh 0 data/horse out/horse horse images_gen/horse

Remember the path of your trained model (in the ./out directory); you will pass it to the run script below.
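
If fine-tuning succeeded, $OUTPUT_DIR should hold a diffusers-format pipeline. As a rough check (the exact file list depends on your diffusers version):

    ls out/horse    # expect model_index.json plus unet/, text_encoder/, ... subfolders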

Run

Run Customize-It-3D for a single example

We use a progressive training strategy to generate full 360° 3D geometry.

bash scripts/run.sh $GPU_IDX $WORK_SPACE $REF_PATH $Enable_First_Stage $Enable_Second_Stage $TRAINED_MODEL_PATH $CLASS_NAME {More_Arguments}

For example, to run Customize-It-3D on the horse example (whose trained multi-modal DreamBooth model is out/horse) with both stages enabled on GPU 0, setting both the workspace and the class name to horse:

bash scripts/run.sh 0 horse data/horse/rgba/rgba.png 1 1 out/horse horse

Run Customize-It-3D for a group of examples

  • To run all examples in a folder, see the script scripts/run_folder.sh
  • To run all examples in a given list, see the script scripts/run_list.sh

Bibtex

If you find this work useful, a citation would be appreciated:


    @misc{huang2023customizeit3d,
      title={Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior}, 
      author={Nan Huang and Ting Zhang and Yuhui Yuan and Dong Chen and Shanghang Zhang},
      year={2023},
      eprint={2312.11535},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
    }

Acknowledgments

This code borrows heavily from Stable-Dreamfusion; many thanks to the author.
