
Visiting the Invisible

Paper | ArXiv | Project Page | Video

This repository provides the training and testing code for "Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition" by Chuanxia Zheng, Duy-Son Dao (instance segmentation), Guoxian Song (data rendering), Tat-Jen Cham, and Jianfei Cai.

Example

Example results of scene decomposition and recomposition. Given a single RGB image, the proposed CSDNet model structurally decomposes the scene into semantically completed instances and background, while completing the RGB appearance of previously invisible regions, such as the cup. The fully decomposed instances can then be used for image editing and scene recomposition, such as object removal and moving, without manually annotated input.

Getting started

Requirements

  • The code architecture is based on mmdetection (version 1.0rc1+621ecd2) and mmcv (version 0.2.15); please see https://github.com/open-mmlab/mmdetection for installation details. We tried to update to the latest versions, but this failed because many functions differ between versions.

Installation

  • The original code was tested with PyTorch 1.4.0, CUDA 10.0, Python 3.6, and Ubuntu 16.04 (18.04 is also supported):
conda create -n viv python=3.6 -y
conda activate viv
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
  • Install mmdetection (version 1.0rc1+621ecd2) and mmcv (version 0.2.15):
pip install Cython==0.29.21
pip install mmcv==0.2.15
pip install -r requirements.txt
pip install -v -e .
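  • A quick version check can help catch mismatches between PyTorch, mmcv, and the compiled mmdetection ops before training or testing; the lines below are a minimal sketch that only rely on standard version attributes:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
python -c "import mmcv; print(mmcv.__version__)"
python -c "import mmdet; print(mmdet.__version__)"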

Datasets

  • CSD: our rendered synthetic dataset, containing 8,298 images with 95,030 instances for training and 1,012 images with 11,648 instances for testing. The dataset is built upon SUNCG; when we built it (over more than half a year), the SUNCG dataset was publicly available.
  • COCOA: annotated from COCO2014; 5,000 images are selected and manually labeled with pairwise occlusion orders and amodal masks.
  • KINS: derived from KITTI; 14,991 images are labeled with absolute layer orders and amodal masks.
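  • The data paths are read from the configuration files; the layout below is only an assumption following common mmdetection conventions (placeholder folder names), and should be adjusted to match the data settings in configs/rgba:
data/
├── CSD/
│   ├── annotations/
│   ├── train/
│   └── test/
├── COCOA/
└── KINS/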

Testing

  • Test the model
cd tools
bash test.sh
  • The testing and evaluation configuration can be found in the test.py file; a sketch of an equivalent direct call is shown after this list.
  • Please select the corresponding configuration and pre-trained model for each dataset.
  • Additional settings need to be modified directly in the code.
  • Single-image visualization testing (demo): please modify the configuration for different inputs.
cd demo
python predictor.py
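  • The test.sh script presumably wraps the standard mmdetection 1.x test entry point; the line below is a minimal sketch of an equivalent direct call, where the config and checkpoint paths are placeholders and the --eval metrics may differ from the repository's own amodal evaluation settings:
python tools/test.py configs/rgba/<test_config>.py checkpoints/<model>.pth --out results.pkl --eval segm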

Training

  • Train a model (three phases on the synthetic dataset)
cd tools
bash tran.sh
  • Configuration files are stored in the configs/rgba directory.
  • The synthetic model is trained in three phases: decomposition, completion, and end, which are selected in the corresponding configuration file by setting mode; a sketch of the per-phase command is shown after this list.
  • Other settings follow the previous works Mask R-CNN and HTC in mmdetection, and PICNet.
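  • Each phase presumably corresponds to one run of the standard mmdetection 1.x training entry point with mode set accordingly; the lines below are a minimal sketch with a placeholder config name (the real names live in configs/rgba):
# set mode = 'decomposition' in the chosen config, train, then repeat with 'completion' and 'end'
python tools/train.py configs/rgba/<csd_config>.py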

Pretrained Models

Download the pre-trained models using the following links and put them under the checkpoints directory.
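The exact filenames depend on the download links; purely for illustration, a layout with one model per dataset is assumed here (placeholder names):
checkpoints/
├── csd_model.pth
├── cocoa_model.pth
└── kins_model.pth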

Citation

If you find our code or paper useful, please cite our paper.

@article{zheng2021vinv,
  title={Visiting the Invisible: Layer-by-Layer Completed Scene Decomposition},
  author={Zheng, Chuanxia and Dao, Duy-Son and Song, Guoxian and Cham, Tat-Jen and Cai, Jianfei},
  journal={International Journal of Computer Vision},
  year={2021},
  publisher={Springer}
}
