
DUSty v2: Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data

[figure: interpolation]

Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data
Kazuto Nakashima, Yumi Iwashita, Ryo Kurazume
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023
project | paper | supplemental | arxiv | slide

We propose GAN-based LiDAR data priors for sim2real and restoration tasks, extending our previous work, DUSty [Nakashima et al. IROS'21].

The core idea is to represent LiDAR range images with a continuous generative model, i.e., a 2D neural field, which generates a range value and the corresponding ray-drop probability from a laser radiation angle. The generative process is trained in a GAN framework. For more details on the architecture, please refer to our paper and supplementary materials.

[figure: architecture]
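As a rough illustration of this formulation, below is a minimal PyTorch sketch of a latent-conditioned neural field that maps laser angles to a range value and a ray-drop probability. All names here (RangeField, the MLP layout, the latent size) are hypothetical stand-ins, not the repository's actual modules, and the non-differentiable Bernoulli sampling would be replaced by a relaxation during GAN training.

# Hypothetical sketch of the generative process: a latent-conditioned neural
# field maps laser angles to a range value and a ray-drop probability, and a
# sampled mask yields the final measurement. Not the repository's actual API.
import torch
import torch.nn as nn

class RangeField(nn.Module):
    def __init__(self, latent_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # per angle: (range, ray-drop logit)
        )

    def forward(self, angles, z):
        # angles: (N, 2) laser directions (elevation, azimuth); z: (latent_dim,)
        h = torch.cat([angles, z.expand(angles.shape[0], -1)], dim=-1)
        out = self.mlp(h)
        rng = torch.sigmoid(out[:, 0])        # normalized range value
        p_drop = torch.sigmoid(out[:, 1])     # per-ray drop probability
        keep = torch.bernoulli(1.0 - p_drop)  # non-differentiable; training
        return rng * keep, p_drop             # would use a relaxation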

Setup

Python environment + CUDA

The environment can be built with Anaconda. The command below installs the CUDA 11.X runtime; however, the CUDA extensions under gans/ are JIT-compiled by PyTorch, so please also install the corresponding CUDA toolkit locally.

$ conda env create -f environment.yaml
$ conda activate dusty-gan-v2
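Since the extensions are compiled against the local toolkit, a quick sanity check (standard PyTorch calls only) is to confirm that nvcc is on PATH and matches the CUDA runtime PyTorch was built with:

# Sanity check before JIT compilation: the local nvcc should match the CUDA
# runtime version PyTorch was built against.
import subprocess
import torch

print("torch:", torch.__version__, "| built for CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
subprocess.run(["nvcc", "--version"])  # fails if no local CUDA toolkit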

Quick demo

The following demo generates random range images.

$ python quick_demo.py --arch dusty_v2

Pretrained weights are automatically downloaded. The --arch option can also be set to our baselines: vanilla and dusty_v1.

[figure: dusty_v2]

Dataset

To train the models yourself or run the other demos, please download the KITTI Raw dataset and create a symbolic link:

$ ln -sf <path to kitti raw root> ./data/kitti_raw
$ ls ./data/kitti_raw
2011_09_26  2011_09_28  2011_09_29  2011_09_30  2011_10_03

To check the KITTI data loader:

$ python -m gans.datasets.kitti

[figure: dataset]
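For reference, a typical way to turn a KITTI Velodyne scan into a range image is a spherical projection like the sketch below. This is illustrative only; the resolution, field of view, and collision handling in gans/datasets/kitti.py may differ.

# Illustrative spherical projection of an (N, 3) LiDAR scan into an (H, W)
# range image; zeros encode missing points. The repository's loader may use
# different parameters.
import numpy as np

def scan_to_range_image(points, H=64, W=2048, fov_up=3.0, fov_down=-25.0):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                          # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))  # elevation
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    u = ((yaw + np.pi) / (2 * np.pi) * W).astype(int) % W
    v = ((fov_up - pitch) / (fov_up - fov_down) * H).astype(int).clip(0, H - 1)
    image = np.zeros((H, W), dtype=np.float32)
    image[v, u] = depth  # last point wins; real loaders often keep the nearest
    return image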

Training GANs

To train GANs on KITTI:

$ python train_gan.py --config configs/gans/dusty_v2.yaml  # ours
$ python train_gan.py --config configs/gans/dusty_v1.yaml  # baseline
$ python train_gan.py --config configs/gans/vanilla.yaml   # baseline

To monitor losses and images:

$ tensorboard --logdir ./logs/gans

Evaluation

$ python test_gan.py --ckpt_path <path to *.pth file> --metrics swd,jsd,1nna,fpd,kpd
option | modality              | metrics
swd    | 2D inverse depth maps | Sliced Wasserstein distance (SWD)
jsd    | 3D point clouds       | Jensen–Shannon divergence (JSD)
1nna   | 3D point clouds       | Coverage (COV), minimum matching distance (MMD), and 1-nearest neighbor accuracy (1-NNA), based on the earth mover's distance (EMD)
fpd    | PointNet features     | Fréchet pointcloud distance (FPD)
kpd    | PointNet features     | Squared maximum mean discrepancy (like KID in the image domain)
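As an example of what these metrics measure, here is a rough NumPy sketch of the sliced Wasserstein distance between two equally sized sets of flattened inverse depth maps. The repository's implementation may differ (e.g., it may operate on multi-scale image patches):

# Rough sketch of SWD: average 1D Wasserstein distance over random
# projections. a, b: (N, D) arrays with the same N.
import numpy as np

def sliced_wasserstein(a, b, n_proj=128, seed=0):
    rng = np.random.default_rng(seed)
    dists = []
    for _ in range(n_proj):
        w = rng.normal(size=a.shape[1])
        w /= np.linalg.norm(w)
        pa, pb = np.sort(a @ w), np.sort(b @ w)  # sorted 1D projections
        dists.append(np.abs(pa - pb).mean())
    return float(np.mean(dists))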

Note: --ckpt_path can also be the following keywords: dusty_v2, dusty_v1, or vanilla. In this case, the pre-trained weights are automatically downloaded.

Demo

Latent interpolation

$ python demo_interpolation.py --mode 2d --ckpt_path <path to *.pth file>

--mode 2d

[video: demo_interpolation_2d.mp4]

--mode 3d

[video: demo_interpolation_3d.mp4]
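A common way to traverse a Gaussian latent space, and one plausible reading of what the demo animates, is spherical linear interpolation between two latent codes; the demo script's exact scheme may differ. A minimal sketch:

# Spherical linear interpolation between two latent codes z0, z1 of shape
# (latent_dim,); t in [0, 1]. Illustrative only.
import torch

def slerp(z0, z1, t):
    z0n, z1n = z0 / z0.norm(), z1 / z1.norm()
    omega = torch.acos((z0n * z1n).sum().clamp(-1.0, 1.0))
    return (torch.sin((1 - t) * omega) * z0
            + torch.sin(t * omega) * z1) / torch.sin(omega)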

GAN inversion

$ python demo_inversion.py --ckpt_path <path to *.pth file>
[video: demo_inversion_0000017000.mp4]
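In essence, optimization-based GAN inversion searches for the latent code whose generated range image best reconstructs a target, typically masked to the valid (non-dropped) pixels. A minimal sketch with a hypothetical generator interface:

# Hypothetical inversion loop: optimize z so that generator(z) matches the
# target range image on valid pixels. Interfaces are illustrative.
import torch

def invert(generator, target, mask, steps=1000, lr=0.05, latent_dim=512):
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        fake = generator(z)                          # (1, 1, H, W) range image
        loss = ((fake - target).abs() * mask).mean()
        loss.backward()
        opt.step()
    return z.detach()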

Sim2Real semantic segmentation

The semseg/ directory includes an implementation of Sim2Real semantic segmentation. The basic setup is to train the SqueezeSegV2 model [Wu et al. ICRA'19] on GTA-LiDAR (simulation) and test it on KITTI (real). To mitigate the domain gap, our paper proposes reproducing the ray-drop noise on the simulation data using our learned GAN. For details, please refer to our paper (Section 4.2).
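The application step itself is simple: given a per-pixel ray-drop probability map, sample a Bernoulli mask and zero out the dropped pixels of the simulated depth map. A minimal sketch (function names are illustrative):

# Apply a learned ray-drop probability map to a simulated depth map by
# Bernoulli-sampling a keep mask. Illustrative, not the training pipeline.
import numpy as np

def apply_raydrop(depth, p_drop, seed=None):
    # depth: (H, W) simulated depth map; p_drop: (H, W) drop probabilities
    rng = np.random.default_rng(seed)
    keep = rng.random(depth.shape) >= p_drop
    return depth * keep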

Dataset

  1. Please set up the GTA-LiDAR (simulation) and KITTI (real) datasets provided by the SqueezeSegV2 repository. The expected layout:
├── GTAV  # GTA-LiDAR
│   ├── 1
│   │   ├── 00000000.npy
│   │   └── ...
│   └── ...
├── ImageSet  # KITTI
│   ├── all.txt
│   ├── train.txt
│   └── val.txt
└── lidar_2d  # KITTI
    ├── 2011_09_26_0001_0000000000.npy
    └── ...
  2. Compute the ray-drop probability map (64x512 shape) for each GTA-LiDAR depth map (*.npy) using the GAN inversion, and save the maps with the same structure, as shown below. We will also release the pre-computed data.
data/kitti_raw_frontal
├── GTAV
│   ├── 1
│   │   ├── 00000000.npy
│   │   └── ...
│   └── ...
├── GTAV_noise_v1  # computed with DUSty v1
│   ├── 1
│   │   ├── 00000000.npy
│   │   └── ...
│   └── ...
├── GTAV_noise_v2  # computed with DUSty v2
│   ├── 1
│   │   ├── 00000000.npy
│   │   └── ...
│   └── ...
  3. Finally, please make a symbolic link:
$ ln -sf <path to the root above> ./data/kitti_raw_frontal

Training

Training configuration files can be found in configs/semseg/. We compare five approaches to reproducing the raydrop noise (configs A–E), plus a real-to-real reference (config F).

$ python train_semseg.py --config <path to *.yaml file>
config | training domain | raydrop probability  | file
A      | Simulation      | -                    | configs/semseg/sim2real_wo_noise.yaml
B      | Simulation      | Global frequency     | configs/semseg/sim2real_w_uniform_noise.yaml
C      | Simulation      | Pixel-wise frequency | configs/semseg/sim2real_w_spatial_noise.yaml
D      | Simulation      | Computed w/ DUSty v1 | configs/semseg/sim2real_w_gan_noise_dustyv1.yaml
E      | Simulation      | Computed w/ DUSty v2 | configs/semseg/sim2real_w_gan_noise_dustyv2.yaml
F      | Real            | N/A                  | configs/semseg/real2real.yaml
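For context on baselines B and C, both can be estimated directly from real range images, where zero pixels mark dropped rays: B collapses the statistics to a single scalar rate, while C keeps a per-pixel frequency map. An illustrative sketch (whether the repository estimates them exactly this way is an assumption):

# Estimate drop statistics from real range images (zeros = dropped rays).
# real_images: (N, H, W). Returns B's scalar rate and C's (H, W) map.
import numpy as np

def drop_statistics(real_images):
    dropped = (real_images == 0).astype(np.float32)
    global_rate = dropped.mean()        # config B: global frequency
    spatial_map = dropped.mean(axis=0)  # config C: pixel-wise frequency
    return global_rate, spatial_map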

To monitor losses and images:

$ tensorboard --logdir ./logs/semseg

Evaluation

$ python test_semseg.py --ckpt_path <path to *.pth file>

Note: --ckpt_path can also be the following keywords: clean, uniform, spatial, dusty_v1, dusty_v2, or real. In this case, the pre-trained weights are automatically downloaded.

Citation

@InProceedings{nakashima2023wacv,
    author    = {Nakashima, Kazuto and Iwashita, Yumi and Kurazume, Ryo},
    title     = {Generative Range Imaging for Learning Scene Priors of 3{D} LiDAR Data},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year      = {2023},
    pages     = {1256-1266}
}