
add Stereo and KM3D
improve documentation
Owen-Liuyuxuan committed Mar 18, 2021
1 parent 920f6f8 commit b3f8f6e
Showing 44 changed files with 4,745 additions and 52 deletions.
31 changes: 24 additions & 7 deletions README.md
@@ -3,11 +3,11 @@
This repo aims to provide flexible and reproducible visual 3D detection on the KITTI dataset. Scripts are expected to be run from the repo's root directory, and ./visualDet3D is treated as a package that we can modify and test directly, rather than as an installed library. Several useful scripts are provided in the main directory for easy usage.

We believe that visual tasks are interconnected, so we make this library extensible to more experiments.
The package uses a registry to register datasets, models, processing functions and more, allowing new tasks/models to be inserted easily without interfering with existing ones.
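As a rough illustration of this registry pattern (a minimal sketch with hypothetical names, not the actual visualDet3D API):

```python
# Minimal registry sketch -- names here are illustrative only.
class Registry:
    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        # Store the class under its own name so configs can refer to it by string.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, type_name, **kwargs):
        # Instantiate a registered class from a config entry.
        return self._modules[type_name](**kwargs)

DETECTORS = Registry('detectors')

@DETECTORS.register_module
class MyNewDetector:
    def __init__(self, num_classes):
        self.num_classes = num_classes

# A config can then select the model by its string name,
# without modifying any existing detector code:
model = DETECTORS.build('MyNewDetector', num_classes=3)
```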

## Related Paper:

This repo contains the official implementation of the 2021 *RAL* & *ICRA* paper [**Ground-aware Monocular 3D Object Detection for Autonomous Driving**](https://ieeexplore.ieee.org/document/9327478) ([arXiv page](https://arxiv.org/abs/2102.00690)). Pretrained models can be found on the [release page](https://github.com/Owen-Liuyuxuan/visualDet3D/releases/tag/1.0).
```
@ARTICLE{9327478,
  author={Y. {Liu} and Y. {Yuan} and M. {Liu}},
  journal={IEEE Robotics and Automation Letters},
  title={Ground-aware Monocular 3D Object Detection for Autonomous Driving},
  year={2021},
  doi={10.1109/LRA.2021.3052442}}
```

This repo is also the official implementation of the 2021 *ICRA* paper [**YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection**](https://arxiv.org/abs/2103.09422). Pretrained models can be found on the [release page](https://github.com/Owen-Liuyuxuan/visualDet3D/releases/tag/1.1).
```
@inproceedings{liu2021yolostereo3d,
  title={YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection},
  author={Yuxuan Liu and Lujia Wang and Ming Liu},
  booktitle={2021 International Conference on Robotics and Automation (ICRA)},
  year={2021},
  organization={IEEE}
}
```

We further incorporate an *unofficial* re-implementation of **Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training** (KM3D) as a reference for how to integrate with other frameworks. (Note that the code comes from the [original official repo](https://github.com/Banconxuan/RTM3D), and we **DO NOT** guarantee a complete re-implementation.)

## Key Features

- **SOTA Performance** State-of-the-art results on visual 3D detection.
@@ -26,7 +40,7 @@ This repo contains the official implementation of 2021 *RAL* paper [**Ground-awa
- **Global Path-based IMDB** Data does not need to be placed inside the repo folder, which is convenient for managing data and code separately.


We provide start-up solutions for [Mono3D](docs/mono3d.md), [Stereo3D](docs/stereo3d.md), [Depth Predictions](docs/monoDepth.md) and more (with further publications to come).

Reference: this repo borrows code and ideas from [retinanet](https://github.com/yhenon/pytorch-retinanet),
[mmdetection](https://github.com/open-mmlab/mmdetection),
@@ -44,13 +58,13 @@ pip3 install -r requirement.txt
or manually check dependencies.

```bash
# build ops (deform convs and iou3d); we will not install these operations into the system environment
./make.sh
```

## Start Training

Please check the corresponding task: [Mono3D](docs/mono3d.md), [Stereo3D](docs/stereo3d.md), [Depth Predictions](docs/monoDepth.md). More demos will become available through contributions and further paper submissions.

### Config and Path setup.

@@ -78,12 +92,15 @@ Please check the template's comments and other comments in codes to fully exploi
## Other Resources

- [RAM-LAB](https://www.ram-lab.com)
- [Collections of Papers and Readings](https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/)
    - [Collection for Mono3D](https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/RecentCollectionForMono3D/); [Ground-Aware 3D](https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/)
    - [Collection for Stereo3D](https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/RecentCollectionForStereo3D/); [YOLOStereo3D](https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/YOLOStereo3D/)

## Related Codes

- [MMDetection](https://github.com/open-mmlab/mmdetection)
- [M3D-RPN](https://github.com/garrickbrazil/M3D-RPN)
- [Retinanet](https://github.com/yhenon/pytorch-retinanet)
- [DORN](https://github.com/dontLoveBugs/SupervisedDepthPrediction)
- [det3](https://github.com/pyun-ram/FL3D)
- [RTM3D](https://github.com/Banconxuan/RTM3D)
165 changes: 165 additions & 0 deletions config/KM3D_example
@@ -0,0 +1,165 @@
from easydict import EasyDict as edict
import os
import numpy as np

cfg = edict()
cfg.obj_types = ['Car', 'Pedestrian', 'Cyclist']
cfg.anchor_prior = False
## trainer
trainer = edict(
    gpu = 0,
    max_epochs = 200,
    disp_iter = 50,
    save_iter = 20,
    test_iter = 20,
    cudnn = True,
    training_func = "train_rtm3d",
    test_func = "test_mono_detection",
    evaluate_func = "evaluate_kitti_obj",
)

cfg.trainer = trainer

## path
path = edict()
path.data_path = "/home/kitti_obj/training"
path.test_path = "/home/kitti_obj/testing"
path.visualDet3D_path = "/home/stereo_kitti/visualDet3D"
path.project_path = "/home/stereo_kitti/workdirs"

if not os.path.isdir(path.project_path):
    os.mkdir(path.project_path)
path.project_path = os.path.join(path.project_path, 'RTM3D')
if not os.path.isdir(path.project_path):
    os.mkdir(path.project_path)

path.log_path = os.path.join(path.project_path, "log")
if not os.path.isdir(path.log_path):
    os.mkdir(path.log_path)

path.checkpoint_path = os.path.join(path.project_path, "checkpoint")
if not os.path.isdir(path.checkpoint_path):
    os.mkdir(path.checkpoint_path)

path.preprocessed_path = os.path.join(path.project_path, "output")
if not os.path.isdir(path.preprocessed_path):
    os.mkdir(path.preprocessed_path)

path.train_imdb_path = os.path.join(path.preprocessed_path, "training")
if not os.path.isdir(path.train_imdb_path):
    os.mkdir(path.train_imdb_path)

path.val_imdb_path = os.path.join(path.preprocessed_path, "validation")
if not os.path.isdir(path.val_imdb_path):
    os.mkdir(path.val_imdb_path)

cfg.path = path

## optimizer
optimizer = edict(
    type_name = 'adam',
    keywords = edict(
        lr = 1.25e-4,
        weight_decay = 0,
    ),
    clipped_gradient_norm = 35.0
)
cfg.optimizer = optimizer
## scheduler
scheduler = edict(
    type_name = 'MultiStepLR',
    keywords = edict(
        milestones = [90, 120]
    )
)
cfg.scheduler = scheduler

## data
data = edict(
    batch_size = 32,
    num_workers = 4,
    rgb_shape = (384, 1280, 3),
    train_dataset = "KittiRTM3DDataset",
    val_dataset = "KittiMonoDataset",
    test_dataset = "KittiMonoTestDataset",
    train_split_file = os.path.join(cfg.path.visualDet3D_path, 'data', 'kitti', 'chen_split', 'train.txt'),
    val_split_file = os.path.join(cfg.path.visualDet3D_path, 'data', 'kitti', 'chen_split', 'val.txt'),
    max_occlusion = 4,
    min_z = 3,
)

data.augmentation = edict(
    rgb_mean = np.array([0.485, 0.456, 0.406]),
    rgb_std = np.array([0.229, 0.224, 0.225]),
    cropSize = (data.rgb_shape[0], data.rgb_shape[1]),
)
data.train_augmentation = [
    edict(type_name='ConvertToFloat'),
    edict(type_name='RandomWarpAffine', keywords=edict(output_w=data.augmentation.cropSize[1], output_h=data.augmentation.cropSize[0])),
    #edict(type_name='Resize', keywords=edict(size=data.augmentation.cropSize)),
    edict(type_name="Shuffle", keywords=edict(
        aug_list=[
            edict(type_name="RandomBrightness", keywords=edict(distort_prob=1.0)),
            edict(type_name="RandomContrast", keywords=edict(distort_prob=1.0, lower=0.6, upper=1.4)),
            edict(type_name="Compose", keywords=edict(
                aug_list=[
                    edict(type_name="ConvertColor", keywords=edict(transform='HSV')),
                    edict(type_name="RandomSaturation", keywords=edict(distort_prob=1.0, lower=0.6, upper=1.4)),
                    edict(type_name="ConvertColor", keywords=edict(current='HSV', transform='RGB')),
                ]
            ))
        ]
    )),
    edict(type_name='RandomEigenvalueNoise', keywords=edict(alphastd=0.1)),
    edict(type_name='RandomMirror', keywords=edict(mirror_prob=0.5)),
    edict(type_name="FilterObject"),
    edict(type_name='Normalize', keywords=edict(mean=data.augmentation.rgb_mean, stds=data.augmentation.rgb_std))
]
data.test_augmentation = [
    edict(type_name='ConvertToFloat'),
    #edict(type_name='CropTop', keywords=edict(crop_top_index=data.augmentation.crop_top)),
    edict(type_name='Resize', keywords=edict(size=data.augmentation.cropSize)),
    edict(type_name='Normalize', keywords=edict(mean=data.augmentation.rgb_mean, stds=data.augmentation.rgb_std))
]
cfg.data = data

## networks
detector = edict()
detector.obj_types = cfg.obj_types
detector.name = 'KM3D'
detector.backbone = edict(
    depth=18,
    pretrained=True,
    frozen_stages=-1,
    num_stages=4,
    out_indices=(3, ),
    norm_eval=False,
    dilations=(1, 1, 1, 1),
)
head_loss = edict(
    gamma=2.0,
    rampup_length = 100,
    output_w = data.rgb_shape[1] // 4
)
head_test = edict(
    score_thr=0.3,
)

head_layer = edict(
    input_features=256,
    head_features=64,
    # Output channels per prediction head (RTM3D/CenterNet-style convention):
    # 'hm' class heatmaps, 'wh' 2D box size, 'hps' offsets to the 9 projected
    # 3D keypoints (9 x 2), 'rot' multi-bin orientation, 'dim' 3D dimensions,
    # 'prob' 3D-estimation confidence, 'reg' center sub-pixel offset,
    # 'hm_hp' keypoint heatmaps, 'hp_offset' keypoint sub-pixel offset.
    head_dict={'hm': len(cfg.obj_types), 'wh': 2, 'hps': 18,
               'rot': 8, 'dim': 3, 'prob': 1,
               'reg': 2, 'hm_hp': 9, 'hp_offset': 2}
)
detector.head = edict(
    num_classes = len(cfg.obj_types),
    num_joints = 9,
    max_objects = 32,
    layer_cfg = head_layer,
    loss_cfg = head_loss,
    test_cfg = head_test
)
detector.loss = head_loss
cfg.detector = detector
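For reference, a config module like this one is ultimately just a Python file exposing a `cfg` EasyDict. Below is a minimal sketch of how it could be loaded; this is illustrative only, assuming the file is saved as `config/KM3D_example.py` and the `config` directory is importable, and the actual entry points in visualDet3D may differ:

```python
# Illustrative loader sketch -- not the repo's actual training entry point.
import importlib

def load_config(module_name):
    # e.g. module_name = "config.KM3D_example"
    return importlib.import_module(module_name).cfg

cfg = load_config("config.KM3D_example")
print(cfg.detector.name)          # 'KM3D'
print(cfg.trainer.training_func)  # 'train_rtm3d'
```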
