# MODEL ZOO

## Common settings and notes

- The experiments are run with PyTorch 1.1, CUDA 10.0, and cuDNN 7.5.
- Training is conducted on 4 V100 GPUs on a DGX server.
- Testing times are measured on a TITAN RTX GPU with batch size 1.

## Waymo 3D Detection

We provide training / validation configurations, pretrained models, and prediction files for all models in the paper. To access these pretrained models, please send us an email with your name, institute, a screenshot of the Waymo dataset registration confirmation mail, and your intended usage. Please send a second email if we don't get back to you within two days. Note that the Waymo Open Dataset is under a strict non-commercial license, so we are not allowed to share a model with you if it will be used for any profit-oriented activities.

### One-stage VoxelNet

| Model    | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
|----------|--------|--------|--------|------|-----|
| VoxelNet | 66.2   | 62.6   | 67.6   | 65.5 | 13  |

In the paper, our models only detect Vehicle and Pedestrian. Here, we provide a three-class config that also enables cyclist detection (and performs similarly). We encourage the community to also report three-class performance in the future.
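For orientation, a hedged sketch of what a three-class task definition might look like in a CenterPoint-style config. The key names and class labels below are assumptions based on the Waymo label set, not verified against this repo's actual configs:

```python
# Hypothetical sketch of a three-class task definition in a CenterPoint-style
# config; the exact schema and key names in this repo may differ.
tasks = [
    dict(num_class=3, class_names=["VEHICLE", "PEDESTRIAN", "CYCLIST"]),
]
```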

### Ablations for training schedule

CenterPoint is fast to train and converges in as few as 3~6 epochs. We tried a few training schedules for CenterPoint-Voxel and list their performance below.

| Schedule  | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | Training Time |
|-----------|--------|--------|--------|------|---------------|
| 36 epochs | 66.2   | 62.6   | 67.6   | 65.5 | 84 hr         |
| 12 epochs | 65.6   | 61.3   | 67.1   | 64.7 | 28 hr         |
| 6 epochs  | 65.5   | 59.5   | 66.4   | 63.4 | 14 hr         |
| 3 epochs  | 61.5   | 56.2   | 64.5   | 60.7 | 7 hr          |

### Two-stage VoxelNet

By default, we finetune a pretrained one-stage model for 6 epochs. To save GPU memory, we also freeze the backbone weights (see the sketch after the table below).

| Model    | Split | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
|----------|-------|--------|--------|--------|------|-----|
| VoxelNet | Val   | 67.9   | 65.6   | 68.6   | 67.4 | 13  |
| VoxelNet | Test  | 71.9   | 67.0   | 68.2   | 69.0 | 13  |
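As a rough illustration of this setup, a minimal PyTorch sketch that freezes the backbone and optimizes only the second-stage parameters. Module names, the checkpoint path, and hyperparameters are illustrative, not the repo's actual ones:

```python
import torch
from torch import nn

class TwoStageDetector(nn.Module):
    """Toy stand-in for a two-stage detector; attribute names are
    illustrative, not this repo's actual module names."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.ReLU())
        self.roi_head = nn.Linear(128, 7)  # second-stage refinement head

model = TwoStageDetector()

# Hypothetical path: load the pretrained one-stage weights before finetuning.
# state_dict = torch.load("work_dirs/one_stage/latest.pth", map_location="cpu")
# model.load_state_dict(state_dict, strict=False)

# Freeze the backbone to save GPU memory during second-stage finetuning.
for p in model.backbone.parameters():
    p.requires_grad = False

# Optimize only the parameters that remain trainable (the second stage).
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```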

### Two-frame model

To provide richer input information and enable a more reasonable velocity estimate, we transform and merge the LiDAR points of the previous frame into the current frame (a sketch follows the table below). This two-frame model significantly boosts detection performance.

| Model     | Split | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
|-----------|-------|--------|--------|--------|------|-----|
| One-stage | Val   | 67.3   | 67.5   | 69.9   | 68.2 | 11  |
| Two-stage | Val   | 69.7   | 70.3   | 70.9   | 70.3 | 11  |
| Two-stage | Test  | 73.0   | 71.5   | 71.3   | 71.9 | 11  |
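A minimal NumPy sketch of the two-frame input construction described above, assuming each frame comes with a 4x4 LiDAR-to-global ego pose and that merged points are tagged with a time-lag feature channel. Function and argument names are illustrative:

```python
import numpy as np

def merge_two_frames(points_cur, points_prev, pose_cur, pose_prev, dt=0.1):
    """Transform previous-frame points into the current frame and concatenate.

    points_*: (N, 3+) arrays whose first three columns are xyz.
    pose_*:   (4, 4) LiDAR-to-global transforms for each frame.
    dt:       time gap between frames, stored as an extra feature channel.
    For simplicity, only xyz plus the time-lag channel are kept.
    """
    # previous LiDAR frame -> global frame -> current LiDAR frame
    prev_to_cur = np.linalg.inv(pose_cur) @ pose_prev
    xyz_h = np.concatenate(
        [points_prev[:, :3], np.ones((len(points_prev), 1))], axis=1
    )
    xyz_in_cur = (xyz_h @ prev_to_cur.T)[:, :3]

    # Tag each point with its time lag so the network can tell frames apart.
    cur = np.concatenate(
        [points_cur[:, :3], np.zeros((len(points_cur), 1))], axis=1
    )
    prev = np.concatenate(
        [xyz_in_cur, np.full((len(xyz_in_cur), 1), dt)], axis=1
    )
    return np.concatenate([cur, prev], axis=0)
```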

### PointPillars

| Model                        | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
|------------------------------|--------|--------|--------|------|-----|
| centerpoint_pillar           | 65.5   | 55.1   | 60.2   | 60.3 | 19  |
| centerpoint_pillar_two_stage | 66.7   | 55.9   | 61.7   | 61.4 | 16  |

For PointPillars, we notice a 1.5 MAPH drop when converting the two-class model to a three-class model. You can refer to the ONE_STAGE and TWO_STAGE configs to reproduce the two-class result.

## Waymo 3D Tracking

For 3D tracking, we apply our center-based tracking on top of our two-frame model's detection results (a simplified sketch of the association step follows the table).

| Model                       | Split | Veh_L2 | Ped_L2 | Cyc_L2 | MOTA | FPS |
|-----------------------------|-------|--------|--------|--------|------|-----|
| centerpoint_voxel_two_sweep | Val   | 55.0   | 55.0   | 57.4   | 55.8 | 11  |
| centerpoint_voxel_two_sweep | Test  | 59.4   | 56.6   | 60.0   | 58.7 | 11  |
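As a rough illustration, a simplified sketch of the center-based association step, assuming each detection carries a BEV center and a predicted velocity: current centers are projected back by one frame and greedily matched to the previous frame's track centers. The threshold, data layout, and bookkeeping here are illustrative; the full tracker involves more (e.g. track lifecycle management):

```python
import numpy as np

def associate(tracks, detections, dt=0.1, max_dist=1.0):
    """Greedily match detections to existing tracks by center distance.

    tracks:     list of dicts with keys 'id' and 'xy' (previous-frame centers).
    detections: list of dicts with keys 'xy' and 'vel' (current frame).
    Returns a list of (track_id_or_None, detection_index) pairs.
    """
    matches, used = [], set()
    for di, det in enumerate(detections):
        # Project the detection back to where its object was one frame ago.
        prev_xy = np.asarray(det["xy"]) - dt * np.asarray(det["vel"])
        best, best_d = None, max_dist
        for track in tracks:
            if track["id"] in used:
                continue
            d = np.linalg.norm(prev_xy - np.asarray(track["xy"]))
            if d < best_d:
                best, best_d = track["id"], d
        if best is not None:
            used.add(best)
        matches.append((best, di))  # None means: start a new track
    return matches
```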