dt_annos and gt_annos params in the calculate_iou_partly function are swapped. #25

sauravshanu opened this issue Jul 21, 2021 · 6 comments


@sauravshanu

Hi Owen,

Great work! Thanks for uploading the code and providing very clear instructions to run it.

I have two issues that I wanted to ask -

First, I noticed that at the linked line, the dt_annos and gt_annos arguments passed to the calculate_iou_partly function are swapped. I am not sure whether it matters, since the IoU operation itself is commutative.
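
For reference, here is a minimal, self-contained sketch (a toy axis-aligned 2D IoU, not the repo's actual calculate_iou_partly) of why I suspect the swap is harmless for the values themselves: swapping the two box lists only transposes the pairwise overlap matrix, so it would only matter if downstream code assumes a specific row/column order.

# Toy illustration: swapping the argument order only transposes the
# pairwise IoU matrix; each individual IoU value is unchanged.
import numpy as np

def pairwise_iou(boxes_a, boxes_b):
    # boxes are [x1, y1, x2, y2]; result has shape (len(boxes_a), len(boxes_b))
    ious = np.zeros((len(boxes_a), len(boxes_b)))
    for i, a in enumerate(boxes_a):
        for j, b in enumerate(boxes_b):
            ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
            ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
            inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
            area_a = (a[2] - a[0]) * (a[3] - a[1])
            area_b = (b[2] - b[0]) * (b[3] - b[1])
            ious[i, j] = inter / (area_a + area_b - inter)
    return ious

dt = np.array([[0, 0, 2, 2], [1, 1, 3, 3]], dtype=float)
gt = np.array([[0, 0, 2, 2]], dtype=float)
assert np.allclose(pairwise_iou(dt, gt), pairwise_iou(gt, dt).T)  # transpose only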

Second, I ran training for Mono3D with the example config provided in the repo. I am trying to reproduce the reported results, but my numbers on the validation set are consistently well below the expected ones.

Here are my results.

Car AP(Average Precision)@0.70, 0.70, 0.70:          
bbox AP:83.29, 70.23, 54.05                          
bev  AP:10.64, 8.42, 6.52  
3d   AP:6.76, 5.10, 3.73   
aos  AP:82.75, 69.28, 53.32                          
Car AP(Average Precision)@0.70, 0.50, 0.50:          
bbox AP:83.29, 70.23, 54.05                          
bev  AP:43.14, 32.86, 25.95                          
3d   AP:37.96, 27.91, 22.64                          
aos  AP:82.75, 69.28, 53.32                          
                                                     
/**** finish testing after training epoch 19 ******/ 

Car AP(Average Precision)@0.70, 0.70, 0.70:          
bbox AP:82.41, 67.77, 51.65                          
bev  AP:9.56, 6.98, 5.34                             
3d   AP:5.48, 3.93, 3.19                             
aos  AP:81.98, 66.96, 51.00
Car AP(Average Precision)@0.70, 0.50, 0.50:          
bbox AP:82.41, 67.77, 51.65
bev  AP:42.07, 32.54, 25.82                          
3d   AP:36.23, 27.81, 21.56
aos  AP:81.98, 66.96, 51.00                          
                             
/**** finish testing after training epoch 29 ******/ 

Here are my training steps -

  1. Clone the repo and run make.sh.
  2. Download image_2, image_3, calib, and label_2 from the official KITTI website and unzip them (a quick layout sanity check is sketched after this list).
  3. Follow the exact steps from the Mono3D readme. When copying the config file, I only changed the paths to point to my directories and did not alter any of the existing hyperparameters.
  4. I used a single 2080 Ti GPU for training; 30 epochs take about 6-7 hours.
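
In case it is relevant, here is the kind of quick sanity check I can run on the unzipped data (the root path below is just a placeholder for my local layout):

# Check that the unzipped KITTI object data is complete and consistent.
# The standard KITTI 3D object training set has 7481 frames, with matching
# files in image_2, image_3, calib and label_2.
from pathlib import Path

root = Path("/path/to/kitti/object/training")  # placeholder path
counts = {d: len(list((root / d).glob("*"))) for d in ["image_2", "image_3", "calib", "label_2"]}
print(counts)
assert len(set(counts.values())) == 1, "folders have mismatched file counts"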

Can you please tell me what I am doing wrong here?

Thanks!

@Owen-Liuyuxuan
Copy link
Owner

Owen-Liuyuxuan commented Jul 22, 2021

I reran a freshly cloned repo and the results are fine.

It takes about 6 hours on my 1080 Ti with SSD.

Could you provide the TensorBoard log, the loss curves, the config, and any other details?

tensorboard --logdir workdirs/Mono3D/log
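
If it is easier, a small sketch like the following can also dump the scalar curves straight from the event file so that we can compare the loss values numerically (the log path is taken from the command above and may differ in your setup; the tag names depend on what was logged):

# Dump scalar curves from a TensorBoard event file / log directory.
from tensorboard.backend.event_processing import event_accumulator

ea = event_accumulator.EventAccumulator("workdirs/Mono3D/log")  # placeholder path
ea.Reload()
for tag in ea.Tags()["scalars"]:
    events = ea.Scalars(tag)
    print(tag, [(e.step, round(e.value, 4)) for e in events[:5]], "...")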

@sauravshanu
Author

Thanks for the prompt reply. Here is the loss screenshot and the tensorboard log file -

Loss curves: [screenshot attached]

Tensorboard log: Mono3D_tensorboad.log

@Owen-Liuyuxuan
Owner

Owen-Liuyuxuan commented Jul 22, 2021

What I can directly identify is that both of my losses go down much faster than yours.
[Screenshot attached: Screenshot from 2021-07-22 14-47-37]

And according to the tensorboard log, the recorded model structure is the same, and you have only changed the paths and slightly increased the epoch number (which is totally fine).

events.out.tfevents.1626880244.yxliu-ramlab.10362.0.log

I cannot diagnose this problem from this alone.

Here is something you can try:

  • Evaluate the pretrained model from the release on the chen split. Even though there have been slight changes in the code since the release, the results should still be rather high (the release model is trained on all of the training data, i.e. it has seen the chen split). This lets you verify whether the evaluation/forward propagation is correct (a quick checkpoint check is sketched after this list).
  • Modify the config file to train and evaluate on the debug split, which is a small subset of the data that we should be able to overfit.
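
For the first point, here is a minimal way to at least confirm that the release checkpoint loads cleanly before running the full evaluation (the file name and the state-dict key below are assumptions, adjust them to whatever the release archive actually contains):

# Inspect the release checkpoint: list a few parameter names and shapes.
import torch

ckpt = torch.load("Mono3D_release.pth", map_location="cpu")  # placeholder file name
state = ckpt["model_state_dict"] if isinstance(ckpt, dict) and "model_state_dict" in ckpt else ckpt  # key name is an assumption
for name, value in list(state.items())[:10]:
    if hasattr(value, "shape"):
        print(name, tuple(value.shape))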

By the way, calculate_iou_partly is basically borrowed from other repos and I did not modify its details. It may be an inherited, harmless "bug".

@sauravshanu
Author

OK. Here are some more evaluations.

Pretrained model provided in the release evaluated on the Validation set.

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:99.96, 99.85, 74.87
bev  AP:76.54, 58.14, 44.80
3d   AP:73.60, 55.31, 42.30
aos  AP:98.95, 98.83, 74.11
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:99.96, 99.85, 74.87
bev  AP:93.69, 81.27, 61.95
3d   AP:93.50, 80.96, 61.66
aos  AP:98.95, 98.83, 74.11

Model trained and evaluated on the debug split.

Car AP(Average Precision)@0.70, 0.70, 0.70:         
bbox AP:81.93, 86.51, 71.75                         
bev  AP:12.12, 13.56, 9.99                          
3d   AP:12.03, 12.28, 9.70                          
aos  AP:81.80, 86.25, 71.53                         
Car AP(Average Precision)@0.70, 0.50, 0.50:         
bbox AP:81.93, 86.51, 71.75                         
bev  AP:56.19, 45.85, 35.08                         
3d   AP:51.65, 41.93, 32.83                         
aos  AP:81.80, 86.25, 71.53                         
                                                    
/**** finish testing after training epoch 19 ******/

Car AP(Average Precision)@0.70, 0.70, 0.70:         
bbox AP:76.49, 81.89, 65.09                         
bev  AP:25.68, 21.57, 16.83                         
3d   AP:20.04, 16.97, 13.27                         
aos  AP:75.48, 81.02, 64.34                         
Car AP(Average Precision)@0.70, 0.50, 0.50:         
bbox AP:76.49, 81.89, 65.09                         
bev  AP:87.21, 76.10, 61.78                         
3d   AP:85.78, 75.06, 58.74                         
aos  AP:75.48, 81.02, 64.34                         
                                                    
/**** finish testing after training epoch 29 ******/

@Owen-Liuyuxuan
Owner

When using the precomputed results downloaded from the release, my result is:

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:99.96, 99.85, 74.87
bev  AP:96.09, 87.08, 65.24
3d   AP:95.34, 86.26, 64.62
aos  AP:99.14, 98.52, 73.89
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:99.96, 99.85, 74.87
bev  AP:99.91, 96.82, 71.97
3d   AP:99.90, 96.73, 71.90

When using the original precomputed result on chen's split, the result is exactly the same as yours.

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:99.96, 99.85, 74.87
bev  AP:76.54, 58.14, 44.80
3d   AP:73.60, 55.31, 42.30
aos  AP:98.95, 98.83, 74.11
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:99.96, 99.85, 74.87
bev  AP:93.69, 81.27, 61.95
3d   AP:93.50, 80.96, 61.66
aos  AP:98.95, 98.83, 74.11

The debug split result is rather bad; on my side I can reach 50-60 mAP. So it seems clear that it is the training process that is going wrong.

@sauravshanu
Author

OK. I'll try to fix it. This helps a lot. Thank you :)
