Testing Using Pretrained Model Results in Different Performance Across Multiple Runs #12

YunzeMan opened this issue Aug 12, 2023 · 6 comments

@YunzeMan

Hi, thanks for sharing your great work. As the title indicates, I not only get lower performance than reported (mAP@0.25: 0.7069, mAP@0.5: 0.5699) but also slightly different results across multiple runs.

Here are several runs using the following command (yes, I'm using the provided pretrained model on ScanNet):

```
python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py ./tr3d_scannet.pth --eval mAP
```

Run 1: (mAP@0.25: 0.7069, mAP@0.5: 0.5699)
Run 2: (mAP@0.25: 0.7068, mAP@0.5: 0.5716)
Run 3: (mAP@0.25: 0.7069, mAP@0.5: 0.5710)

I'm using PyTorch 1.12, CUDA 11.3, and cuDNN 8.

I cannot figure out where the stochasticity may come from, especially during evaluation (testing). Could you shed some light on the possible reasons for this behavior?

Here is one of my outputs during testing:
test_log.txt

@filaPro (Contributor) commented Aug 12, 2023

Hi @YunzeMan,

We didn't run into nondeterminism issues on ScanNet. Can you please share the .log file of this run? Also, maybe check the suggestions in the torch randomness guide?
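
For reference, here's a minimal sketch of the knobs that guide covers (generic PyTorch, not code from this repo; the helper name is just for illustration):

```python
# Generic reproducibility checklist from the PyTorch randomness guide;
# not TR3D code, just the knobs worth verifying.
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:  # hypothetical helper name
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU and all CUDA RNGs
    # Required by some deterministic CUDA (cuBLAS) kernels.
    os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
    # Raise an error when an op has no deterministic implementation.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```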

@YunzeMan

This comment was marked as outdated.

@YunzeMan (Author)

Regarding randomness, I strictly followed your steps. I checked the torch randomness guide but didn't find anything that helped, and passing the --deterministic flag didn't seem to make a difference.
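
For context, my understanding is that the flag roughly reduces to mmdet's set_random_seed, something like this (paraphrased, not the exact source):

```python
# Roughly what mmdet's set_random_seed(seed, deterministic=True) does,
# paraphrased from memory rather than copied from the source:
import random

import numpy as np
import torch


def set_random_seed(seed: int, deterministic: bool = False) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    if deterministic:
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
```

If that's right, the flag only pins Python/NumPy/PyTorch seeding and the cuDNN backend, so randomness coming from anywhere else wouldn't be affected.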

I also trained the model following your steps. Here is the log file.
20230812_013000.log

Both the mAP@0.25 and mAP@0.5 are lower than your reported values. Have you perhaps altered the codebase or parameters slightly without noticing?

@filaPro (Contributor) commented Aug 13, 2023

No, this code is definitely able to reproduce our metrics...

Maybe you can try our implementation in the mmdetection3d codebase?

@filaPro (Contributor) commented Aug 13, 2023

Btw, I think I understand this little randomness at test time. Here, in the SparseTensor construction, the default quantization_mode is RANDOM_SUBSAMPLE, following MinkowskiEngine. Can you try UNWEIGHTED_AVERAGE here instead?
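
Something like this sketch against MinkowskiEngine's public API (the tensors below are placeholders, not the actual inputs in this repo):

```python
# The two quantization modes of MinkowskiEngine's SparseTensor;
# `feats` / `coords` are placeholder tensors, not real TR3D inputs.
import torch
import MinkowskiEngine as ME

feats = torch.rand(100, 3)                       # e.g. per-point RGB features
coords = torch.randint(0, 10, (100, 3)).int()    # quantized xyz coordinates
coords = ME.utils.batched_coordinates([coords])  # prepend the batch index

# Default mode: one random point is kept per voxel, so repeated runs
# can feed slightly different inputs to the network.
x_random = ME.SparseTensor(
    features=feats,
    coordinates=coords,
    quantization_mode=ME.SparseTensorQuantizationMode.RANDOM_SUBSAMPLE,
)

# Alternative: average the features of all points falling into the
# same voxel, removing the per-voxel random point selection.
x_avg = ME.SparseTensor(
    features=feats,
    coordinates=coords,
    quantization_mode=ME.SparseTensorQuantizationMode.UNWEIGHTED_AVERAGE,
)
```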

@YunzeMan (Author)

Thanks for pointing that out. However, after changing the quantization_mode of x to UNWEIGHTED_AVERAGE, the slight randomness still persists. Here are the results of three separate runs:

Run 1: (mAP@0.25: 0.7068, mAP@0.5: 0.5702)
Run 2: (mAP@0.25: 0.7068, mAP@0.5: 0.5697)
Run 3: (mAP@0.25: 0.7069, mAP@0.5: 0.5720)

What's more frustrating than the slight randomness is the marginally lower performance. But since the gap isn't large, I can probably work with the current version.
