Why is time.sleep(2) needed in EpochBasedRunner? When will the deadlock happen? #1640
Comments
Hi, this is a workaround for a possible deadlock in the dataloader. More details can be found at pytorch/pytorch#1355 (comment). We will look for a more elegant way to resolve the problem.
Hi @zhouzaida,
Hi @JihwanEom, in most cases the deadlock will not happen, so you can remove this line from your local mmcv, which will speed up your training.
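To put a rough number on that speed-up, here is a small back-of-the-envelope sketch. It assumes one fixed pause per epoch and nothing else; `sleep_overhead` and the example figures (100 epochs, 8 s of real work each) are hypothetical and not taken from mmcv.

```python
from typing import Tuple


def sleep_overhead(num_epochs: int, epoch_seconds: float,
                   sleep_seconds: float = 2.0) -> Tuple[float, float]:
    """Return (total seconds slept, fraction of wall time spent sleeping).

    Assumes exactly one fixed pause per epoch (an illustrative assumption).
    """
    total_sleep = num_epochs * sleep_seconds
    total_time = num_epochs * (epoch_seconds + sleep_seconds)
    return total_sleep, total_sleep / total_time


# Hypothetical small dataset: 100 epochs, 8 s of actual work per epoch.
slept, fraction = sleep_overhead(num_epochs=100, epoch_seconds=8.0)
print(f"{slept:.0f} s slept, {fraction:.0%} of wall time")
```

The fraction shrinks as epochs get longer, so the pause matters mostly for short epochs.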
Okay, I got it. But could you explain in which situations the deadlock is expected to occur?
Hi @zhouzaida. As @JihwanEom said, I wonder in which cases the deadlock can occur when removing
Why is time.sleep(2) needed? See https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/epoch_based_runner.py#L46, where the comment says: Prevent possible deadlock during epoch transition.
When can this deadlock happen? More importantly, time.sleep(2) is neither elegant nor flexible. It wastes time on small datasets. I think a redesign is necessary.
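One possible direction for such a redesign, sketched under stated assumptions: make the pause a parameter so that it can be shortened or disabled. `TinyEpochRunner` and `epoch_delay` are hypothetical names for illustration only. This is not mmcv's `EpochBasedRunner` and not an existing mmcv option.

```python
import time


class TinyEpochRunner:
    """Hypothetical stand-in for an epoch-based runner (not mmcv's class)."""

    def __init__(self, max_epochs, epoch_delay=2.0):
        self.max_epochs = max_epochs
        # Seconds to pause before each epoch; 0 disables the pause.
        self.epoch_delay = epoch_delay
        self.seen = []  # records (epoch, batch) pairs for inspection

    def train_epoch(self, data_loader, epoch):
        if self.epoch_delay > 0:
            # Same workaround as the fixed sleep, but the caller can opt out.
            time.sleep(self.epoch_delay)
        for batch in data_loader:
            self.seen.append((epoch, batch))

    def run(self, data_loader):
        for epoch in range(self.max_epochs):
            self.train_epoch(data_loader, epoch)


# A plain list stands in for a real dataloader here.
runner = TinyEpochRunner(max_epochs=3, epoch_delay=0.0)
runner.run([10, 20])
print(len(runner.seen))
```

Keeping the default at 2.0 would preserve the current behaviour, while users with small datasets could pass 0 at their own risk.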