
DataLoader worker (pid 12847) is killed by signal: Killed #4

Closed
ljjcoder opened this issue Oct 25, 2021 · 5 comments

Comments

@ljjcoder

When I run python flexmatch.py --c ./config/flexmatch/flexmatch_cifar100_400_1.yaml, I always get the error shown below.
Can you help me fix it?

[Image: 1635170601]

@ljjcoder
Author

[Image: 1635170601]

@qianlanwyd
Member

You should use a machine with plenty of RAM. May I ask which CPU you are using?

@18445864529
Member

This is probably an out-of-memory (OOM) kill. After you hit this error, you can run dmesg -T to verify whether an OOM occurred. If it did, consider reducing the batch size and num_workers.
The following links may provide some helpful insights into this error:
pytorch/pytorch#5040
pytorch/pytorch#1355
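
As a rough illustration of the dmesg check suggested above, here is a minimal Python sketch that scans kernel log text for OOM-killer entries and reports whether a given pid was among them. It assumes the kernel logs such kills with a "Killed process <pid> (<name>)" phrase, which is what Linux kernels commonly print, though the exact wording varies between versions. The helper names find_oom_kills and worker_was_oom_killed are made up for this example.

```python
import re

# Assumption: the kernel log line contains "Killed process <pid> (<name>)".
# The surrounding wording differs between kernel versions, so only this
# part of the message is matched.
_OOM_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(dmesg_text):
    """Return (pid, process_name) pairs for OOM-killer entries in dmesg output."""
    kills = []
    for line in dmesg_text.splitlines():
        match = _OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


def worker_was_oom_killed(dmesg_text, pid):
    """Return True if the given pid appears among the OOM-killed processes."""
    return any(killed_pid == pid for killed_pid, _ in find_oom_kills(dmesg_text))
```

To use it, pass in the text printed by dmesg -T (reading the kernel log may require root on some systems) together with the pid from the DataLoader error message, for example 12847 in this issue.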

@avihu111

avihu111 commented Nov 14, 2021

I also hit this OOM issue with 32 GB of RAM when training on CIFAR10.
How can I reduce the memory consumption?
Edit: when using one GPU, the RAM usage is only 4 GB (8x less than when using 2 GPUs).

@18445864529
Member

> I also encounter this OOM issue with 32GB RAM for running training on CIFAR10. How can I optimize the memory consumption? Edit - when using one GPU the RAM usage is only 4GB. (x8 less than with using 2 GPUs)

Hi, do you mean 32 GB of host (CPU) RAM? I think this error is more likely a GPU OOM issue (use dmesg -T to check whether a CUDA out-of-memory error was logged). Could you tell us your GPU memory, the batch size, and the num_workers you were using? To reduce CPU RAM usage, you can try reducing num_workers.
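
To see why lowering num_workers can reduce host RAM usage, here is a back-of-the-envelope sketch. It assumes one training process per GPU, each starting its own num_workers DataLoader worker processes, and that every worker holds its own share of memory. That may or may not match how this repository launches training. The function name and all memory figures are hypothetical, and the sketch is not meant to reproduce the 4 GB and 32 GB numbers reported above.

```python
def estimate_host_ram_gb(num_gpu_processes, num_workers, main_gb, worker_gb):
    """Rough upper bound on host RAM under the stated assumptions.

    num_gpu_processes: training processes (assumed one per GPU)
    num_workers:       DataLoader workers started by each training process
    main_gb:           hypothetical RAM used by one training process
    worker_gb:         hypothetical RAM used by one DataLoader worker
    """
    if num_gpu_processes < 1 or num_workers < 0:
        raise ValueError("need at least one process and a non-negative worker count")
    per_process_gb = main_gb + num_workers * worker_gb
    return num_gpu_processes * per_process_gb


# Hypothetical figures: halving num_workers removes half of the worker share.
print(estimate_host_ram_gb(2, 8, main_gb=2.0, worker_gb=1.0))
print(estimate_host_ram_gb(2, 4, main_gb=2.0, worker_gb=1.0))
```

Under these assumptions the worker share grows with both the number of GPUs and num_workers, so num_workers is the cheapest setting to lower first.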
