Get stuck #2167

Open
dream-mjq opened this issue May 8, 2024 · 9 comments
@dream-mjq

Why is it that, when I use the v1 algorithm for data processing, I often get stuck at 3D preprocessing stage 1? I always have the same problem. I do hope you can give me some suggestions.

@constantinulrich
Member

Hey, could you send some error messages?
Most likely your RAM is not sufficient.
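
For reference, a quick way to check is to watch system RAM while the preprocessing runs. A minimal sketch (assuming `psutil` is installed; this is not part of nnU-Net itself):

```python
import time

import psutil  # assumed installed: pip install psutil

# Log system-wide RAM usage once per minute while preprocessing runs
# in another terminal, to see whether usage climbs toward the limit
# right before the job stalls.
while True:
    mem = psutil.virtual_memory()
    print(f"used: {mem.used / 2**30:.1f} GiB / {mem.total / 2**30:.1f} GiB "
          f"({mem.percent:.0f}%)")
    time.sleep(60)
```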

@dream-mjq
Author

There is no error message; it just gets stuck on one case during data processing, and after waiting a few hours it still does not continue.
My RAM looks sufficient: the total is 64 GB, of which only 30 GB is used.

@felixkrones

Not sure if this is related at all, but I had a very similar problem where my code just got stuck during training in nnUNetTrainer.py at the line `output = self.network(data)`.
The issue only occurred when I specified a plan (e.g. nnUNetPlannerResEncL), since I got a warning that I was using an old one. It worked fine if I didn't specify any plans and just used the code examples as given in the instructions.

@constantinulrich
Member

@dream-mjq would it be possible for you to switch to V2?

@felixkrones you are already on V2, right? The warning shouldn't stop your training. I would need more information to help you; it would be best to open a new issue.

@dream-mjq
Author

[two screenshots attached]

I have the same problem when I run validation and testing. The inference finished, but the output folder contained only 20 files (out of 300 in total), and the program had exited.

@constantinulrich
Member

Could you share the data? How many classes do you have?
With many classes, or when the images are very large, resampling the softmax probabilities back to the original data size needs a lot of RAM. My guess is still that you might run out of memory (OOM) there.
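
For a rough sense of the numbers: in float32, the softmax volume at the original resolution alone takes `num_classes × X × Y × Z × 4` bytes. A back-of-the-envelope sketch (the image shape below is a hypothetical example, not taken from this dataset):

```python
# Rough estimate of the memory needed just to hold the softmax
# probabilities at the original image size during resampling.
num_classes = 31
image_shape = (512, 512, 512)  # hypothetical original volume size
bytes_per_value = 4            # float32

voxels = 1
for dim in image_shape:
    voxels *= dim

softmax_bytes = num_classes * voxels * bytes_per_value
print(f"softmax array alone: {softmax_bytes / 2**30:.1f} GiB")
# ~15.5 GiB for a single case, before any temporary copies made during
# resampling, so a few concurrent export workers can exhaust 64 GB.
```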

@dream-mjq
Author

I can share the data with you. How can I send it to you? There are 31 classes in my dataset.

@dream-mjq
Author

Can I add more RAM to my server to solve this problem?

@constantinulrich
Member

> I can share the data with you. How can I send it to you? There are 31 classes in my dataset.

That's quite a lot. Most likely your RAM is the issue, and if you have a server with more RAM, I would switch to that server.
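
Before (or in addition to) moving to a machine with more RAM, it may also help to limit how many cases are preprocessed and exported concurrently, so fewer softmax volumes sit in memory at once. A hedged sketch using nnU-Net v2's worker-count flags (verify the exact names with `nnUNetv2_predict -h` for your version; all paths and the dataset id are placeholders):

```python
import subprocess

# Run prediction with a single preprocessing worker (-npp) and a single
# segmentation-export worker (-nps), trading speed for lower peak RAM.
# Input/output paths and the dataset id below are placeholders.
subprocess.run([
    "nnUNetv2_predict",
    "-i", "/path/to/input_images",   # placeholder
    "-o", "/path/to/output_segs",    # placeholder
    "-d", "DATASET_ID",              # placeholder
    "-c", "3d_fullres",
    "-npp", "1",
    "-nps", "1",
], check=True)
```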
