Can run interactive demo but can't train: Segmentation fault (core dumped) #64

Open
jbmaxwell opened this issue Oct 28, 2020 · 6 comments

Comments

@jbmaxwell

I've been trying to train on MNIST (I have custom data that's MNIST-like) but keep hitting Segmentation fault (core dumped).

tensorflow-gpu = 1.15
pytorch = 1.4.0

I have dareblopy installed.

Out of curiosity I ran the interactive demo, which works fine.

@hri98mahesh

I'm also having trouble training on MNIST. If you manage to get it working, please let me know how you overcame the issue.

@jbmaxwell
Author

Unfortunately I gave up (at least for now).

@podgorskiy
Owner

@jbmaxwell, can you please describe what steps you took? Did you generate tfrecords for your dataset? Did you adjust the yaml config accordingly?
It could be just that the paths are wrong. While I tried to make dareblopy verbose, it still may crash (segmentation fault) if something is wrong.
You can replace the Dataset implementation in dataloader.py with your own, without dareblopy. Dareblopy was used mainly for performance reasons.
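
Since wrong paths are named above as a likely cause, one way to rule them out before launching training is a small pre-flight check. The following is only a sketch using the Python standard library. The function name `find_missing_inputs` is hypothetical and not part of this repository or of dareblopy, and the paths you pass in are whatever your own yaml config points at.

```python
import glob
import os


def find_missing_inputs(path_patterns):
    """Return the paths or wildcard patterns that match nothing on disk.

    Hypothetical helper, not part of this repository or of dareblopy.
    """
    missing = []
    for pattern in path_patterns:
        # glob.glob handles plain paths and wildcard patterns alike;
        # an empty result means nothing exists at that location.
        if not glob.glob(os.path.expanduser(pattern)):
            missing.append(pattern)
    return missing
```

Calling `find_missing_inputs([...])` with the tfrecords paths from your config and getting back an empty list means every path resolves to at least one file. It does not validate the contents of the files, so a clean result only excludes the "path is wrong" case.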

@uhiu

uhiu commented Dec 5, 2020

I think it's caused by dareblopy, and unfortunately you can't debug it. As the author said above, it's usually a data preparation problem. Any time you adjust the config file, whether or not the change is related to the dataset (e.g. changing the batch size), try generating the tfrecords files again. Hope it helps :)

@podgorskiy
Owner

@uhiu,

> Any time you adjust the config file, no matter related to the dataset or not, e.g. changing the batch size, try generating the tfrecords file again. hope it helps:)

Well, it should not depend on the batch size, that's for sure.

I improved dareblopy considerably in v0.0.5.
I tried many scenarios of incorrect usage. All of them now result in a Python exception with a detailed description of the problem, not a segfault like before.

Even in case of a crash, it should print a minimal crash log with the call stack to help investigate the problem.
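
Independently of dareblopy's own crash log, Python's standard `faulthandler` module can dump the Python-level traceback when the process receives a fatal signal such as SIGSEGV. This is a minimal sketch, assuming you can add a few lines at the top of your training entry point. The wrapper name `enable_crash_traceback` is hypothetical.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=sys.stderr):
    """Dump Python tracebacks for all threads on fatal signals (e.g. SIGSEGV).

    Hypothetical convenience wrapper around the standard faulthandler module.
    """
    if not faulthandler.is_enabled():
        # The stream must stay open for the lifetime of the process.
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Call `enable_crash_traceback()` before the data loader is created. The same effect is available without code changes by starting the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable. The output shows which Python line was executing at the time of the crash, not the native call stack.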

@jbmaxwell,
Could you please run `pip install dareblopy --upgrade` and try again? Please make sure that you have v0.0.5.
What platform do you use?
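
To answer the version and platform questions in one go, something like the sketch below can be pasted into a reply. It relies only on the standard library (`importlib.metadata`, Python 3.8+). The function names are hypothetical, and the default package list is an assumption based on the packages mentioned in this thread, with `torch` assumed to be the pip name for PyTorch.

```python
import platform
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("dareblopy", "torch", "tensorflow-gpu")):
    """Collect platform, Python and package versions for a bug report.

    The default package names are assumptions taken from this thread.
    """
    report = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        report[name] = installed_version(name) or "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run it in the same environment you train in, so that the reported dareblopy version is the one actually being imported.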

@ennauata

ennauata commented Jul 7, 2022

Exactly the same issue here.
[image: attached screenshot, not preserved]

5 participants