sequential_distribute: Subdivide mesh :Distribute issue #218

Open
Girishchandra-Yendargaye opened this issue Dec 4, 2019 · 5 comments

@Girishchandra-Yendargaye

Distribution is failing with the following error:
sequential_distribute: Subdivide mesh
Error! ***Memory allocation failed for TRINODALMETIS: nind.

Why does distribute require additional memory?

@stoiver
Member

stoiver commented Dec 4, 2019

@Girishchandra-Yendargaye The initial creation of the distributed domain is memory heavy. A sequential domain is created on processor 0, and the structure is then run through METIS to partition it (again only on processor 0). This essentially doubles the memory use. The partitions are then communicated to the other processors.

If you are going to run a number of simulations with the same underlying domain, you can do the partitioning once on a large-memory machine or compute node and then reuse that distributed structure for all subsequent simulations.

@ggiannako

Apologies for the intervention (and for not using the users-list), but I have a follow-up question:
After generating and partitioning the mesh into N structures using one processor, how can we save the individual structures and read them later using N processors?

@stoiver
Member

stoiver commented Dec 4, 2019

@Girishchandra-Yendargaye Have a look at the code in the folder anuga/simulation and in anuga/parallel/parallel_api.py. There is code there that uses sequential_distribute to create the distribution without necessarily running the evolve.
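The "partition once, reuse many times" workflow discussed above can be illustrated with a small, generic sketch. This is not ANUGA's implementation: the helper names (`split_round_robin`, `dump_partitions`, `load_partition`), the file-naming scheme, and the toy round-robin split are all hypothetical, chosen only to show the pattern of writing one pickle file per partition on a single process and having each of N processes later read only its own file.

```python
import os
import pickle
import tempfile


def split_round_robin(items, numprocs):
    """Toy partitioner (hypothetical): deal items out to numprocs parts."""
    return [items[rank::numprocs] for rank in range(numprocs)]


def partition_path(partition_dir, name, numprocs, rank):
    """Hypothetical naming scheme: one file per (numprocs, rank) pair."""
    return os.path.join(partition_dir, "%s_P%d_%d.pickle" % (name, numprocs, rank))


def dump_partitions(partitions, partition_dir, name="domain"):
    """Run once, on one process: write each partition to its own file."""
    os.makedirs(partition_dir, exist_ok=True)
    numprocs = len(partitions)
    paths = []
    for rank, part in enumerate(partitions):
        path = partition_path(partition_dir, name, numprocs, rank)
        with open(path, "wb") as handle:
            pickle.dump(part, handle, protocol=pickle.HIGHEST_PROTOCOL)
        paths.append(path)
    return paths


def load_partition(partition_dir, rank, numprocs, name="domain"):
    """Run on every process: read only the file belonging to this rank."""
    path = partition_path(partition_dir, name, numprocs, rank)
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    triangles = list(range(10))          # stand-in for mesh triangles
    parts = split_round_robin(triangles, 3)
    with tempfile.TemporaryDirectory() as tmp:
        dump_partitions(parts, tmp)
        for rank in range(3):
            print(rank, load_partition(tmp, rank, 3))
```

In a real parallel run, `rank` and `numprocs` would come from the MPI layer instead of a loop, and the number of processes must match the number of partitions that were dumped.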

@Girishchandra-Yendargaye
Author

@stoiver Thank you for the reply. I am running only the pickling part, and it is still failing. The code is below:
domain = create_domain_from_file(meshfile_name)
print("domain created!")
domain.set_quantity('elevation', numeric=5.0,  # triangle_elevation
                    use_cache=cache,
                    verbose=True,
                    alpha=0.1, location='vertices')
print("quantities Set ")
domain.set_store(True)
print("Domain Store set!")
sequential_distribute_dump(domain, 2000, verbose=True)
print("process completed!")

Output:
domain created!
quantities Set
Domain name set!
Domain Store set!
sequential_distribute: Subdivide mesh
Error! ***Memory allocation failed for TRINODALMETIS: nind. Requested size: -1302189028

Can you please suggest something?
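A note on the negative "Requested size" in the output above: a negative allocation size is often a sign that a size computation wrapped around a signed 32-bit integer, and is not necessarily evidence that the machine ran out of memory. The thread does not confirm that this is what happened here, so the following is only a sketch of that reading. It assumes the value is a byte count stored in a signed 32-bit integer; the true request could also be larger by a multiple of 2^32.

```python
import ctypes


def as_int32(n):
    """Interpret the low 32 bits of n as a signed 32-bit integer."""
    return ctypes.c_int32(n).value


def as_uint32(n):
    """Interpret the low 32 bits of n as an unsigned 32-bit integer."""
    return n & 0xFFFFFFFF


reported = -1302189028            # value printed in the error message
candidate = as_uint32(reported)   # smallest non-negative size that wraps to it

print(candidate)                          # 2992778268
print(round(candidate / 2.0**30, 2))      # size in GiB, if the unit is bytes
print(as_int32(candidate) == reported)    # True: wraps back to the reported value
```

Under these assumptions the smallest consistent request is about 2.79 GiB, which exceeds the largest value a signed 32-bit integer can hold (2^31 - 1).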

@Girishchandra-Yendargaye
Author

Girishchandra-Yendargaye commented Jan 10, 2020

@stoiver One more issue during sequential_distribute:
39742 Segmentation fault

It fails even when memory is available.
