parallelize field_transform #38

Open
martinvandriel opened this issue Sep 29, 2015 · 9 comments

Comments

@martinvandriel
Contributor

Once we go to very large databases (~10 TB), field_transform becomes a serious bottleneck. While the computation itself is done in a few hours, the serial field transform takes multiple days to weeks (serial read/write rates on parallel file systems are really bad, about 30 MB/s on the CSCS machines).

The workload should be limited, because this boils down to a single loop with very few lines:

https://github.com/geodynamics/axisem/blob/master/SOLVER/UTILS/field_transform.F90#L851-L909
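
Something along these lines might already do it: decompose the point dimension over MPI ranks and let every rank read and write only its own slab through NetCDF-4 parallel I/O. Just a sketch, not the actual field_transform loop; the file names (ordered_output.nc4, transformed_output.nc4), the variable name displacement and the sizes are made up, and it assumes a NetCDF-4/HDF5 stack built with MPI-IO support.

```fortran
! Sketch of an MPI-parallel field transform (not the code from
! field_transform.F90; names and sizes are made up).
program ft_parallel_sketch
  use mpi
  use netcdf
  implicit none
  integer :: ierr, rank, nproc
  integer :: ncid_in, ncid_out, varid_in, varid_out
  integer :: npoints, nsnap, n_local, offset
  real, allocatable :: buf(:,:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

  ! open input and output for parallel access (needs parallel NetCDF-4)
  call check(nf90_open('ordered_output.nc4', NF90_NOWRITE, ncid_in, &
                       comm=MPI_COMM_WORLD, info=MPI_INFO_NULL))
  call check(nf90_open('transformed_output.nc4', NF90_WRITE, ncid_out, &
                       comm=MPI_COMM_WORLD, info=MPI_INFO_NULL))
  call check(nf90_inq_varid(ncid_in,  'displacement', varid_in))
  call check(nf90_inq_varid(ncid_out, 'displacement', varid_out))

  ! example sizes; in reality these come from the file dimensions
  npoints = 1000000
  nsnap   = 2000

  ! contiguous block decomposition over the point dimension
  n_local = npoints / nproc
  offset  = rank * n_local + 1
  if (rank == nproc - 1) n_local = npoints - offset + 1
  allocate(buf(n_local, nsnap))

  call check(nf90_get_var(ncid_in, varid_in, buf, &
                          start=[offset, 1], count=[n_local, nsnap]))
  ! ... reorder / transform the slab here ...
  call check(nf90_put_var(ncid_out, varid_out, buf, &
                          start=[offset, 1], count=[n_local, nsnap]))

  call check(nf90_close(ncid_in))
  call check(nf90_close(ncid_out))
  call MPI_Finalize(ierr)

contains
  subroutine check(status)
    integer, intent(in) :: status
    if (status /= NF90_NOERR) then
      print *, trim(nf90_strerror(status))
      call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
    end if
  end subroutine check
end program ft_parallel_sketch
```

Since each rank touches a disjoint hyperslab, no communication is needed for the I/O itself; the actual transform would go where the comment is.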

@martinvandriel
Contributor Author

Parallelization might not work together with compression.
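
For reference, this is roughly where the conflict sits (as far as I remember, the HDF5/NetCDF-4 versions available to us refuse parallel writes to deflated variables): once deflate is enabled on a variable, requesting MPI-IO access to it fails, so one of the two marked calls has to go. Names and sizes below are made up.

```fortran
! Minimal sketch of the compression vs. parallel-IO conflict
! (not from field_transform; names and sizes are made up).
program compression_vs_parallel
  use mpi
  use netcdf
  implicit none
  integer :: ierr, ncid, dimids(2), varid

  call MPI_Init(ierr)
  ierr = nf90_create('test_par.nc4', NF90_NETCDF4, ncid, &
                     comm=MPI_COMM_WORLD, info=MPI_INFO_NULL)
  ierr = nf90_def_dim(ncid, 'points',    1000, dimids(1))
  ierr = nf90_def_dim(ncid, 'snapshots',  100, dimids(2))
  ierr = nf90_def_var(ncid, 'displacement', NF90_FLOAT, dimids, varid)

  ! (1) compression: shuffle on, deflate on, level 5
  ierr = nf90_def_var_deflate(ncid, varid, 1, 1, 5)

  ! (2) collective parallel access -- rejected for deflated variables
  !     by the library versions we have on the clusters
  ierr = nf90_var_par_access(ncid, varid, NF90_COLLECTIVE)
  if (ierr /= NF90_NOERR) print *, trim(nf90_strerror(ierr))

  ierr = nf90_close(ncid)
  call MPI_Finalize(ierr)
end program compression_vs_parallel
```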

@sstaehler
Contributor

Right, that was one of the reasons not to try parallel NetCDF some years ago. Switching compression off is probably not a problem for kernel applications, where the wave fields are never moved, but for all applications where databases are sent to IRIS or whoever, it is a problem.

@martinvandriel
Contributor Author

The only way I can think of to avoid this: use the old round-robin IO and write the correct chunking, already compressed, in the SOLVER :/
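
Roughly like this, I guess (sketch only, not the existing SOLVER dump routine; file and variable names, sizes and the chunk shape are made up): one rank defines the variable with the chunking the kernel code later wants to read, with deflate on, and then the ranks write one after the other so the compressed file is only ever touched serially.

```fortran
! Sketch of round-robin dumping with the final chunking and deflate
! already applied in the SOLVER (not the existing dump code; names,
! sizes and the chunk shape are made up).
program solver_rr_dump_sketch
  use mpi
  use netcdf
  implicit none
  integer, parameter :: npts_local = 50000, nsteps = 1000
  integer :: ierr, rank, nproc, iproc, ncid, varid, dimids(2)
  integer :: npts_total, offset
  real, allocatable :: buf(:,:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

  npts_total = npts_local * nproc
  offset     = rank * npts_local + 1
  allocate(buf(npts_local, nsteps))
  buf = real(rank)                 ! stands in for the buffered wave field

  ! rank 0 creates the file with the chunking the readers want and
  ! with deflate switched on right away
  if (rank == 0) then
     ierr = nf90_create('ordered_output.nc4', NF90_NETCDF4, ncid)
     ierr = nf90_def_dim(ncid, 'points',    npts_total, dimids(1))
     ierr = nf90_def_dim(ncid, 'snapshots', nsteps,     dimids(2))
     ierr = nf90_def_var(ncid, 'displacement', NF90_FLOAT, dimids, varid)
     ! chunk shape matched to the later read pattern (full traces for
     ! blocks of 1000 points -- made up, pick whatever the kernels need)
     ierr = nf90_def_var_chunking(ncid, varid, NF90_CHUNKED, [1000, nsteps])
     ierr = nf90_def_var_deflate(ncid, varid, 1, 1, 5)
     ierr = nf90_close(ncid)
  end if
  call MPI_Barrier(MPI_COMM_WORLD, ierr)

  ! round robin: only one rank has the file open at any time
  do iproc = 0, nproc - 1
     if (rank == iproc) then
        ierr = nf90_open('ordered_output.nc4', NF90_WRITE, ncid)
        ierr = nf90_inq_varid(ncid, 'displacement', varid)
        ierr = nf90_put_var(ncid, varid, buf, &
                            start=[offset, 1], count=[npts_local, nsteps])
        ierr = nf90_close(ncid)
     end if
     call MPI_Barrier(MPI_COMM_WORLD, ierr)
  end do

  call MPI_Finalize(ierr)
end program solver_rr_dump_sketch
```

The barrier loop keeps the HDF5 writes strictly serial, so compression keeps working; how slow that ends up being is the open question.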

@sstaehler
Contributor

but didn't we test that writing the correct chunking in the SOLVER is a bazillion times slower?

@martinvandriel
Contributor Author

Yes, but there might be room to optimize it: only include those processors that actually have to write, use threading, and reduce the number of dumps by buffering as many time steps as possible in memory (rough sketch below).

I have been waiting for a week now for field_transform on a 10 TB database that I computed in a few hours, and it's only 30% done.
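
The buffering part could look roughly like this (per-rank, serial sketch; names, sizes and the memory budget are made up):

```fortran
! Sketch of buffered dumping (per rank, serial; names and the memory
! budget are made up).
program buffered_dump_sketch
  use netcdf
  implicit none
  integer, parameter :: npts = 100000, nsteps = 1000
  integer :: ierr, ncid, varid, dimids(2)
  integer :: nbuf, ibuf, istep
  real, allocatable :: buf(:,:)

  ! allow e.g. ~1 GB of buffer per rank (4-byte reals)
  nbuf = min(nsteps, int(1.0e9 / (4.0 * npts)))
  allocate(buf(npts, nbuf))

  ierr = nf90_create('dump.nc4', NF90_NETCDF4, ncid)
  ierr = nf90_def_dim(ncid, 'points',    npts,   dimids(1))
  ierr = nf90_def_dim(ncid, 'snapshots', nsteps, dimids(2))
  ierr = nf90_def_var(ncid, 'displacement', NF90_FLOAT, dimids, varid)
  ierr = nf90_enddef(ncid)

  ibuf = 0
  do istep = 1, nsteps
     ibuf = ibuf + 1
     buf(:, ibuf) = real(istep)        ! stands in for the solver field
     ! flush when the buffer is full or at the last time step
     if (ibuf == nbuf .or. istep == nsteps) then
        ierr = nf90_put_var(ncid, varid, buf(:, 1:ibuf), &
                            start=[1, istep - ibuf + 1], count=[npts, ibuf])
        ibuf = 0
     end if
  end do

  ierr = nf90_close(ncid)
end program buffered_dump_sketch
```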

@sstaehler
Contributor

Well, that is annoying. When trying to increase the dump buffer, keep in mind the low memory on most HPC machines. But I'm curious...

@martinvandriel
Contributor Author

I guess we would need to control which part of the mesh goes where on the cluster: if each node has only one processor that holds crust, larger time chunks might fit in memory.

@martinvandriel
Contributor Author

So here we go: system maintenance, and the field transform was killed. We should at least have a restart capability. This should be really easy to implement.
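
Simplest thing I can think of (sketch only; the attribute name last_snapshot_done and the file name are made up): write a progress marker into the output file after each finished snapshot and resume from there on startup.

```fortran
! Sketch of a restart marker for field_transform (names are made up).
program restart_sketch
  use netcdf
  implicit none
  integer :: ierr, ncid, last_done, isnap
  integer, parameter :: nsnap = 2000

  ierr = nf90_open('transformed_output.nc4', NF90_WRITE, ncid)

  ! resume point: 0 if the attribute does not exist yet (fresh run)
  ierr = nf90_get_att(ncid, NF90_GLOBAL, 'last_snapshot_done', last_done)
  if (ierr /= NF90_NOERR) last_done = 0

  do isnap = last_done + 1, nsnap
     ! ... transform and write snapshot isnap ...

     ! update the marker and flush, so a kill leaves a usable state
     ierr = nf90_redef(ncid)
     ierr = nf90_put_att(ncid, NF90_GLOBAL, 'last_snapshot_done', isnap)
     ierr = nf90_enddef(ncid)
     ierr = nf90_sync(ncid)
  end do

  ierr = nf90_close(ncid)
end program restart_sketch
```

Updating the attribute after every single snapshot might be too chatty; doing it every N snapshots would work just as well.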
