Consider using netcdf4 groups in SWALS to combine multidomain outputs into a single file #10

Open
gareth-d-ga opened this issue Feb 24, 2022 · 4 comments

Comments

@gareth-d-ga
Collaborator

gareth-d-ga commented Feb 24, 2022

Currently SWALS writes multiple output files for each domain:

  • A netcdf with gridded outputs
  • A netcdf with gauge outputs
  • Some text metadata

For large multidomains with hundreds of domains, this can lead to more than 1000 files for a single model run.

  • This can be a problem: when running hundreds of scenarios, we start approaching file-count limits on some supercomputers (e.g. NCI).
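To see how close a set of runs is to a file-count limit, the files under a multidomain folder can be counted directly. This is a minimal sketch: the helper name `count_files` is hypothetical, and it assumes file names contain no newlines.

```shell
# Hypothetical helper: count regular files beneath a directory.
# Assumes no file name contains a newline (each name is one line of find output).
count_files() {
    # tr strips the padding that some wc implementations add
    find "$1" -type f | wc -l | tr -d ' '
}
```

For example, `count_files .` run from inside a multidomain folder prints the number of files that one model run produced.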

An alternative is to use netcdf groups to combine gauge/grid outputs from each domain in a single netcdf file.

  • This would require updates to SWALS including the post-processing scripts.

A potential downside is that netcdf-groups might not play nicely with some other tools (like ncview). So ideally the netcdf-group format could be implemented as a compile-time option, with post-processing routines seamlessly working with either format.
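For post-processing to work with either format, it would first need to tell whether a file uses groups. A rough sketch of one way to check is below. It assumes the file header has been dumped to CDL text (e.g. with `ncdump -h`), where groups appear as lines of the form `group: NAME {`. The helper name `has_groups` is hypothetical.

```shell
# Hypothetical helper: read a CDL header on stdin and succeed (exit 0)
# if it declares at least one group, i.e. contains a "group: NAME {" line.
has_groups() {
    grep -q '^[[:space:]]*group:[[:space:]]'
}
```

Assuming `ncdump` is installed, a script could branch on `ncdump -h file.nc | has_groups`.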

@gareth-d-ga
Collaborator Author

While not related to netcdf directly, pull request #22 has reduced the number of output files.

@gareth-d-ga
Collaborator Author

gareth-d-ga commented Jun 22, 2023

While not exactly matching the point above, this can be done in a post-processing step using e.g.

ncecat --gag */Grid*.nc -o test.nc

from inside a multidomain folder.

The resulting file can be viewed with ncview, and read with R's ncdf4 package.

It would be better to improve the group names (to include the domain folder).
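One possible way to build such a name is to derive it from the file's relative path. The sketch below is only string handling: the helper name `group_name_for` and the example paths are hypothetical, and how the name is then handed to NCO is left open.

```shell
# Hypothetical helper: turn a relative path such as ./DOMAIN_DIR/Grid_file.nc
# into the group path DOMAIN_DIR/Grid_file by dropping a leading "./"
# and a trailing ".nc".
group_name_for() {
    path=${1#./}
    printf '%s\n' "${path%.nc}"
}
```

With this, `group_name_for ./RUN_A/Grid_1.nc` prints `RUN_A/Grid_1`, which keeps the domain folder in the name.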

The above is at least useful for testing, or could be an alternative way to manage file counts. We could also introduce some compression, e.g.

ncecat -4 --deflate 9 --gag */Grid*.nc -o test3.nc

@gareth-d-ga
Collaborator Author

ncrename might be able to change group names.

@gareth-d-ga
Collaborator Author

gareth-d-ga commented Jun 22, 2023

This version iteratively adds netcdf files into groups, while using group names to reflect the directory structure. Run from the multidomain directory.

for i in RUN*/Grid*.nc; do echo $i; ncecat -4 --deflate 4 --no_tmp_fl -A -M -G $i --gag $i -o test7.nc; done

We can also put grids and gauges in the one file:

for i in RUN*/G*.nc; do echo $i; ncecat -4 --deflate 4 --no_tmp_fl -A -M -G $i --gag $i -o test7.nc; done
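Before running either loop on a large multidomain, it may help to preview the commands it would issue. The sketch below prints one `ncecat` command per input file without executing anything. The flags are copied unchanged from the loops above, and the helper name `preview_merge` is hypothetical.

```shell
# Hypothetical dry-run helper: print (do not execute) one ncecat command per
# input file, using the same flags as the loops above.
# Usage: preview_merge OUTPUT_FILE INPUT_FILE...
preview_merge() {
    out=$1
    shift
    for f in "$@"; do
        printf 'ncecat -4 --deflate 4 --no_tmp_fl -A -M -G %s --gag %s -o %s\n' "$f" "$f" "$out"
    done
}
```

From the multidomain directory, `preview_merge test7.nc RUN*/G*.nc` lists the commands for both grids and gauges, so the file list and group names can be checked first.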
