--cnk_dmn specifications ignored #198

Open
rkouznetsov opened this issue Jul 1, 2020 · 6 comments

@rkouznetsov
Contributor

Hi,

When I run
ncks -4 -L5 --cnk_dmn time,1 in.nc out_ncks.nc
on the attached file
in.nc.gz

I get a file with strange chunking:

$ ncdump -sc out_ncks.nc |grep nc_O3_gas:_ChunkSizes
		cnc_O3_gas:_ChunkSizes = 18, 185, 308 ;

I would expect it to set the chunk sizes to 1, 210, 350 ...

ncks is the one from Ubuntu 20.04.
NCO netCDF Operators version 4.9.1 "Skyglow" built by buildd on lgw01-amd64-040 at Mar 23 2020 06:20:36
Also reproducible with
netCDF Operators version 4.9.3
_NCProperties = "version=2,netcdf=4.7.0,hdf5=1.10.4

What would be the right way to force 2D chunks?
Thank you!

@czender
Member

czender commented Jul 1, 2020

I can reproduce this issue with the current distribution. Thanks for reporting it. Will post again when I have made progress.

@czender
Member

czender commented Jul 2, 2020

Both of these commands produce a chunk size of 1 for time in all variables, as you desire:

ncks -O -4 -L 1 --cnk_map=nc4 ~/in.nc ~/foo.nc
ncks -O -4 -L 1 --cnk_map=rd1 ~/in.nc ~/foo.nc

The command you tried

ncks -O -4 -L 1 --cnk_dmn time,1 ~/in.nc ~/foo.nc

should also do that for all variables. Apparently it sets a chunksize
of 1 for the time variable, but does not override the default
chunksize in multi-dimensional variables such as cnc_O3_gas.
Those chunksizes default to the cnk_rew map described in the manual.
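
To see which variables actually picked up a requested chunk size, it can help to list every _ChunkSizes attribute at once instead of grepping for a single variable. The following is a minimal sketch, assuming the input is the text that `ncdump -hs` prints; the function name `list_chunk_sizes` is made up for illustration, and the sample line is the one quoted in the original report.

```shell
#!/bin/sh
# Print "variable sizes" for every _ChunkSizes attribute found on stdin.
# Intended input: the text produced by `ncdump -hs file.nc`.
# list_chunk_sizes is a hypothetical helper name, not part of NCO or netCDF.
list_chunk_sizes() {
    sed -n 's/^[[:space:]]*\([A-Za-z0-9_]*\):_ChunkSizes = \(.*\) ;[[:space:]]*$/\1 \2/p'
}

# Sample line copied from the report; on a real file one would run
#   ncdump -hs out_ncks.nc | list_chunk_sizes
printf '\t\tcnc_O3_gas:_ChunkSizes = 18, 185, 308 ;\n' | list_chunk_sizes
# prints: cnc_O3_gas 18, 185, 308
```

Any variable whose first chunk length is not 1 after the `--cnk_dmn time,1` run is one where the option was not applied.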

@rkouznetsov
Contributor Author

Thank you! That would help in a few of my use cases. I also have files with 4D variables, and I want a chunk size of 1 for the record dimension and for one non-record dimension, and somewhat smaller chunks along the two other dimensions. I have found that for single-precision floats a chunk of 1x1x200x200 already gives quite good compression with good granularity for extracting small subsets. Is there any trick to arrange such chunks with the current NCO?
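
For reference, the uncompressed size of such a chunk is just the product of the chunk lengths times the bytes per value. A small sketch of that arithmetic, assuming 4-byte single-precision floats as stated above; the helper name `chunk_bytes` is made up for illustration.

```shell
#!/bin/sh
# Uncompressed bytes per chunk = bytes per value * product of chunk lengths.
# Usage: chunk_bytes BYTES_PER_VALUE LEN1 [LEN2 ...]
# chunk_bytes is a hypothetical helper, not an NCO command.
chunk_bytes() {
    total=$1
    shift
    for n in "$@"; do
        total=$((total * n))
    done
    echo "$total"
}

chunk_bytes 4 1 1 200 200   # prints 160000 (about 156 KiB per chunk)
```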

@czender
Member

czender commented Jul 7, 2020

If you know the desired chunk sizes for each dimension, try specifying them all with four --cnk_dmn options. If that does not work, then there is a bug that needs to be addressed.
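
A sketch of what "four --cnk_dmn options" could look like on the command line. The dimension names `lev`, `lat`, and `lon` and the output file name are assumptions for illustration (the 4D files are not attached to this issue), and `cnk_dmn_opts` is a made-up helper. The script only prints the command, so it is safe to run without NCO installed; whether ncks honors the options is exactly what this issue is about.

```shell
#!/bin/sh
# Turn "name,size" pairs into repeated --cnk_dmn options.
# cnk_dmn_opts is a hypothetical helper; --cnk_dmn itself is the ncks option
# discussed in this thread.
cnk_dmn_opts() {
    opts=""
    for pair in "$@"; do
        opts="$opts --cnk_dmn $pair"
    done
    echo "${opts# }"   # drop the leading space
}

# Dry run: print the command instead of executing it.
# Dimension names below are assumed, adjust them to the actual file.
echo "ncks -O -4 -L 1 $(cnk_dmn_opts time,1 lev,1 lat,200 lon,200) in.nc out.nc"
# prints: ncks -O -4 -L 1 --cnk_dmn time,1 --cnk_dmn lev,1 --cnk_dmn lat,200 --cnk_dmn lon,200 in.nc out.nc
```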

@rkouznetsov
Contributor Author

Thank you! With the above file
ncks -4 -L5 --cnk_dmn time,1 --cnk_dmn lat,200 --cnk_dmn lon,200 in.nc out_ncks.nc
still results in
cnc_O3_gas:_ChunkSizes = 18, 185, 308 ;

A similar issue is reproducible with some files that have more dimensions, but I could not easily create an MWE of reasonable size for that. I hope the above example is enough to reproduce and hunt down the bug...

@czender
Member

czender commented Jul 8, 2020

Thanks, I'll look into this later this summer. For now I suggest you use nccopy -c time/1,... to set those chunksizes.
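
As a sketch of that workaround: nccopy takes its chunk specification as comma-separated `dim/len` pairs after `-c`. The example below builds such a spec from the same `name,size` pairs used with --cnk_dmn. The dimension names and sizes are taken from the ncks command earlier in this thread, while the `-d 5` deflate level, the output file name, and the helper name `chunkspec` are assumptions for illustration. The script only prints the command.

```shell
#!/bin/sh
# Build an nccopy chunkspec ("dim/len,dim/len,...") from "name,size" pairs.
# chunkspec is a hypothetical helper, not part of the netCDF utilities.
chunkspec() {
    spec=""
    for pair in "$@"; do
        name=${pair%%,*}   # text before the first comma
        size=${pair#*,}    # text after the first comma
        spec="${spec:+$spec,}$name/$size"
    done
    echo "$spec"
}

# Dry run: print the command instead of executing it.
echo "nccopy -d 5 -c $(chunkspec time,1 lat,200 lon,200) in.nc out_nccopy.nc"
# prints: nccopy -d 5 -c time/1,lat/200,lon/200 in.nc out_nccopy.nc
```

The resulting file can be checked with the same `ncdump -sc out_nccopy.nc | grep _ChunkSizes` test used in the original report.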
