[5.5.4-SNAPSHOT]: Checksum32 throwing checksum exception accessing var in H5 file #1199

Open
1 task done
rschmunk opened this issue Jun 13, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@rschmunk
Contributor

rschmunk commented Jun 13, 2023

Versions impacted by the bug

v5.x

What went wrong?

A Panoply user working on a new institutional data product discovered that, depending on which version of the app he used, he might or might not be able to extract a variable from an HDF5 dataset. When extraction failed, a "Checksum invalid" exception percolated up from ucar.nc2.filter.Checksum32 (Checksum32.java).

On further examination, the user found that whether the problem occurred depended on which version of nccopy he used to create a test dataset. With nccopy 4.7.4 the problem did not occur; with nccopy 4.8.1 it did. He further noted that in the former case the H5 dataset was written with superblock version 0, while in the latter it was written with superblock version 2.
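For anyone wanting to confirm which superblock version their own file carries, here is a minimal sketch using only the Java standard library. It relies on the HDF5 file layout, in which the 8-byte format signature is followed by a one-byte superblock version, and the superblock may start at byte offset 0, 512, 1024, 2048, and so on. The class and method names (`SuperblockProbe`, `versionAt`, `versionOf`) are hypothetical and not part of netCDF-Java.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

/** Sketch only: reports the HDF5 superblock version of a file. Names are hypothetical. */
final class SuperblockProbe {
    // The 8-byte HDF5 format signature: \211 H D F \r \n \032 \n
    private static final byte[] SIGNATURE =
        {(byte) 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};

    /** Returns the version byte following the signature at offset, or -1 if no signature is there. */
    static int versionAt(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < SIGNATURE.length + 1) {
            return -1;
        }
        byte[] head = Arrays.copyOfRange(data, offset, offset + SIGNATURE.length);
        if (!Arrays.equals(head, SIGNATURE)) {
            return -1;
        }
        return data[offset + SIGNATURE.length] & 0xff;
    }

    /** Searches the candidate offsets 0, 512, 1024, ... and returns the version, or -1 if none found. */
    static int versionOf(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[SIGNATURE.length + 1];
            long offset = 0;
            while (offset + buf.length <= raf.length()) {
                raf.seek(offset);
                raf.readFully(buf);
                int version = versionAt(buf, 0);
                if (version >= 0) {
                    return version;
                }
                offset = (offset == 0) ? 512 : offset * 2;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java SuperblockProbe <file>");
            return;
        }
        System.out.println("superblock version: " + versionOf(args[0]));
    }
}
```

Based on the report above, running this against the nccopy 4.8.1 file would be expected to print version 2, and against the nccopy 4.7.4 file version 0.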

Examining further with different versions of Panoply, each built with a different netcdfAll release or 5.5.4 snapshot, I found that a copy of Panoply built in late September last year (using NJ 5.5.3) was able to open the problem dataset and extract the desired variable. A copy built at the end of October (using the NJ 5.5.4 snapshot of 10/24) was unable to do so.

I note that in between, on Sep. 30 last year, a couple of commits were made to Checksum32.java.

The following stack trace shows where the error occurs. It was generated using a 5.5.4 snapshot downloaded this afternoon.

Relevant stack trace

Exception doing slice: java.lang.RuntimeException: Checksum invalid
java.lang.RuntimeException: Checksum invalid
	at ucar.nc2.filter.Checksum32.decode(Checksum32.java:82)
	at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB$DataChunk.getByteBuffer(H5tiledLayoutBB.java:235)
	at ucar.nc2.iosp.LayoutBBTiled.hasNext(LayoutBBTiled.java:101)
	at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.hasNext(H5tiledLayoutBB.java:143)
	at ucar.nc2.iosp.IospHelper.readData(IospHelper.java:380)
	at ucar.nc2.iosp.IospHelper.readDataFill(IospHelper.java:292)
	at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:230)
	at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:204)
	at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
	at ucar.nc2.Variable.reallyRead(Variable.java:797)
	at ucar.nc2.Variable._read(Variable.java:736)
	at ucar.nc2.Variable.read(Variable.java:614)
	at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:461)
	at ucar.nc2.dataset.VariableDS._read(VariableDS.java:434)
	at ucar.nc2.dataset.VariableDS._read(VariableDS.java:444)
	at ucar.nc2.Variable.read(Variable.java:600)
	at ucar.nc2.Variable.read(Variable.java:546)
	at gov.nasa.giss.data.nc.array.NcArray2D.doSlice(NcArray2D.java:417)

Relevant log messages

No response

If you have an example file that you can share, please attach it to this issue.

If so, may we include it in our test datasets to help ensure the bug does not return once fixed?
Note: the test datasets are publicly accessible without restriction.

Yes

Code of Conduct

  • I agree to follow the UCAR/Unidata Code of Conduct
@rschmunk rschmunk added the bug Something isn't working label Jun 13, 2023
@rschmunk
Contributor Author

rschmunk commented Jun 13, 2023

Sample datasets include:

Problem dataset created using nccopy 4.8.1

Non-problem dataset created using nccopy 4.7.4

Each dataset contains only one variable. I understand that the data in the variable array are junk. The issue is whether NJ will provide access to that variable's data at all.

@haileyajohnson
Member

I will take a look at what's happening with the problem dataset, but note that in prior versions of netcdf-java (<5.5.4) we didn't actually check HDF5 checksums; we just threw them out.
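To illustrate the difference between discarding a chunk checksum and verifying it, here is a rough sketch of a verify-and-strip step. It assumes a textbook Fletcher-32 over big-endian 16-bit words, with the 4-byte checksum stored big-endian at the end of the chunk. None of this is confirmed by the issue: it does not say which checksum the chunk uses, and the word order, padding, and stored byte order that HDF5 and `Checksum32` actually use may differ. The names `ChunkChecksumSketch`, `fletcher32`, and `verifyAndStrip` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Sketch only: generic Fletcher-32 check on a chunk. Not the netCDF-Java implementation. */
final class ChunkChecksumSketch {

    /** Textbook Fletcher-32 over the first length bytes, read as big-endian 16-bit words (assumption). */
    static long fletcher32(byte[] data, int length) {
        long sum1 = 0;
        long sum2 = 0;
        for (int i = 0; i < length; i += 2) {
            int hi = data[i] & 0xff;
            // An odd trailing byte is padded with a zero low byte (assumption).
            int lo = (i + 1 < length) ? (data[i + 1] & 0xff) : 0;
            sum1 = (sum1 + ((hi << 8) | lo)) % 65535;
            sum2 = (sum2 + sum1) % 65535;
        }
        return (sum2 << 16) | sum1;
    }

    /**
     * Treats the last 4 bytes as a stored big-endian checksum (assumption), compares it
     * with the checksum computed over the payload, and returns the payload alone.
     * Earlier behavior, as described above, would amount to returning the payload
     * without making the comparison.
     */
    static byte[] verifyAndStrip(byte[] chunk) {
        if (chunk.length < 4) {
            throw new IllegalArgumentException("chunk too short to hold a checksum");
        }
        int payloadLength = chunk.length - 4;
        long stored = ByteBuffer.wrap(chunk, payloadLength, 4).getInt() & 0xffffffffL;
        long computed = fletcher32(chunk, payloadLength);
        if (stored != computed) {
            throw new RuntimeException("Checksum invalid");
        }
        return Arrays.copyOf(chunk, payloadLength);
    }

    public static void main(String[] args) {
        byte[] chunk = {0, 1, 0, 2, 0, 4, 0, 3};
        System.out.println("payload bytes: " + verifyAndStrip(chunk).length);
    }
}
```

A mismatch in any of the assumed conventions (word order, byte order of the stored value, or which bytes the checksum covers) would make a valid chunk fail this kind of check, which is one way a file could read fine when checksums are ignored and fail once they are verified.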
