Skip to content
This repository has been archived by the owner on Sep 1, 2022. It is now read-only.

NJ 5 fails to find coord system where NJ 4.6 does #1108

Open
rschmunk opened this issue Jun 20, 2018 · 20 comments
Open

NJ 5 fails to find coord system where NJ 4.6 does #1108

rschmunk opened this issue Jun 20, 2018 · 20 comments

Comments

@rschmunk
Copy link
Contributor

A user just provided me a sample dataset in which the 5D variable of interest has 1D dimensions of lon, lat, time, level and tau. There are matching coordinate variables for lon, lat and time with appropriate units. Thus...

float clisccp(time=8, tau=7, plev7=7, lat=90, lon=144);

Using netCDF-Java 5.0 (beta2) to query VariableDS.getCoordinateSystems() on this variable, the response is a 0-length array.

However, if I use an old copy of my app based on NJ 4.6.12, a usable coordinate system is returned.

We wondered if this was the result of a change 5.0 branch to enforce dimension ordering following convention, but it seems that except for the additional inserted tau dimension, the ordering is fine.

@lesserwhirls
Copy link
Collaborator

Hi @msdsoftware - it's hard to tell exactly what could be going on here. What class are you using to open the file?

@rschmunk
Copy link
Contributor Author

DatasetUrl durl = DatasetUrl.findDatasetUrl (urlString);
NetcdfDataset result = NetcdfDataset.acquireDataset (durl, true, null);

@lesserwhirls
Copy link
Collaborator

I wonder if this is creating a CoverageDataset object, which has replaced GridDataset...and by replace, I mean is all new, likely has bugs, and you probably just found one :-/

Any chance I could get a sample dataset?

@lesserwhirls
Copy link
Collaborator

Thanks! I'll take a look and see what I can find out.

@rschmunk
Copy link
Contributor Author

@lesserwhirls, Did you have any luck with this?

I just had another user report trouble with a coordinate system not being found, but the new case is a WRF dataset using conformal conic gridding. The only difference btw his "bad" file and a very similar "good" file is that the bad also includes a time dimension, albeit a dimension of length 1.

@rschmunk
Copy link
Contributor Author

rschmunk commented Jul 26, 2018

Ping! @lesserwhirls, I'm debating whether to revert my code to use 4.6.12-snapshot, and would appreciate any update you might have on whether fixing this issue is in the works, and if so when it might reasonably happen. Thx.

@lesserwhirls
Copy link
Collaborator

@msdsoftware - many apologies for the silence. I was out all last week and am still playing catch-up.

Last I looked, I though I had a good idea of where things were going wrong, so my hope is that a fix could happen quickly. However, the WRF file issue seems to be different. Would it be possible to get a copy of that file, too?

It will be quite difficult for me to get to it this week, but I could dedicate Saturday to it. If I was to give you a status update Saturday evening, would that be soon enough for for you to make the call in releasing using 4.6/5.0?

@rschmunk
Copy link
Contributor Author

@lesserwhirls, See
https://www.dropbox.com/s/k5lgmnn2jga6jxa/T2_avghst_C.nc?dl=1 and
https://www.dropbox.com/s/vz9y4ryaa6ozjhy/gfdl_T2_1970-1999_07_avg.nc?dl=1

These two WRF files are almost identical except that in gfdl_T2_1970-1999_07_avg.nc the quantity has the extra time dimension of length 1.

@rschmunk
Copy link
Contributor Author

rschmunk commented Jul 26, 2018

@lesserwhirls, I figured you were on vacation given the lack of activity around the repo. I am about to go on leave myself, so I will be making a decision later today whether to continue with NJ 5 or go back to 4.6 when I post a Panoply bugfix release tomorrow. Other cases people have reported with coordinate systems not being recognized I have been able to code around, but this WRF case is the one I've been unable to do anything with.

EtA: I opted to stick 5.0 for this release. Hopefully, I won't receive (many) more reports of unrecognized/unreported coordinate systems.

@lesserwhirls
Copy link
Collaborator

Ok, I think I understand what is happening here. It's not a case of ordering, but rather completeness. In 5.0, the CoordSysBuilder became a bit more strict. Specifically, the makeCoordinateSystemsMaximal method started requiring the the coordinate systems be complete (that is, a coordinate variable for every dimension must exist) for them to be registered. Here is the magic:

if (!useCompleteCoordSys || csnew.isComplete(ve)) {

Both the check to isComplete() and the bool useCompleteCoordSys (statically set to true) keep coordinate systems from being attached to the variables for these files. A better name for the variable would probably be useCompleteCoordSysOnly. The variable is something newly added in v5 that was not there in v4; the isComplete() method exists in v4, but was not used as part of the makeCoordinateSystemsMaximal method. Because of this change, the individual coordinate axes still get attached to the variable, but are only gathered together and registered as a coordinate system for the variable if they form a complete set.

The first case (1949.clisccp.nc) isn't considered complete due to lack of coordinate variables for the dimensions tau and plev7. The second case isn't considered complete due to the lack of a coordinate variable for the dimension Time.

Thinking out loud here...

Perhaps the idea was to allow the code implementing individual conventions to accept incomplete coordinate systems via extensions to the CoordSysBuilder class? If that's correct, then the right thing to do for the WRF files (second case) is to extend the code in ucar.nc2.dataset.conv.WRFConvention to build a time variable, if one does not exist (would the global attribute START_DATE be the thing to use?). The file in the first case does not appear to be associated with a convention, and there isn't enough info in the file to create coordinate variables for tau and plev7, so a one-off CoordSysBuilder would need to be written for it and its siblings (but I don't see anything to ID by in the global metadata, like a model ID or the likes).

I can see this being a cleaner approach code-wise, while minimizing assumptions for the general case, but certainly does cause some issues (one of which cannot currently be worked around for things that used to work).

Another option would be to break up the Enhance Enum CoordSystems into CoordSystemsComplete and CoordSystemsIncomplete. For backwards compatibility, Enhanse.CoordSystems would alias to setting both of those; however, only CoodSystemsComplete would be set in the defaultEnhanceMode. Then, to allow incomplete coordinate systems, one would need to create a Set<Enum> for use then they call NetcdfDataset that included CoordSystemsIncomplete; for maximum backwards compatibility, you'd probably go with the built-in set of all enhancements using NetcdfDataset.getEnhanceAll.

@rschmunk
Copy link
Contributor Author

@lesserwhirls, I don't know if you've been waiting for a response from me on this, but I guess after a few weeks I probably ought to say something.

Since a fair number of users could be generating their own data files and likely not being strict about following a convention (sometimes I have to tell Panoply users that there are conventions) — or else obtaining and trying to plot from such "carelessly" created datasets — the problem of a coordinate system being "incomplete" is going to be an ongoing problem.

The idea of there being a configurable option in the enhancement settings seems more workable. I am currently using NetcdfDataset.acquireDataset(DatasetUrl, true, CancelTask). I take your suggestion to mean that I might then have to follow that up with NetcdfDataset.enhance(Set<NetcdfDataset.Enhance>) to change the enhancement scheme to not be so restrictive about coordinate system completeness?

@rschmunk
Copy link
Contributor Author

Just got another query from a user about this, @lesserwhirls, in connection to an HDF-EOS dataset. The variables in question are on an auxiliary loa-lat grid, but have an extra dimension for "channel". The vars are reported as having no coordinate system, apparently because there is no coordinate var for channel.

@lesserwhirls
Copy link
Collaborator

Hi @msdsoftware - sorry for the delay. I think I have an enhancement that will allow for the use of incomplete coordinate systems. The code will look something like this:

Set<NetcdfDataset.Enhance> defaultEnhanceMode = NetcdfDataset.getDefaultEnhanceMode();
EnumSet<NetcdfDataset.Enhance> enhanceMode = EnumSet.copyOf(defaultEnhanceMode);
enhanceMode.add(NetcdfDataset.Enhance.IncompleteCoordSystems);
DatasetUrl durl = DatasetUrl.findDatasetUrl (urlString);
NetcdfDataset ncd = NetcdfDataset.acquireDataset(durl, enhanceMode, null);

After digging into this more, I have a feeling that our ucar.nc2.dataset.conv.WRFConvention class isn't doing the right thing when constructing coordinate systems. Of course, that does not account for failures for HDF-EOS, but perhaps that too is a sign that the conventions class for HDF-EOS isn't doing the right thing (looks like we have two of those coordinate system builders). Ultimately, I'd like to get those convention classes working correctly, but for now, this will give a workaround.

I'm currently running our full test suite on my changes, and depending on those results, I should have a fix committed within the next day.

@rschmunk
Copy link
Contributor Author

rschmunk commented Sep 13, 2018

@lesserwhirls, FWIW, I've added some code to my classes to try a little harder looking for lon and lat coordinate vars in swath data when there is no coordinate system reported. It seems to handle this particular HDF-EOS case pretty well. I just need to do some testing against how other sample HDF-EOS files are handled.

@rschmunk
Copy link
Contributor Author

Just compiled from the latest snapshot. Commit #1154 seems to have solved the problem with the recent HDF-EOS example file. I'll have to dig up some of the prior cases and see how it does there.

But if I understand you correctly, @lesserwhirls,, it doesn't work for the GRIB example?

@lesserwhirls
Copy link
Collaborator

Good to hear, @msdsoftware! In terms of GRIB, which example? I'm not seeing a GRIB file here on this issue. I'll take a look once I have a file. Thanks for your help on shaking out these issues!

@rschmunk
Copy link
Contributor Author

Ooops, sorry. I meant the WRF file, @lesserwhirls .

@lesserwhirls
Copy link
Collaborator

Ah, no worries! The WRF file is what I used to test this with, so that should be good to go. My comment about the WRF file is that our coordinate system builder code that handles WRF files is lacking and should be beefed up independently of the the IncompleteCoordSystems enhancement. However, using the new enhancement will at least give you a coordinate system to work with.

@rschmunk
Copy link
Contributor Author

Calling this closed until such time as I receive another bug report about datasets being treated differently by my NJ-using code.

@rschmunk rschmunk reopened this Oct 11, 2018
@rschmunk
Copy link
Contributor Author

Re-opened this as I ran into a problem with an old WRF file using the Lambert conformal grid. Failure pattern is not quite the same as before, however, so I'm trying to make sense of what's happening before I add detail to this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants