open_mdsdataset dimension error #316

Open
ruth-moorman opened this issue Nov 28, 2022 · 8 comments

Comments

@ruth-moorman

Hello!
I'm having an issue loading 2D fields from an LLC270 run.
All 3D variables are loading as expected but the 2D fields are giving the error:
ValueError: dimensions ('time', 'j', 'i') must have the same length as the number of data dimensions, ndim=2

The .meta files for these 2D fields look equivalent to those of other runs whose variables I have loaded without issue, e.g.,

 dimList = [
  1080,    1, 1080,
   310,    1,  310
 ];
 dataprec = [ 'float32' ];
 nrecords = [         53 ];
 timeStepNumber = [       8640 ];
 timeInterval = [  7.905600000000E+06  1.036800000000E+07 ];
 missingValue = [ -9.99000000000000E+02 ];
 nFlds = [   53 ];
 fldList = {
 'ETAN    ' 'SIarea  ' 'SIheff  ' 'SIhsnow ' 'SItices ' 'SIhsalt ' 'SIuice  ' 'SIvice  ' 'SHIfwFlx' 'SHIhtFlx' 'SHI_TauX' 'SHI_TauY' 'DETADT2 ' 'PHIBOT  ' 'sIceLoad' 'MXLDEPTH' 'oceSPDep' 'SIatmQnt' 'SIatmFW ' 'oceQnet '
 'oceFWflx' 'oceTAUX ' 'oceTAUY ' 'oceSflux' 'TFLUX   ' 'SFLUX   ' 'EXFtaux ' 'EXFtauy ' 'EXFlwnet' 'EXFswnet' 'EXFswdn ' 'EXFlwdn ' 'EXFqnet ' 'EXFhs   ' 'EXFhl   ' 'EXFevap ' 'EXFpreci' 'EXFatemp' 'SIqnet  ' 'SIqsw   '
 'SIatmQnt' 'SItflux ' 'SIaaflux' 'SIhl    ' 'SIqneto ' 'SIqneti ' 'SIempmr ' 'SIatmFW ' 'SIsnPrcp' 'SIactLHF' 'SIacSubl' 'botTauX ' 'botTauY '
 };
state_2d_set1.0000008640.meta (END)

so I am at a bit of a loss as to the issue. I've checked with the person who generated the data, and there should be only one timestamp (monthly) per .data file. Could someone help me understand where the dimensions=('time','j','i') information is sourced from, and whether there is a workaround that can prevent this clash?
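
As a quick sanity check that does not depend on xmitgcm, the .meta text above can be summarized with a few regular expressions. This is only a sketch: `summarize_meta` is a hypothetical helper, not part of xmitgcm, and it assumes the usual MDS layout in which each `dimList` row holds the global size, first index, and last index of one dimension, and the .data file is a flat sequence of `nrecords` records.

```python
import re
from collections import Counter

# Contents of state_2d_set1.0000008640.meta, copied from the post above.
META_TEXT = """
 dimList = [
  1080,    1, 1080,
   310,    1,  310
 ];
 dataprec = [ 'float32' ];
 nrecords = [         53 ];
 nFlds = [   53 ];
 fldList = {
 'ETAN    ' 'SIarea  ' 'SIheff  ' 'SIhsnow ' 'SItices ' 'SIhsalt ' 'SIuice  ' 'SIvice  ' 'SHIfwFlx' 'SHIhtFlx' 'SHI_TauX' 'SHI_TauY' 'DETADT2 ' 'PHIBOT  ' 'sIceLoad' 'MXLDEPTH' 'oceSPDep' 'SIatmQnt' 'SIatmFW ' 'oceQnet '
 'oceFWflx' 'oceTAUX ' 'oceTAUY ' 'oceSflux' 'TFLUX   ' 'SFLUX   ' 'EXFtaux ' 'EXFtauy ' 'EXFlwnet' 'EXFswnet' 'EXFswdn ' 'EXFlwdn ' 'EXFqnet ' 'EXFhs   ' 'EXFhl   ' 'EXFevap ' 'EXFpreci' 'EXFatemp' 'SIqnet  ' 'SIqsw   '
 'SIatmQnt' 'SItflux ' 'SIaaflux' 'SIhl    ' 'SIqneto ' 'SIqneti ' 'SIempmr ' 'SIatmFW ' 'SIsnPrcp' 'SIactLHF' 'SIacSubl' 'botTauX ' 'botTauY '
 };
"""


def summarize_meta(text):
    """Hypothetical helper: extract dims, record count, precision and field names."""
    dim_block = re.search(r"dimList\s*=\s*\[(.*?)\]", text, re.S).group(1)
    dim_values = [int(v) for v in re.findall(r"\d+", dim_block)]
    # Assumed layout: (global size, first index, last index) per dimension.
    dims = dim_values[0::3]
    nrecords = int(re.search(r"nrecords\s*=\s*\[\s*(\d+)\s*\]", text).group(1))
    dataprec = re.search(r"dataprec\s*=\s*\[\s*'(\w+)'\s*\]", text).group(1)
    fld_block = re.search(r"fldList\s*=\s*\{(.*?)\}", text, re.S).group(1)
    fields = [name.strip() for name in re.findall(r"'([^']*)'", fld_block)]
    return dims, nrecords, dataprec, fields


dims, nrecords, dataprec, fields = summarize_meta(META_TEXT)

# Size the matching .data file should have if it is a flat binary of all records.
bytes_per_value = {"float32": 4, "float64": 8}[dataprec]
values_per_record = 1
for n in dims:
    values_per_record *= n
expected_bytes = values_per_record * nrecords * bytes_per_value

# Field names that appear more than once in fldList.
duplicates = sorted(name for name, count in Counter(fields).items() if count > 1)

print("dims per record:", dims)
print("records:", nrecords, "fields listed:", len(fields))
print("expected .data size in bytes:", expected_bytes)
print("names listed more than once:", duplicates)
```

The expected size can be compared against `os.path.getsize` on the matching .data file. Note that the field list as posted names SIatmQnt and SIatmFW twice each; whether that has anything to do with the error is not established in this thread.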

@rabernat
Member

Can you share the code you are using to open the data?

@ruth-moorman
Author

Sure, the error comes up with a few variations on the basic open_mdsdataset call, including just the basic
state_2d = open_mdsdataset(rootdir+'state_2d_set1/')
and including time info, for example,
state_2d = open_mdsdataset(rootdir+'state_2d_set1/', delta_t=1200, ref_date='1991-12-15 0:0:0')

@lily-dove

Hi all, I'm having the same issue. If someone has the solution, I'd appreciate hearing it! :)

@timothyas
Member

Hi @ruth-moorman, does it work if you add the arguments geometry="llc", nx=270?

@ruth-moorman
Author

Hiya @timothyas, sorry for the weird delay here. I ended up not working with that output, but I am now having the same issue with different output from an LLC540 configuration. Again, the issue only occurs with 2D variables. In this case I know I should be using geometry = 'curvilinear', and I am (and, again, this works for 3D variables).

So for example I'm calling:

ds = xmitgcm.open_mdsdataset('../llc540_notides_cycle2/results/diags/', grid_dir = '../llc540_notides_cycle2/results/',prefix = ['state_2d_set1'], geometry='curvilinear',delta_t=480, ref_date = '1993-1-1 0:0:0',iters=iterations[0])

and getting

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 2
      1 # however in this notebook I'm mostly concerned with understanding the bathymetry, so I'll just infile and name an iter0 dataset (kind of a dummy,
----> 2 ds = xmitgcm.open_mdsdataset(llc540_dir, grid_dir = llc540_dir_grid,prefix = ['state_2d_set1'], geometry=geometry,delta_t=delta_t, ref_date = ref_date,iters=iterations[0])
      3 # ds = add_latlon(ds)
      4 # grid = xgcm.Grid(ds, periodic='X')
      5 # ds

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xmitgcm/mds_store.py:273, in open_mdsdataset(data_dir, grid_dir, iters, prefix, read_grid, delta_t, ref_date, calendar, levels, geometry, grid_vars_to_coords, swap_dims, endian, chunks, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method, extra_metadata, extra_variables)
    270                 ds = _set_coords(ds)
    271             return ds
--> 273 store = _MDSDataStore(data_dir, grid_dir, iternum, delta_t, read_grid,
    274                       prefix, ref_date, calendar,
    275                       geometry, endian,
    276                       ignore_unknown_vars=ignore_unknown_vars,
    277                       default_dtype=default_dtype,
    278                       nx=nx, ny=ny, nz=nz, llc_method=llc_method,
    279                       levels=levels, extra_metadata=extra_metadata,
    280                      extra_variables=extra_variables)
    282 ds = xr.Dataset.load_store(store)
    283 if swap_dims:

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xmitgcm/mds_store.py:596, in _MDSDataStore.__init__(self, data_dir, grid_dir, iternum, delta_t, read_grid, file_prefixes, ref_date, calendar, geometry, endian, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method, levels, extra_metadata, extra_variables)
    593 # Create masks from hFac variables
    594 data = self.calc_masks(vname, data)
--> 596 thisvar = xr.Variable(dims, data, attrs)
    597 self._variables[vname] = thisvar

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xarray/core/variable.py:367, in Variable.__init__(self, dims, data, attrs, encoding, fastpath)
    347 """
    348 Parameters
    349 ----------
   (...)
    364     unrecognized encoding items.
    365 """
    366 self._data = as_compatible_data(data, fastpath=fastpath)
--> 367 self._dims = self._parse_dimensions(dims)
    368 self._attrs = None
    369 self._encoding = None

File ~/miniforge3/envs/pangeo/lib/python3.11/site-packages/xarray/core/variable.py:683, in Variable._parse_dimensions(self, dims)
    681     dims = tuple(dims)
    682 if len(dims) != self.ndim:
--> 683     raise ValueError(
    684         f"dimensions {dims} must have the same length as the "
    685         f"number of data dimensions, ndim={self.ndim}"
    686     )
    687 return dims

ValueError: dimensions ('time', 'j', 'i') must have the same length as the number of data dimensions, ndim=2
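
Reading the traceback, the error is raised when three dimension names are paired with an array that has only two axes. The sketch below is a plain-Python paraphrase of the length check visible in the last frame; it is not xarray's code, and `parse_dimensions` and the array shape are assumptions made for illustration only.

```python
def parse_dimensions(dims, shape):
    """Paraphrase of the check in the traceback: one name is needed per array axis."""
    dims = tuple(dims)
    ndim = len(shape)
    if len(dims) != ndim:
        raise ValueError(
            f"dimensions {dims} must have the same length as the "
            f"number of data dimensions, ndim={ndim}"
        )
    return dims


# Hypothetical shape of a single 2D record (j, i) with no leading time axis.
shape_2d = (310, 1080)

try:
    parse_dimensions(("time", "j", "i"), shape_2d)
    message = None
except ValueError as err:
    message = str(err)
print(message)

# The same names are accepted once the array carries a length-1 time axis.
ok = parse_dimensions(("time", "j", "i"), (1,) + shape_2d)
print(ok)
```

This only shows what the check compares. It does not explain why the 2D data reach that point without a time axis while the 3D fields do not.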

@timothyas
Member

Hi @ruth-moorman, does it work to either not specify iters, or specify iters=[iterations[0]]? The type specification for the iters argument is a list, so this could be it. That's just a guess though...

@ruth-moorman
Author

@timothyas thanks for the suggestion, but it doesn't look like it's the iters. iters=iterations[0], iters='all', no iters input, and iters=[iterations[0]] all give the same error for the 2D fields. Just stressing, in case it helps, that I do not get this error with 3D fields for any of those listed values of iters.

i.e. this: xmitgcm.open_mdsdataset(llc540_dir, grid_dir = llc540_dir_grid,prefix = ['layers_3d_set2','fluxes_3d_set1','trsp_3d_set1','state_3d_set1'], geometry=geometry, delta_t=delta_t, ref_date = ref_date,iters=iterations[0])
works totally fine

@timothyas
Member

Hi @ruth-moorman, too bad that wasn't the issue. I'm not really sure what's going on. I cannot reproduce the error using the curvilinear_leman dataset in xmitgcm's test suite. If there's any way you can make the data public, I'd be happy to help you out further. I'm also curious how/why you are using a curvilinear geometry with the llc540 geometry - is the entire model domain on just one of the llc faces?


4 participants