Value Error when using bilinear interpolation method #197

Open
jdldeauna opened this issue Sep 27, 2022 · 8 comments
@jdldeauna

Hi! I was using the bilinear method and kept encountering an error message:

import gcsfs
import xarray as xr
import xesmf as xe

fs = gcsfs.GCSFileSystem(token='anon', access='read_only')
mapper = fs.get_mapper('gs://cmip6/CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/historical/r1i1p1f1/Omon/thetao/gn/v20190710/')
ds = xr.open_zarr(mapper, consolidated=True)
target_grid = xe.util.grid_global(1, 1)
regridder = xe.Regridder(ds, target_grid, 'bilinear')
var_regrid = regridder(ds)

Error:

ValueError: ESMC_FieldRegridStore failed with rc = 506. Please check the log files (named "*ESMF_LogFile").

Fortunately @jbusecke guided me to a solution:

regridder = xe.Regridder(ds, target_grid, 'bilinear', ignore_degenerate=True)

I was wondering if the error message could be edited to mention that adding the kwarg ignore_degenerate=True resolves it? It's difficult to work out how to resolve the issue from the existing ValueError message. Thanks!

@jbusecke

I just wanted to chime in and say that this could be very helpful for the broader community. Since I found the workaround at some point in the past, I have interacted with many folks who experienced this issue.

Also many thanks for raising this issue @jdldeauna!

@sol1105
Contributor

sol1105 commented Sep 28, 2022

Hi @jdldeauna,

this problem is caused by the data having been published on a grid which still contains the grid halo (rows or columns of duplicated grid cells) that is required during the model run. The halo cells are often not properly defined: they collapse to lines or points and share only their cell centers with the original cells, which then leads to this ValueError in ESMF (see #109).

This problem exists for quite a few CMIP ocean models. What you can usually do (besides using ignore_degenerate=True) is to remove the grid halo before creating remapping weights with xesmf or other tools. Beware that not removing the grid halo can lead to incorrect results when using the conservative remapping method. Also beware that data defined not on the grid cell centers but on the vertices / edges require special treatment.

Here is an example with CDO (Climate Data Operators):
https://wiki.mpimet.mpg.de/doku.php?id=analysis:postprocessing_mpiesm:regridding
(here, GR15 corresponds to the MPI-ESM1-2-LR MPIOM grid and TP04 to the MPI-ESM1-2-HR MPIOM grid)

And here is an example for xarray, for MPI-ESM1-2-LR/HR (see also #109 or this notebook):

import xarray as xr

# Define paths to the data
path_mpiesmlr = "./tos_Omon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_201001-201412.nc"
path_mpiesmhr = "./tos_Omon_MPI-ESM1-2-HR_historical_r1i1p1f1_gn_201001-201412.nc"

# Read the data with the halo
ds_mpiesmlr_halo = xr.open_dataset(path_mpiesmlr).isel(time=0)
ds_mpiesmhr_halo = xr.open_dataset(path_mpiesmhr).isel(time=0)

# Read the data without the halo
# (LR: omit the first and last column; HR: omit the first and last column and the first 2 rows)
ds_mpiesmlr = xr.open_dataset(path_mpiesmlr).isel(time=0, i=slice(1, 255))
ds_mpiesmhr = xr.open_dataset(path_mpiesmhr).isel(time=0, i=slice(1, 801), j=slice(2, 404))
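With the halo removed, the bilinear weights can usually be generated without ignore_degenerate=True. A minimal sketch (the target grid and the variable name tos are assumptions based on the example at the top of this issue, not part of the original comment):

import xesmf as xe

# build bilinear weights from the trimmed (halo-free) source grid
target_grid = xe.util.grid_global(1, 1)
regridder = xe.Regridder(ds_mpiesmhr, target_grid, "bilinear")
tos_regrid = regridder(ds_mpiesmhr["tos"])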

@jdldeauna
Author

Thank you for the comprehensive reply @sol1105! Maybe the error message from xesmf could be improved to help users figure out these possible workarounds? Just referring to the log file is confusing, especially if users are not too familiar with how xesmf works internally.

@jbusecke

jbusecke commented Oct 4, 2022

This is a really helpful explanation @sol1105. I was not actually aware of this.
I have two thoughts:

  1. I think it would be very helpful to document this in the main documentation. Happy to help with this and e.g. include a CMIP6 example.
  2. Do you by any chance know if it is possible to identify whether halo cells are included via the cf-metadata or in some other automated way for many different models?
    I would like to include such a 'trimming' step (as you show for the MPI model above) in the CMIP6-specific preprocessing over at xMIP to solve this issue for a broader user base.

@huard
Contributor

huard commented Nov 23, 2022

@jbusecke Please do submit a PR for this, we should be able to give it a quick review.

@jbusecke

jbusecke commented Dec 6, 2022

Happy to work on this next week. Could you assign me to this? Thx

@huard added the documentation label Dec 6, 2022
@huard added this to the v0.7 milestone Dec 6, 2022
@sol1105
Contributor

sol1105 commented Dec 13, 2022

@jbusecke

Do you by any chance know if it is possible to identify whether halo cells are included via the cf-metadata or in some other automated way for many different models?

There is no way to infer that information from the metadata. To see whether duplicate cells are present, one would have to look into the cell coordinates, for example:

(from #109)

import numpy as np

# 2D latitude and longitude arrays - build an array of (lat, lon) tuples
latlon_halo = np.array(
    list(zip(ds["latitude"].values.ravel(), ds["longitude"].values.ravel())),
    dtype="double,double",
).reshape(ds["longitude"].values.shape)

# use numpy.unique to identify unique columns
latlon_no_halo, indices = np.unique(latlon_halo, axis=1, return_index=True)
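
A rough follow-up check (a sketch, not part of the original comment): if there are fewer unique columns than columns in the original array, duplicated (halo) columns are present.

# fewer unique columns than original columns indicates duplicated (halo) columns
has_halo = latlon_no_halo.shape[1] < latlon_halo.shape[1]
print("duplicated columns present:", has_halo)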

Alternatively, one could remap an array of ones with the generated weights and check whether there are any values > 1 in the resulting array, which would hint at duplicated or at least overlapping cells. If so, adaptive masking can be used when remapping the actual data, which renormalizes the results and thus gets rid of the double contributions. If one wants to renormalize only contributions > 1 (i.e. duplicated cells) and not < 1 (i.e. cells with contributions from masked or out-of-domain cells), one can remap using ..., skipna=True, na_thres=0.0. Maybe this information could be given to the user if duplicated cells are found.
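
A minimal sketch of the ones-check and the adaptive-masking call described above (the conservative regridder, target grid and variable name tos are assumptions, not taken from this thread, and the source dataset is assumed to carry cell bounds that xESMF can locate):

import xarray as xr
import xesmf as xe

# conservative remapping is the case where duplicated halo cells can inflate results
target_grid = xe.util.grid_global(1, 1)
regridder = xe.Regridder(ds, target_grid, "conservative", ignore_degenerate=True)

# remap a field of ones: destination values > 1 hint at duplicated / overlapping source cells
ones = xr.ones_like(ds["tos"])
ones_regrid = regridder(ones)
print("values > 1 present:", bool((ones_regrid > 1).any()))

# adaptive masking: renormalize so that duplicated (> 1) contributions are removed
tos_regrid = regridder(ds["tos"], skipna=True, na_thres=0.0)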

I would like to include such a 'trimming' step (as you show for the MPI model above) in the CMIP6 specific preprocessing over at xMIP to solve this issue for a broader user base.

I did not find a way so far to perform this trimming in an automated way. The problem is, that the halo cells are often improperly defined (with collapsing or wrong bounds), so one has to select "the right" cell to be removed / masked. So probably notifying the user about duplicated cells in his array and his options to either use adaptive masking or manually remove / mask them, would be the best solution?

@jbusecke

Thanks for that explanation @sol1105!

@huard modified the milestones: v0.7, v0.8 Jan 11, 2023