Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Masking out-of-source-domain data points for nearest neighbour remapping #317

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

sol1105
Copy link
Contributor

@sol1105 sol1105 commented Dec 5, 2023

When remapping to a larger domain with the method nearest_s2d, data points outside the original domain will - by the nature of this method - be extrapolated, i. e. have values from the edge of the original domain.

This PR introduces a Regridder option to apply a "domain mask" that is generated by creating remapping weights with the bilinear method as an extra step (while having unmapped_to_nan activated) before creating the nearest neighbour remapping weights. This way, all data points that would count as unmapped for the bilinear method will also be masked for nearest_s2d.

This is useful when remapping data on regional grids such as CORDEX data. A demonstration can be seen here at the bottom of Masking.ipynb:
https://github.com/sol1105/xESMF/blob/domain_mask/doc/notebooks/Masking.ipynb

I added it as an option to the Regridder but the linting suggests that the __init__ method gets too complex with this addition. I could also add this masking option as a separate function or Regridder method. But before putting more work into this I would welcome your general feedback (and if you welcome the general idea of this masking option, also specific feedback on how to preferably deal with the complexity issue).

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@huard huard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a nice addition.

Couple of questions:

  • Does this functionality have to be integrated in the Regridder? Could it be applied on the output by the user, or could we pre-compute the mask and pass the mask values to the Regridder?
  • Is it reasonable to believe there might be other ways to generate a mask? If yes, then I believe it would make more sense to have a string argument for the mask generation method, that takes as input the method name, instead of a boolean.

@sol1105
Copy link
Contributor Author

sol1105 commented Dec 18, 2023

@huard Thanks for your reply. I integrated it in the Regridder since creating the mask requires creating a Regridder object as well and so all the checks related to the dataset were required only once. However, since the Regridder initialisation then fails the linting checks with regards to the added complexity, I will have to move it to a separate method or function. I will try to find a general approach so it will be possible to later add further methods to generate masks.

@huard
Copy link
Contributor

huard commented Dec 18, 2023

Don't worry too much about the linter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants