Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have smoothing functions return named DataArrays #3258

Open
sgdecker opened this issue Nov 1, 2023 · 6 comments
Open

Have smoothing functions return named DataArrays #3258

sgdecker opened this issue Nov 1, 2023 · 6 comments
Labels
Area: Xarray Pertains to xarray integration Type: Feature New functionality

Comments

@sgdecker
Copy link
Contributor

sgdecker commented Nov 1, 2023

What should we add?

Currently, functions like smooth_gaussian return DataArrays with no names. This causes problems if you want to combine DataArrays into a Dataset, as in the following example:

import datetime
import xarray as xr
import metpy.calc as mpcalc
from metpy.io import GempakGrid

plot_time = datetime.datetime(2023, 10, 30, 18)

gem_name = '/ldmdata/gempak/model/nam/23103012_nam211.gem'
gem_data = GempakGrid(gem_name)

u = gem_data.gdxarray(parameter='UREL', date_time=plot_time, level=850)[0]
v = gem_data.gdxarray(parameter='VREL', date_time=plot_time, level=850)[0]

# Fails since smooth_gaussian returns anonymous DataArray
#us = mpcalc.smooth_gaussian(u, 8)
#vs = mpcalc.smooth_gaussian(v, 8)
#wind = xr.merge((us, vs))

us = mpcalc.smooth_gaussian(u, 8).rename(u.name + '_gaussian_smoothed')
vs = mpcalc.smooth_gaussian(v, 8).rename(v.name + '_gaussian_smoothed')
wind = xr.merge((us, vs))

My feature request is that functions like this (where the output is of the same nature as the input), when given DataArrays as input, return DataArrays with reasonable names, akin to what the working version of the code does, but internally, so that users need not use .rename themselves.

Reference

No response

@sgdecker sgdecker added the Type: Feature New functionality label Nov 1, 2023
@dopplershift
Copy link
Member

#2369 is another instance of this problem. I think it might be possible to have our xarray handling automatically add a name. The obvious part is including something like '_{func.__name__}'. The question is, how do we leverage the input?

It's obvious in this case, where you could use the name of the 1 xarray input. But what would you do for e.g. vorticity?

@dopplershift dopplershift added the Area: Xarray Pertains to xarray integration label Nov 1, 2023
@sgdecker
Copy link
Contributor Author

sgdecker commented Nov 1, 2023

For vorticity, I would take my "the output is of the same nature as the input" out and do nothing! But I suppose that isn't ideal, so in cases where the output is based on multiple inputs, perhaps not leveraging the input at all is best. Thus, rather than appending '_{func.__name__}' to an existing name, for vorticity the output name would just be '{func.__name__}'. But maybe the issue is the general xarray handling code doesn't know if it is being called from smooth_gaussian vs vorticity?

@dopplershift
Copy link
Member

Actually, that's a really good point that 1 argument should be var_funcname and multiple should be funcname. The good news is that the way the xarray handling is done, this should be do-able as far as I can see (always possible I'm missing something @jthielen @dcamron ?)

One limitation is that this doesn't help us solve #2369, but that doesn't eliminate this approach.

One outstanding question: Do we have any functions that return multiple values that we need to consider how to handle?

@jthielen
Copy link
Collaborator

jthielen commented Nov 1, 2023

Actually, that's a really good point that 1 argument should be var_funcname and multiple should be funcname. The good news is that the way the xarray handling is done, this should be do-able as far as I can see (always possible I'm missing something @jthielen @dcamron ?)

Looking back at the relevant code, this should definitely be possible, but I suspect it may not be the wisest approach as a uniform principle (e.g., not sure we really want saturation_vapor_pressure returning something like temperature_saturation_vapor_pressure). Perhaps we need to add some kind of rename parameter for the decorator, to cover differing sets of desired behavior? That being said, I unfortunately don't see a way of doing this which doesn't add yet another layer of complexity to the xarray handling.

One outstanding question: Do we have any functions that return multiple values that we need to consider how to handle?

I unfortunately can't recall any examples off the top of my head, but the preprocess_and_wrap decorator very much has handling of multiple return values, so I would assume it would need to be addressed somehow.

@dopplershift
Copy link
Member

Looking back at the relevant code, this should definitely be possible, but I suspect it may not be the wisest approach as a uniform principle (e.g., not sure we really want saturation_vapor_pressure returning something like temperature_saturation_vapor_pressure). Perhaps we need to add some kind of rename parameter for the decorator, to cover differing sets of desired behavior?

Curses, foiled again. Your suggestion sounds like a good way to progress. It will result in some unfortunate complexity. I suppose this isn't a uniquely MetPy problem. I assume that's because it's not as common for people to put these back into Datasets.

@sgdecker What's the original use case for wanting the data to be present within a Dataset?

@sgdecker
Copy link
Contributor Author

sgdecker commented Nov 1, 2023

@sgdecker What's the original use case for wanting the data to be present within a Dataset?

The immediate use case here was to create a BarbPlot. A more speculative use case would be code that is in a function to create a "QG-ready" Dataset from a GEMPAK file.

@jthielen 's example of temperature_saturation_vapor_pressure gets at the heart of what "of the same nature" means. I wonder if the rule could be, if the output is not a tuple and the units of the output are the same as the units of the first argument, only then is the function name appended to the existing name. In all other cases, the output name is just the function name alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Xarray Pertains to xarray integration Type: Feature New functionality
Projects
None yet
Development

No branches or pull requests

3 participants