Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The finding of the optimal radius of influence makes assumption on the ordering of the dimensions in the longitude array #554

Open
adybbroe opened this issue Nov 17, 2023 · 4 comments · May be fixed by #555
Assignees

Comments

@adybbroe
Copy link
Contributor

adybbroe commented Nov 17, 2023

Code Sample, a minimal, complete, and verifiable piece of code

The problem is well illustrated by reading and trying to remap the AWS tesdata:

>>> from satpy import Scene
>>> FILENAMES = ["/home/a000680/data/aws_testdata_from_nigel/W_XX-OHB-Stockholm,SAT,AWS1-MWR-1B-RAD_C_OHB_20230816120142_G_D_20240115111111_20240115125434_T_B____radsim.nc"]
>>> AREAID = 'eurol'
>>> scn = Scene(filenames=FILENAMES, reader='aws_l1b_nc')
>>> scn.load(['1'])
>>> local = scn.resample(AREAID)
>>> local.show('1')
Could not calculate source definition resolution

aws_testdata_ch1_test

See the comments in the Satpy PR here: pytroll/satpy#2565

In the method to determine the optimal radius of influence to be used when reampping data (

def geocentric_resolution(self, ellps='WGS84', radius=None, nadir_factor=2):
) the assumption is that the first dimension on the longitude array dscribes the scanlines/rows of the (e.g. channel) dataset.

The specific code lines that should be adapted in case that self.lons is an xarray.DataArray is this part I believe:

        rows = self.shape[0]
        start_row = rows // 2  # middle row
        src = CRS('+proj=latlong +datum=WGS84')
        if radius:
            dst = CRS("+proj=cart +a={} +b={}".format(radius, radius))
        else:
            dst = CRS("+proj=cart +ellps={}".format(ellps))
        # simply take the first two columns of the middle of the swath
        lons = self.lons[start_row: start_row + 1, :2]
        lats = self.lats[start_row: start_row + 1, :2]

I would propose something like this instead:

        rows = self.lons['y'].shape[0]
        start_row = rows // 2  # middle row
        src = CRS('+proj=latlong +datum=WGS84')
        if radius:
            dst = CRS("+proj=cart +a={} +b={}".format(radius, radius))
        else:
            dst = CRS("+proj=cart +ellps={}".format(ellps))
        # simply take the first two columns of the middle of the swath
        lons = self.lons.sel(y=start_row)[:2]
        lats = self.lats.sel(y=start_row)[:2]

Problem description

[this should also explain why the current behaviour is a problem and why the
expected output is a better solution.]

Expected Output

Actual Result, Traceback if applicable

Versions of Python, package at hand and relevant dependencies

@djhoese
Copy link
Member

djhoese commented Nov 17, 2023

As mentioned in your pull request we can't assume xarray is available in Pyresample. That is standard in Satpy, but not pyresample. If I'm understanding your PR correctly, your swath DataArrays have xarray dimensions ("x", "y")? Is that correct? In that case, I have two possible solutions that I see:

  1. Update the geocentric resolution to take pixels in the middle row first two columns like it is now and the middle column first two rows and take the largest resolution size found.
  2. Update your reader in Satpy to be ("y", "x"). While technically it shouldn't matter, I'm a little scared that this is only one piece of the Satpy/Pyresample puzzle that you're finding an issue with, but that we make the assumption of ("y", "x") in many places.

@adybbroe
Copy link
Contributor Author

Ok, thanks for the comments and suggestions. And, yes, the dimensions are ("x", "y") which is non-standard, I know (see below).

Concerning option 2: Yes, but actually in this case it is the data format which is "wrong", or non-standard. So, for now I rather keep the reader as it is, and wait for the agency (ESA) to fix the format. They have acknowledged the issue at least.

But, as the reader returns an xarray DataArray I thought it would be appropriate to actually try use that information, rather than make assumption on the data layout.

Having said that, I also like your option 1. So, would you prefer:

  1. Keep the suggestion above when the data is an xarray DataArray, and then improve the code for Numpy cases as you propose in your option 1? Or
  2. Skip using the xarray capability and improve as in your option 1, which will will work in all cases?

@djhoese
Copy link
Member

djhoese commented Nov 21, 2023

Concerning option 2: Yes, but actually in this case it is the data format which is "wrong", or non-standard. So, for now I rather keep the reader as it is, and wait for the agency (ESA) to fix the format. They have acknowledged the issue at least.

I don't agree with this, but I'm not doing the coding so I can accept it. Everyone is admitting it is "wrong" (or at least unexpected) and the solution should be a simple .T on the array. Anyway...

Having said that, I also like your option 1. So, would you prefer:

  1. Keep the suggestion above when the data is an xarray DataArray, and then improve the code for Numpy cases as you propose in your option 1? Or
  2. Skip using the xarray capability and improve as in your option 1, which will will work in all cases?

I prefer 2. It (at least theoretically) should only improve the accuracy of this method which is at best a guess. I'd prefer to avoid Xarray workarounds as much as possible, especially when the alternative is "make this functionality work better", but I could also be missing some complexity in this task so I could be convinced otherwise.

@gerritholl
Copy link
Collaborator

gerritholl commented Nov 22, 2023

I agree that option 2 is preferable, and that the reader should be adapted such that the order of dimensions is (y, x). This is the simplest solution that allows other parts of pytroll to stick to the assumption that dimensions have this order, without the need to complicate the code everywhere where that assumption is currently made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants