Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opening a MODIS dataset brings seemingly random values across, rest are 0 #122

Open
scheip opened this issue Jan 5, 2024 · 5 comments
Open
Labels
bug Something isn't working MODIS Issues with MODIS data P1

Comments

@scheip
Copy link

scheip commented Jan 5, 2024

Hi folks, cannot explain my appreciation and excitement for xee, what a great package. I have used this successfully with other datasets (e.g., ERA-5) but am now trying to work with MODIS data. I am experiencing a problem where every time I use xr.open_dataset... I get a unique result. Code example:

# specify study area
ur = [35.89, -81.87] # y, x
ll = [35.83, -81.945] # y, x   # full region
coords = [ll[1], ll[0], ll[1], ur[0], ur[1], ur[0], ur[1], ll[0]]
geom = ee.Geometry.Polygon(coords)

# Specify image collections
ic = ee.ImageCollection("MODIS/061/MCD15A3H").filterDate('2016-06-01', '2024-01-01').filterBounds(geom).select('Lai')

ds = xr.open_dataset(
    ic,
    engine='ee',
    crs='EPSG:32617',
    scale=500,
    geometry=geom
)

# plot lai, average across x and y
ds['Lai'].mean(dim=['X', 'Y']).plot()

If I run this code 5 times, I get 5 different time-series plots. In every case, a substantial portion of the data is 0. Here are 3 example outputs of running the above code. Why would it be different every time?

I am currently on 0.0.5 of xee:
xee 0.0.5 pyhd8ed1ab_0 conda-forge
Thank you for any pointers.

run1
run2
run3

@scheip
Copy link
Author

scheip commented Jan 5, 2024

I can confirm I do not see this behavior with another EE dataset. For example, the only change in the below is switching over to an ERA5 dataset:

# specify study area
ur = [35.89, -81.87] # y, x
ll = [35.83, -81.945] # y, x   # full region
coords = [ll[1], ll[0], ll[1], ur[0], ur[1], ur[0], ur[1], ll[0]]
geom = ee.Geometry.Polygon(coords)

# Specify image collections
ic = ee.ImageCollection("ECMWF/ERA5_LAND/MONTHLY_AGGR").filterDate('2016-06-01', '2024-01-01').filterBounds(geom).select('temperature_2m')

ds = xr.open_dataset(
    ic,
    engine='ee',
    crs='EPSG:32617', # utm zone 17n
    scale=500,
    geometry=geom
)

# plot lai, average across x and y
ds['temperature_2m'].mean(dim=['X', 'Y']).plot()

And this yields the same result every time, as expected.

@noahgolmant
Copy link

Is there a mismatch between the bounds CRS and the requested CRS of the dataset? I am not exactly sure why the issue is non-deterministic as a result. I ran this on my own test area, Pará, Brazil, and observed similar zero values until I adjusted this. When I assume EPSG:4326 for your bounds, I think the boundary is in Antarctica, so the LAI is 0 as expected.

With the original code:

import ee
import geopandas as gpd
from shapely.geometry import mapping
import xarray as xr
import xee
from ingest.consts import SERVICE_PATH, EMAIL

credentials = ee.ServiceAccountCredentials(EMAIL, SERVICE_PATH)
ee.Initialize(credentials)

bounds = gpd.read_file("/home/noah/ingest/configs/data/acre_tiny_test_site.geojson").iloc[0].geometry
geom = ee.Geometry(mapping(bounds))

# Specify image collections
ic = ee.ImageCollection("MODIS/061/MCD15A3H").filterDate('2016-06-01', '2024-01-01').filterBounds(geom).select('Lai')

ds = xr.open_dataset(
    ic,
    engine='ee',
    crs='EPSG:32617',
    scale=500,
    geometry=geom
)

# plot lai, average across x and y
ds['Lai'].mean(dim=['X', 'Y']).plot()
image

With the following variant, setting CRS to 4326, we see no zeros and expected seasonal variation:

# Specify image collections
ic = ee.ImageCollection("MODIS/061/MCD15A3H").filterDate('2016-06-01', '2024-01-01').filterBounds(geom).select('Lai')

ds = xr.open_dataset(
    ic,
    engine='ee',
    crs='EPSG:4326',
    geometry=geom
)

# plot lai, average across x and y
ds['Lai'].mean(dim=['lon', 'lat']).plot()
image

@scheip
Copy link
Author

scheip commented Jan 10, 2024

Thank you @noahgolmant for trying this out! I don't know what happened, perhaps an update on GEE's side, but the code I listed in my first message now produces expected results, with no zeros and no code changes. Very strange!

I'd be curious if you re-ran your first code block if you also now get expected results?

@scheip scheip closed this as completed Jan 10, 2024
@scheip scheip reopened this Jan 30, 2024
@scheip
Copy link
Author

scheip commented Jan 30, 2024

Strange - running the same troubleshooting code pasted above again is resulting in spurious/missing values. Any further thoughts on this?

Copying the troubleshooting code here:

# specify study area
ur = [35.89, -81.87] # y, x
ll = [35.83, -81.945] # y, x   # full region
coords = [ll[1], ll[0], ll[1], ur[0], ur[1], ur[0], ur[1], ll[0]]
geom = ee.Geometry.Polygon(coords)

# Specify image collections
ic = ee.ImageCollection("MODIS/061/MCD15A3H").filterDate('2016-06-01', '2024-01-01').filterBounds(geom).select('Lai')

ds = xr.open_dataset(
    ic,
    engine='ee',
    crs='EPSG:32617',
    scale=500,
    geometry=geom
)

# plot lai, average across x and y
ds['Lai'].mean(dim=['X', 'Y']).plot()

@jdbcode jdbcode added bug Something isn't working MODIS Issues with MODIS data labels Jan 30, 2024
@naschmitz
Copy link
Collaborator

Sounds like this is related to #119. We're looking into it.

@naschmitz naschmitz added the P1 label Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working MODIS Issues with MODIS data P1
Projects
None yet
Development

No branches or pull requests

4 participants