Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update _slice_collection to use toBands and select #91

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

boothmanrylan
Copy link
Contributor

Avoids the expensive call to toList unless the ImageCollection contains more than 5000 images and we are slicing past the 5000th image.

Fixes #88 where xee wasn't recognizing new bands added to existing images or renamed bands.

Avoids the expensive call to toList unless the ImageCollection contains
more than 5000 images and we are slicing past the 5000th image.

Fixes issue where xee wasn't recognizing new bands added to existing
images.
Copy link

google-cla bot commented Nov 2, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

xee/ext.py Outdated
if self.shape[0] <= 5000: # 5000 == max bands in an Image
col_as_image = col.toBands()
return col_as_image.select(selectors)
elif stop < 5000: # 5000 == max bands in an Image
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wont the stride effect this number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. Even if the start/stride arguments would make the final image have fewer than 5000 bands, I'm avoiding toList by converting to a multiband image before slicing so I need to grab the first "stop" bands and toBands will fail if it would return an image with more than 5000 bands.

@alxmrs
Copy link
Collaborator

alxmrs commented Nov 4, 2023 via email

recommended by Noel Gorelick as a more efficient method to slice an
ImageCollection
@boothmanrylan
Copy link
Contributor Author

Discussion with Noel Gorelick on Earth Engine Developer forums about a more efficient way to slice an image collection: https://groups.google.com/g/google-earth-engine-developers/c/fSAo9Jn615U

The recommendation was to get a list of system:index, slice that list, then filter the image collection for the remaining system:index

@alxmrs
Copy link
Collaborator

alxmrs commented Nov 7, 2023 via email

@boothmanrylan
Copy link
Contributor Author

We likely need someone more familiar with earth engine's internals than I am to confirm this, but I believe the main differences between system:id and system:index are:

  • system:id is a unique identifier of an object from one of the collections in earth engine's data catalogue while system:index is a unique identifier of an object within the specific context of one collection. This means you can use the system:id to load an image from the data catalogue, but you cannot use the system:index to do this because the system:index is specific to one collection that could be user created.
  • system:id behaves like a regular property (i.e. it can be modified and it isn't guaranteed to exist) while system:index behaves differently (it is automatically created if you create a collection of constant images and it cannot be modified by mapping over a collection)

I think the issue with how system:id was originally being used comes from here:

 col = ee.ImageCollection(imgs)

Because this reloads the raw images from the data catalogue and therefore throws out any modifications that a user made to the collection beforehand.

I'm not sure that its possible to have an image collection without the images all having a unique system:index, but I will add the alternative methods back in as a fail safe because I'm not 100% certain of that.

Relevant gis stackexchange answer: https://gis.stackexchange.com/a/390307
Earth engine script where I was messing around with system:id and system:index: https://code.earthengine.google.com/8eb85fdb77aeb1adca98fe5eaa36246c

@naschmitz
Copy link
Collaborator

@boothmanrylan Could you sync and resolve the conflicts? Then, I can approve and run the integration tests against your PR.

@boothmanrylan
Copy link
Contributor Author

@naschmitz I resolved the conflicts.

The tests are passing for me locally, but failing here on the "Authenticate to Google Cloud" step which I'm not sure how to resolve. If there is something I need to change to get that to work please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

XEE doesn't recognize new bands added to existing images
3 participants