[WIP] t.stac.import: Add STAC API import functionality #802

Draft
wants to merge 35 commits into
base: grass8

Conversation

cwhite911
Contributor

I'm implementing STAC API import functionality into GRASS using pystac-client.

Currently, I'm modeling the module off of the STAC API request parameters:

  • collections – List of one or more Collection IDs
  • ids – List of one or more Item ids to filter on.
  • limit – A recommendation to the service as to the number of items to return per page of results. Defaults to 100.
  • max_items – The maximum number of items to return from the search, even if there are more matching results.
  • bbox – A list, tuple, or iterator representing a bounding box of 2D or 3D coordinates. Results will be filtered to only those intersecting the bounding box.
  • intersects – A string or dictionary representing a GeoJSON geometry, or an object that implements a geo_interface property, as supported by several libraries including Shapely, ArcPy, PySAL, and geojson. Results filtered to only those intersecting the geometry.
  • datetime – Either a single datetime or datetime range used to filter results.
  • query – List or JSON of query parameters as per the STAC API query extension
  • filter – JSON of query parameters as per the STAC API filter extension
  • filter_lang – Language variant used in the filter body. If filter is a dictionary or not provided, defaults to ‘cql2-json’. If filter is a string, defaults to cql2-text.

Detailed docs are at https://pystac-client.readthedocs.io
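As a rough illustration of how the parameters above might be gathered before being handed to pystac-client, here is a minimal sketch. The helper names `build_search_params` and `search_items` are hypothetical, as is the collection ID in the usage lines; only `Client.open` and `Client.search` are pystac-client calls. The check on `bbox` and `intersects` reflects the STAC API item search rule that only one of the two may be given.

```python
from typing import Optional


def build_search_params(
    collections: Optional[list] = None,
    ids: Optional[list] = None,
    bbox: Optional[list] = None,
    intersects: Optional[dict] = None,
    datetime: Optional[str] = None,
    query: Optional[dict] = None,
    filter: Optional[dict] = None,
    filter_lang: Optional[str] = None,
    limit: int = 100,
    max_items: Optional[int] = None,
) -> dict:
    """Collect the request parameters, dropping the ones that were not set."""
    if bbox is not None and intersects is not None:
        # STAC API item search accepts only one spatial filter per request.
        raise ValueError("Specify only one of bbox or intersects")
    params = {
        "collections": collections,
        "ids": ids,
        "bbox": bbox,
        "intersects": intersects,
        "datetime": datetime,
        "query": query,
        "filter": filter,
        "filter_lang": filter_lang,
        "limit": limit,
        "max_items": max_items,
    }
    return {key: value for key, value in params.items() if value is not None}


def search_items(url: str, **params):
    """Run the search; needs pystac-client installed and network access."""
    # Imported lazily so build_search_params stays usable without pystac-client.
    from pystac_client import Client

    return Client.open(url).search(**params)


# Hypothetical collection ID; the bbox is [west, south, east, north].
params = build_search_params(
    collections=["example-collection"],
    bbox=[-84.33, 33.83, -75.38, 36.59],
    datetime="2022-01-01/2022-12-31",
    max_items=10,
)
print(params)
```

Keeping the parameter assembly separate from the network call would make the option parsing testable without a STAC server.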

I'm thinking we should use the current computational region as the default bbox to filter the request and provide an option to define a raster or vector to use as the intersects parameter.
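A minimal sketch of the default-bbox idea, assuming the region is read in geographic coordinates with something like `gs.parse_command("g.region", flags="bg")`, which reports `ll_n`, `ll_s`, `ll_e`, and `ll_w`. The helper name `region_to_bbox` is hypothetical, and the region values are hard-coded so the sketch runs outside a GRASS session.

```python
def region_to_bbox(region: dict) -> list:
    """Convert g.region -bg style values to a STAC bbox [west, south, east, north]."""
    return [float(region[key]) for key in ("ll_w", "ll_s", "ll_e", "ll_n")]


# Stand-in for the dictionary a GRASS session would return (values are strings there).
region = {"ll_n": "36.59", "ll_s": "33.83", "ll_w": "-84.33", "ll_e": "-75.38"}
bbox = region_to_bbox(region)
print(bbox)
```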

For outputs, we have a few options.

  1. Import all items individually.
  2. Import a single patched raster.
  3. Import as a STRDS.
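For the STRDS option, one possible approach is to import each item as a raster map and then register the maps with their timestamps. The sketch below only prepares the registration lines, assuming the `name|start time` layout that `t.register` reads from its `file` option. The helper name `registration_lines` and the second map name are hypothetical.

```python
from datetime import datetime, timezone


def registration_lines(items):
    """Build 'map|start time' lines from (map_name, ISO 8601 datetime) pairs."""
    parsed = []
    for name, stamp in items:
        # fromisoformat() in older Python versions does not accept a trailing "Z".
        when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        parsed.append((when.astimezone(timezone.utc), name))
    # Sort chronologically so the file reads in time order.
    return [f"{name}|{when:%Y-%m-%d %H:%M:%S}" for when, name in sorted(parsed)]


lines = registration_lines(
    [
        ("scene_b", "2022-09-30T12:00:00Z"),  # hypothetical map name
        ("elevation", "2006-07-11T01:09:51Z"),
    ]
)
print("\n".join(lines))
```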

Alternatively, instead of having a single module r.in.stac, we could split it into multiple modules and include specific STAC extension parameters in each:

  • r.in.stac
  • i.in.stac
  • v.in.stac

Thoughts...

@cwhite911 cwhite911 changed the title [WIP] r.in.stac: Add STAC API import funtionality to GRASS [WIP] r.in.stac: Add STAC API import funtionality Sep 30, 2022
@cwhite911 cwhite911 changed the title [WIP] r.in.stac: Add STAC API import funtionality [WIP] r.in.stac: Add STAC API import functionality Sep 30, 2022
@metzm
Contributor

metzm commented Oct 4, 2022

When importing parts of a space-time collection, the corresponding output would be a GRASS space-time dataset, created with a t.* module, e.g. t.in.stac.

@veroandreo
Contributor

veroandreo commented Oct 4, 2022

> When importing parts of a space-time collection, the corresponding output would be a GRASS space-time dataset, created with a t.* module, e.g. t.in.stac.

What about simplifying all into:

  • t.rast.in.stac or t.in.stac.rast
  • t.vect.in.stac or t.in.stac.vect

or a toolset t.in.stac with raster and vector options as submodules?

STAC stands for spatio-temporal after all, and that's the beauty of it, no? In any case, big 👍 for this!!

Member

@ninsbl ninsbl left a comment

Nice. Good to see you are working on STAC.

I have been working on a module to import space-time data in NetCDF format here:
#794

Maybe you can get some inspiration and pick useful code snippets. If so, we may consider moving some of them into a library...

# % guisection: Request
# %end

# %option
Member

Isn't that the same as the "bbox" option?

# %option G_OPT_V_INPUT
# % key: intersects
# % description: Results filtered to only those intersecting the geometry.
# % guisection: Request
Member

Are there any other predefined options? If so, it would be good to provide them in an options list. If "intersects" is the only valid option, implementing this as a flag for intersection might be a simpler choice.

@ninsbl
Member

ninsbl commented Oct 5, 2022

> When importing parts of a space-time collection, the corresponding output would be a GRASS space-time dataset, created with a t.* module, e.g. t.in.stac.
>
> What about simplifying all into:
>
>   • t.rast.in.stac or t.in.stac.rast
>   • t.vect.in.stac or t.in.stac.vect
>
> or a toolset t.in.stac with raster and vector options as submodules?
>
> STAC stands for spatio-temporal after all, and that's the beauty of it, no? In any case, big 👍 for this!!

Agreed. As for module naming, there is t.rast.import and t.vect.import in core, so it might be more consistent with existing approaches to name modules t.rast.import.stac and t.vect.import.stac (though three dots are not standard either)...

P.S.: Sorry @veroandreo for accidentally editing your comment...

@cwhite911
Contributor Author

Thanks @veroandreo and @ninsbl for the feedback. We could also call the module t.stac.import with a raster or vector option that uses the sub-modules t.in.stac.rast or t.in.stac.vect in the background.

Another thing to consider is t.out.stac. I've started a proposal for a GRASS STAC extension using the nc_spm_08 sample dataset. I would like to be able to describe a grassdata directory as a STAC Catalog containing location STAC Collections, which in turn contain mapset STAC Collections that have assets. This would allow sharing GRASS datasets as explorable metadata that can be viewed in any STAC viewer or OpenPlains.

Here are some initial tests, but I'm still working on the first spec and want to get wider community feedback once the initial use cases are fleshed out.

grassdata: STAC Catalog

{
    "type": "Catalog",
    "id": "grassdata",
    "stac_version": "1.0.0",
    "description": "GRASS GIS STAC data catalog used by OpenPlains.",
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "child",
        "href": "https://example.com/grass_catalog/nc_spm_08/collection.json",
        "type": "application/json",
        "title": "nc_spm_08"
      },
      {
        "rel": "self",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      }
    ],
    "stac_extensions": []
  }
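A catalog like the one above could be generated from a list of location names. The following is only a sketch of that idea: the function name `make_catalog` and its arguments are hypothetical, and the URLs reuse the example.com placeholders from the proposal.

```python
import json


def make_catalog(base_url: str, locations: list, description: str) -> dict:
    """Build a STAC Catalog dictionary with one child link per GRASS location."""
    root = f"{base_url}/catalog.json"
    links = [{"rel": "root", "href": root, "type": "application/json"}]
    for name in locations:
        links.append(
            {
                "rel": "child",
                "href": f"{base_url}/{name}/collection.json",
                "type": "application/json",
                "title": name,
            }
        )
    links.append({"rel": "self", "href": root, "type": "application/json"})
    return {
        "type": "Catalog",
        "id": "grassdata",
        "stac_version": "1.0.0",
        "description": description,
        "links": links,
        "stac_extensions": [],
    }


catalog = make_catalog(
    "https://example.com/grass_catalog",
    ["nc_spm_08"],
    "GRASS GIS STAC data catalog used by OpenPlains.",
)
print(json.dumps(catalog, indent=2))
```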

nc_spm_08 location: STAC Collection

{
    "type": "Collection",
    "id": "nc_spm_08",
    "stac_version": "1.0.0",
    "description": "GRASS GIS Sample Datasets",
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "child",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/collection.json",
        "type": "application/json",
        "title": "PERMANENT"
      },
      {
        "rel": "parent",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      }
    ],
    "stac_extensions": [
      "https://stac-extensions.github.io/projection/v1.0.0/schema.json",
      "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"
    ],
    "grass:type": "location",
    "proj:epsg": 3358,
    "sci:citation": "GRASS Development Team, 2022. Geographic Resources Analysis Support System (GRASS) Software, Version 8.0. Open Source Geospatial Foundation. https://grass.osgeo.org",
    "title": "nc_spm_08",
    "extent": {
      "bbox": [
        [
          -84.33,
          33.83,
          -75.38,
          36.59
        ]
      ]
    },
    "license": "GNU General Public License (GPL)",
    "keywords": [
      "GRASS GIS",
      "Location"
    ]
  }

PERMANENT mapset: STAC Collection

{
    "type": "Collection",
    "id": "PERMANENT",
    "stac_version": "1.0.0",
    "description": "default mapset",
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "item",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/elevation/elevation.json",
        "type": "application/json"
      },
      {
        "rel": "parent",
        "href": "https://example.com/grass_catalog/nc_spm_08/collection.json",
        "type": "application/json",
        "title": "nc_spm_08"
      }
    ],
    "stac_extensions": [
      "https://stac-extensions.github.io/projection/v1.0.0/schema.json",
      "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"
    ],
    "grass:type": "mapset",
    "proj:epsg": 3358,
    "sci:citation": "GRASS Development Team, 2022. Geographic Resources Analysis Support System (GRASS) Software, Version 8.0. Open Source Geospatial Foundation. https://grass.osgeo.org",
    "title": "PERMANENT",
    "extent": {
      "bbox": [
        [
          -84.33,
          33.83,
          -75.38,
          36.59
        ]
      ]
    },
    "license": "GNU General Public License (GPL)",
    "keywords": [
      "GRASS GIS",
      "mapset",
      "PERMANENT"
    ]
 }

Raster data: STAC Item

{
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "elevation",
    "properties": {
      "title": "\"South-West Wake county: Elevation NED 10m\"",
      "description": "\"generated by r.proj\"",
      "proj:epsg": 3358,
      "grass:datatype": "FCELL",
      "grass:comments": "\"r.proj input=\"ned03arcsec\" location=\"northcarolina_latlong\" mapset=\"\\helena\" output=\"elev_ned10m\" method=\"cubic\" resolution=10\"",
      "grass:creator": "\"helena\"",
      "grass:ewres": "10",
      "grass:nsres": "10",
      "grass:cols": "1500",
      "grass:location": "nc_spm_08",
      "grass:mapset": "PERMANENT",
      "grass:map": "elevation",
      "grass:maptype": "raster",
      "grass:min": "55.57879",
      "grass:max": "156.3299",
      "grass:ncats": "255",
      "grass:semantic_label": "\"none\"",
      "grass:source1": "\"\"",
      "grass:source2": "\"\"",
      "datetime": "2006-07-11T01:09:51Z"
    },
    "geometry": {
      "type": "Polygon",
      "coordinates": [
        [
          [
            645000.0,
            215000.0
          ],
          [
            645000.0,
            228500.0
          ],
          [
            630000.0,
            228500.0
          ],
          [
            630000.0,
            215000.0
          ],
          [
            645000.0,
            215000.0
          ]
        ]
      ]
    },
    "links": [
      {
        "rel": "root",
        "href": "https://example.com/grass_catalog/catalog.json",
        "type": "application/json"
      },
      {
        "rel": "collection",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/collection.json",
        "type": "application/json",
        "title": "PERMANENT"
      },
      {
        "rel": "parent",
        "href": "https://example.com/grass_catalog/nc_spm_08/PERMANENT/collection.json",
        "type": "application/json",
        "title": "PERMANENT"
      }
    ],
    "assets": {
      "raster": {
        "href": "/api/v3/locations/nc_spm_08/mapsets/PERMANENT/raster_layers/elevation",
        "type": "image/tiff; application=geotiff; profile=cloud-optimized",
        "title": "\"South-West Wake county: Elevation NED 10m\"",
        "roles": [
          "data"
        ]
      },
      "thumbnail": {
        "href": "/api/v3/locations/nc_spm_08/mapsets/PERMANENT/raster_layers/elevationrender",
        "type": "image/png",
        "title": "\"South-West Wake county: Elevation NED 10m\" Thumbnail",
        "roles": [
          "thumbnail"
        ]
      }
    },
    "bbox": [
        630000,
        215000,
        645000,
        228500
    ],
    "stac_extensions": [],
    "collection": "PERMANENT"
}
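The geometry and bbox of an item like this can both be derived from the raster extent. The sketch below uses hypothetical helper names and reproduces the ring order of the example above. Note that the example keeps projected coordinates (EPSG:3358), while GeoJSON and STAC expect WGS 84 longitude/latitude, so a real implementation would presumably reproject the extent first.

```python
def extent_to_geometry(west: float, south: float, east: float, north: float) -> dict:
    """Return a closed GeoJSON Polygon ring for a rectangular extent."""
    ring = [
        [east, south],
        [east, north],
        [west, north],
        [west, south],
        [east, south],  # repeat the first vertex to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def extent_to_bbox(west: float, south: float, east: float, north: float) -> list:
    """Return the extent in STAC bbox order: [west, south, east, north]."""
    return [west, south, east, north]


# Extent of the elevation raster from the example item.
geometry = extent_to_geometry(630000.0, 215000.0, 645000.0, 228500.0)
bbox = extent_to_bbox(630000.0, 215000.0, 645000.0, 228500.0)
print(geometry)
print(bbox)
```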

@lucadelu
Contributor

@cwhite911 do you have any updates on this?

This module could be really useful

@veroandreo
Contributor

@cwhite911 any plans to continue with this soon(ish)? Now that ESA will change data delivery, having access via STAC would be really relevant for many

@wenzeslaus
Member

Please preserve your shell and Git history. I want to see what happened to change the instructions accordingly.

@cwhite911 cwhite911 changed the title [WIP] r.in.stac: Add STAC API import functionality [WIP] t.stac.import: Add STAC API import functionality Sep 19, 2023
Comment on lines 255 to 259
avaliable_collections_ids = [c.id for c in list(avaliable_collections)]

gs.warning(_(f"Avaliable Collections: {avaliable_collections_ids}"))

if collections in avaliable_collections_ids:
Member

Typo in various lines:

  • avaliable -> available
  • Avaliable -> Available


**t.stac.import** utilizes the
[pystac-client (v0.5.1)](https://github.com/stac-utils/pystac-client) to search
STAC APIs and import and import items into GRASS GIS.
Contributor

remove one of the two "and import"

if collections in avaliable_collections:
avaliable_collections_ids = [c.id for c in list(avaliable_collections)]

gs.warning(_(f"Avaliable Collections: {avaliable_collections_ids}"))
Member

I think translatable messages should not be f-strings, but rather something like this:

Suggested change
gs.warning(_(f"Avaliable Collections: {avaliable_collections_ids}"))
gs.warning(_("Avaliable Collections: {}").format(avaliable_collections_ids))
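
To illustrate why the suggestion matters, here is a small sketch with a stand-in for the translation function. The dictionary `CATALOG` and its Spanish string are made up for the example. An f-string is interpolated before `_()` sees it, so the lookup key no longer matches the message in the catalog, whereas `.format()` is applied after the lookup.

```python
# Hypothetical one-entry translation catalog keyed by the untranslated message.
CATALOG = {"Available collections: {}": "Colecciones disponibles: {}"}


def _(msgid: str) -> str:
    """Stand-in for gettext: return the translation, or the msgid if none exists."""
    return CATALOG.get(msgid, msgid)


ids = ["a", "b"]

# The f-string is evaluated first, so the key includes the data and the lookup misses.
with_fstring = _(f"Available collections: {ids}")

# The constant template is looked up first, then filled in.
with_format = _("Available collections: {}").format(ids)

print(with_fstring)
print(with_format)
```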

@ninsbl
Member

ninsbl commented Sep 19, 2023

Really good to see this moving forward, especially with https://dataspace.copernicus.eu/ providing Sentinel data in STAC format...

@neteler
Member

neteler commented Sep 19, 2023

This PR needs a big rebase...

@cwhite911
Contributor Author

> This PR needs a big rebase...

Fixed the issue

@wenzeslaus
Member

wenzeslaus commented Sep 19, 2023

Now I finally understand (thanks @cwhite911!) how this thing with many commits happens. You have an outdated branch. You decide to update it to the base branch (here grass8) by rebase. You do that. Then you push. You get a message that the remote branch and local branch have diverged and that the push is not possible. The suggestion is to update your local branch. You decide to update the local branch from the remote one by rebase. And that's where the mess happens. All the perfectly fine commits from the grass8 branch which are now on your local branch get removed and then re-applied on top of the latest commit on the remote branch. Then you happily push and see all these extra commits duplicating changes already on the base branch.

The right operation after rebasing the local branch onto the base branch is to force push. You changed all the commits on the local branch and that's what you want on the remote one too. Force push is the right operation here because you want to replace the remote branch with what you have locally.

Git does not know that's what you are doing. It sees different commit hashes and it gives you advice which would preserve all these commits.

It is worth noting that, unlike merge, rebase changes the commit hashes. So even the same change has a different hash after a rebase, and it looks like a different commit to Git.
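
The effect on hashes can be shown with a simplified model of a commit object. This is a sketch, not Git's full format: real commit objects also carry author and committer lines, and the 40-character IDs below are placeholders. The point is that the parent ID is part of the hashed content, so moving a commit onto a new parent yields a new hash even when the change is the same.

```python
import hashlib


def commit_hash(parent: str, tree: str, message: str) -> str:
    """Hash a simplified commit object that records its tree and parent."""
    body = f"tree {tree}\nparent {parent}\n\n{message}\n".encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


old_base = "a" * 40  # placeholder ID of the outdated base commit
new_base = "b" * 40  # placeholder ID of the updated base commit
tree = "c" * 40      # placeholder ID of the file tree

before = commit_hash(old_base, tree, "Add t.stac.import")
after = commit_hash(new_base, tree, "Add t.stac.import")

print(before)
print(after)
print(before != after)
```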

In light of Git giving the "wrong advice" in the rebase workflow, it might be a good idea to use merge in all contributor workflows. I mean to have it in the instructions. There is no reason not to use rebase in general. One issue with this is terminology: merge can be misleading as it is also used in "merge a PR".

Sorry for the off-topic post here, I'll follow up on this in a new issue or PR.

cwhite911 and others added 10 commits March 28, 2024 22:57
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

7 participants