The STAC transform should allow items with no Proj information #297

alexgleith · 2021-08-08T11:27:36Z

In ODC, the shape and transform (grid) information is optional. We should be able to handle assets without shape and transform, but currently we can't.

Errors look like this:

Traceback (most recent call last):
  File "C:\Users\cesar\anaconda3\envs\cubeenv\lib\site-packages\odc\apps\dc_tools\fs_to_dc.py", line 58, in cli
    metadata = stac_transform(metadata)
  File "C:\Users\cesar\anaconda3\envs\cubeenv\lib\site-packages\odc\index\stac.py", line 272, in stac_transform
    proj_transform=proj_transform,
  File "C:\Users\cesar\anaconda3\envs\cubeenv\lib\site-packages\odc\index\stac.py", line 130, in _get_stac_bands
    grid = f"g{transform[0]:g}m"
TypeError: 'NoneType' object is not subscriptable

Kirill888 · 2021-09-03T01:25:21Z

Problem Description

The EO3 metadata format expects native projection information to be defined for each band. Unlike STAC, EO3 adds an extra layer of indirection: each band is assigned to a named grid, which can be shared across several bands, so if 10 bands share a common grid (footprint), the grid information is recorded only once. One of the grids must be called "default". Bands that belong to the "default" grid can omit the grid specification, since "default" is implied. This also means that the common case of "all bands in the dataset share the same footprint and resolution" has a minimal textual representation in EO3.
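To illustrate the indirection, here is a minimal sketch of an EO3-style fragment written as a Python dict. The band names, paths, CRS, shapes and transforms are made up, and `grid_for_band` is a hypothetical helper, not part of datacube; it only shows how a band with no `grid` key falls back to "default".

```python
# Hypothetical EO3-style fragment; all values are illustrative only.
# Assumes shape is [height, width] and transform is a row-major affine matrix.
eo3_doc = {
    "crs": "EPSG:32633",
    "grids": {
        "default": {
            "shape": [100, 100],
            "transform": [10, 0, 500000, 0, -10, 6000000, 0, 0, 1],
        },
        "g20m": {
            "shape": [50, 50],
            "transform": [20, 0, 500000, 0, -20, 6000000, 0, 0, 1],
        },
    },
    "measurements": {
        "red": {"path": "red.tif"},      # no "grid" key: "default" is implied
        "green": {"path": "green.tif"},  # shares the same grid record as "red"
        "swir": {"path": "swir.tif", "grid": "g20m"},
    },
}


def grid_for_band(doc, band):
    """Return the grid a band belongs to, falling back to "default"."""
    grid_name = doc["measurements"][band].get("grid", "default")
    return doc["grids"][grid_name]
```

Here `red` and `green` resolve to the single "default" grid record, while `swir` points at its own 20 m grid.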

Each "grid" is basically a (shape, transform) tuple that, in combination with the shared CRS, fully defines the common footprint of the bands belonging to that grid. The four image corners (0,0), (W,0), (W,H) and (0,H) are mapped through the linear transform to give the footprint. The transform also encodes the native resolution of the image.
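The corner mapping can be sketched in plain Python. This assumes shape is (height, width) and that the transform holds row-major affine coefficients, so that x = a*col + b*row + c and y = d*col + e*row + f; `footprint_corners` is a hypothetical name used only for this example.

```python
def footprint_corners(shape, transform):
    """Map the four image corners through the affine transform.

    Assumes shape is (height, width) and transform starts with the six
    row-major affine coefficients a, b, c, d, e, f.
    """
    height, width = shape
    a, b, c, d, e, f = transform[:6]
    # Corners in pixel space, in the order 0,0; W,0; W,H; 0,H
    pixel_corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [(a * col + b * row + c, d * col + e * row + f)
            for col, row in pixel_corners]


# A 200x100 pixel image (W=200, H=100) with 10 m pixels, north-up
corners = footprint_corners((100, 200), [10, 0, 500000, 0, -10, 6000000])
```

For a north-up image like this one, the native resolution is the absolute value of the `a` and `e` coefficients (10 in both directions here).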

The "essential" information needed by dc.load is the "footprint", while "native resolution" and "native projection" are "nice to haves". One can still search for and load data without knowing up front in what projection the pixels are stored, so long as the bounding box is accurate enough (it fully encloses the valid data of the Dataset while being tight).

Proposed Solution

When native projection and resolution data are not available but a bounding box or a geometry is defined, we can produce a "fake" default grid. Such a grid would have CRS: "EPSG:4326", shape: [1,1] and a transform computed so that the unit square from 0,0 to 1,1 maps to the bounding box of the dataset in lon,lat.
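A minimal sketch of how such a transform could be computed, assuming a 2D bounding box ordered (west, south, east, north) as in a STAC item's `bbox`, and a row-major affine transform. `fake_default_grid` and the shape of its return value are hypothetical, chosen only for illustration.

```python
def fake_default_grid(bbox):
    """Build a 1x1 placeholder grid whose single pixel covers bbox.

    bbox is assumed to be (west, south, east, north) in lon/lat.
    Pixel corner (0, 0) maps to (west, north) and (1, 1) to (east, south).
    """
    west, south, east, north = bbox
    transform = [
        east - west, 0.0, west,     # x = (east - west) * col + west
        0.0, south - north, north,  # y = (south - north) * row + north
        0.0, 0.0, 1.0,
    ]
    return {
        "crs": "EPSG:4326",
        "grids": {"default": {"shape": [1, 1], "transform": transform}},
    }


placeholder = fake_default_grid((10.0, -5.0, 12.0, -3.0))
```

The y coefficient is negative so that row 0 sits at the northern edge, matching the usual north-up image convention.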

The only concern is that native_geobox called on such a Dataset will report a valid geobox even though it shouldn't. But that could be a simple fix: detect shape==(1,1) and report missing data instead. I do not expect a valid case for 1x1 pixel data, so using that as a marker for "information not available" is acceptable in my view.
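The marker check itself could be as small as the sketch below. `has_native_grid` is a hypothetical helper operating on a plain dict of grids; it is not the actual native_geobox code in datacube.

```python
def has_native_grid(grids):
    """Return False when the "default" grid is the 1x1 placeholder.

    Assumes grids is a dict like {"default": {"shape": [h, w], ...}}.
    """
    shape = tuple(grids["default"]["shape"])
    return shape != (1, 1)


real = {"default": {"shape": [100, 100]}}
placeholder_grids = {"default": {"shape": [1, 1]}}
```

A caller would treat `has_native_grid(...) == False` as "native projection information not available" rather than building a geobox from the placeholder.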

cc: @gadomski

Kirill888 · 2021-09-03T01:29:50Z

EPSG:4326 is somewhat annoying to deal with though, and is one of the least tested configurations in datacube, so maybe it's worth supporting customization of the "default" CRS by the user.

Kirill888 added this to Planned in STAC.load Aug 15, 2021

This was referenced Sep 3, 2021

Make STAC -> ODC conversion more robust to missing extension data #351

Merged

Task: unify odc.stac.transform with odc.stac._eo3 #355

Closed
