Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating composites using static background require double resampling when generate=False is used #2733

Open
pnuu opened this issue Jan 30, 2024 · 14 comments

Comments

@pnuu
Copy link
Member

pnuu commented Jan 30, 2024

Describe the bug
I'm trying to create the geo_color composite using FCI data and the built-in recipe. For some reason the Scene needs to be resampled twice for it to be saveable.

EDIT:
The key to the problem is using generate=False when composite(s) using data with different projections are loaded. Such as static background image with any satellite data.

To Reproduce

import glob

import hdf5plugin
from satpy import Scene

FNAMES = glob.glob("/home/lahtinep/data/satellite/geo/fci/*C_0088*nc")


def main():
    """Run."""
    comps = ["geo_color"]
    glbl = Scene(reader='fci_l1c_nc', filenames=FNAMES)
    glbl.load(comps, generate=False)
    lcl = glbl.resample("euro4", resampler="gradient_search", reduce_data=False)
    # lcl2 = lcl.resample("euro4", resampler="gradient_search", reduce_data=False)
    # lcl2.save_datasets(
    lcl.save_datasets(
        filename='/tmp/{start_time:%Y%m%d_%H%M}_MTG-I1_euro4_{name}.tif',
        tiled=True, blockxsize=512, blockysize=512,
        driver='COG', overviews=[])


if __name__ == "__main__":
    main()

Expected behavior
The image should be saved after a single resampling.

Actual results

RuntimeError: None of the requested datasets have been generated or could not be loaded. Requested composite inputs may need to have matching dimensions (eg. through resampling).

Environment Info:

  • OS: Linux
  • Satpy Version: current main branch
  • PyResample Version: current main branch

Additional context
If the lcl2 lines are taken to use, and thus the scene is resampled twice, the product is generated correctly.

Same problem is seen if I add night_ir_with_background composite (with sub-composites) to fci.yaml.

@pnuu
Copy link
Member Author

pnuu commented Jan 30, 2024

It seems that setting generate=True makes the composite to work. This also means a significant performance hit when producing multiple composites sharing channels (instead of single resampling per channel, all RGBs are resampled separately).

@pnuu
Copy link
Member Author

pnuu commented Jan 30, 2024

This doesn't seem to help: lcl = glbl.resample("euro4", resampler="gradient_search", reduce_data=False, generate=True)

@pnuu
Copy link
Member Author

pnuu commented Jan 30, 2024

Neither does lcl.generate_possible_composites(True) before saving, which outputs The following datasets were not created and may require resampling to be generated: DataID(name='geo_color').

@djhoese
Copy link
Member

djhoese commented Jan 30, 2024

Just for record keeping, can you print out glbl.keys() when generate=False? Then can you print out .attrs["area"] for everything in lcl.values() and compare? Or maybe do set([x.attrs["area"] for x in lcl.values()]) and print that out.

@pnuu
Copy link
Member Author

pnuu commented Jan 31, 2024

For the glbl Scene:

ipdb> for k in glbl.keys(): print(k)
DataID(name='ir_105', wavelength=WavelengthRange(min=9.8, central=10.5, max=11.2, unit='µm'), resolution=1000, calibration=<2>, modifiers=())
DataID(name='ir_38', wavelength=WavelengthRange(min=3.4, central=3.8, max=4.2, unit='µm'), resolution=1000, calibration=<2>, modifiers=())
DataID(name='vis_04', wavelength=WavelengthRange(min=0.384, central=0.444, max=0.504, unit='µm'), resolution=1000, calibration=<1>, modifiers=())
DataID(name='vis_05', wavelength=WavelengthRange(min=0.47, central=0.51, max=0.55, unit='µm'), resolution=1000, calibration=<1>, modifiers=())
DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<1>, modifiers=())
DataID(name='vis_08', wavelength=WavelengthRange(min=0.815, central=0.865, max=0.915, unit='µm'), resolution=1000, calibration=<1>, modifiers=())

ipdb> for v in set([x.attrs["area"] for x in glbl.values()]): print(v); print("")
Area ID: mtg_fci_fdss_1km
Description: MTG FCI Full Disk Scanning Service area definition with 1 km resolution
Projection: {'ellps': 'WGS84', 'h': '35786400', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 11136
Number of rows: 11136
Area extent: (-5567999.9986, 5567999.9986, 5567999.9986, -5567999.9986)

Area ID: mtg_fci_fdss_500m
Description: MTG FCI Full Disk Scanning Service area definition with 500 m resolution
Projection: {'ellps': 'WGS84', 'h': '35786400', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 22272
Number of rows: 22272
Area extent: (-5567999.9996, 5567999.9996, 5567999.9996, -5567999.9996)

and for lcl Scene:

ipdb> for k in lcl.keys(): print(k)
DataID(name='_geo_color_low_clouds_dep_0', resolution=1000)
DataID(name='_geo_color_low_clouds_dep_2')
DataID(name='_night_background_hires')
DataID(name='geo_color_high_clouds', resolution=1000)
DataID(name='ir_105', wavelength=WavelengthRange(min=9.8, central=10.5, max=11.2, unit='µm'), resolution=1000, calibration=<2>, modifiers=())
DataID(name='true_color', resolution=500)
ipdb> for v in set([x.attrs["area"] for x in lcl.values()]): print(v); print("")
 
Area ID: None
Description: None
Projection ID: WGS 84
Projection: {'datum': 'WGS84', 'no_defs': 'None', 'proj': 'longlat', 'type': 'crs'}
Number of columns: 13500
Number of rows: 6750
Area extent: (-180.0, -90.0, 180.0, 90.0)

Area ID: euro4
Description: Euro 4km area - Europe
Projection: {'ellps': 'bessel', 'lat_0': '90', 'lat_ts': '60', 'lon_0': '14', 'no_defs': 'None', 'proj': 'stere', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 1024
Number of rows: 1024
Area extent: (-2717181.7305, -5571048.1403, 1378818.2695, -1475048.1403)

So the static background isn't loaded with generate=False.

For generate=True it is (via _night_background_hires sub-composite):

ipdb> for k in glbl.keys(): print(k)
DataID(name='_geo_color_low_clouds_dep_0', resolution=1000)
DataID(name='_geo_color_low_clouds_dep_2')
DataID(name='_night_background_hires')
DataID(name='geo_color_high_clouds', resolution=1000)
DataID(name='ir_105', wavelength=WavelengthRange(min=9.8, central=10.5, max=11.2, unit='µm'), resolution=1000, calibration=<2>, modifiers=())
DataID(name='vis_04', wavelength=WavelengthRange(min=0.384, central=0.444, max=0.504, unit='µm'), resolution=1000, calibration=<1>, modifiers=('sunz_corrected',))
DataID(name='vis_05', wavelength=WavelengthRange(min=0.47, central=0.51, max=0.55, unit='µm'), resolution=1000, calibration=<1>, modifiers=('sunz_corrected',))
DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<1>, modifiers=('sunz_corrected',))
DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<1>, modifiers=('sunz_corrected', 'rayleigh_corrected'))
DataID(name='vis_08', wavelength=WavelengthRange(min=0.815, central=0.865, max=0.915, unit='µm'), resolution=1000, calibration=<1>, modifiers=('sunz_corrected',))

ipdb> for v in set([x.attrs["area"] for x in glbl.values()]): print(v); print("")
Area ID: mtg_fci_fdss_1km
Description: MTG FCI Full Disk Scanning Service area definition with 1 km resolution
Projection: {'ellps': 'WGS84', 'h': '35786400', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 11136
Number of rows: 11136
Area extent: (-5567999.9986, 5567999.9986, 5567999.9986, -5567999.9986)

Area ID: None
Description: None
Projection ID: WGS 84
Projection: {'datum': 'WGS84', 'no_defs': 'None', 'proj': 'longlat', 'type': 'crs'}
Number of columns: 13500
Number of rows: 6750
Area extent: (-180.0, -90.0, 180.0, 90.0)

Area ID: mtg_fci_fdss_500m
Description: MTG FCI Full Disk Scanning Service area definition with 500 m resolution
Projection: {'ellps': 'WGS84', 'h': '35786400', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 22272
Number of rows: 22272
Area extent: (-5567999.9996, 5567999.9996, 5567999.9996, -5567999.9996)

@pnuu pnuu changed the title Creating geo_color composite for FCI requires double resampling Creating composites using static background require double resampling when generate=False is used Jan 31, 2024
@pnuu
Copy link
Member Author

pnuu commented Jan 31, 2024

I adjusted the title and added the short summary what triggers the behaviour to the description.

@pnuu
Copy link
Member Author

pnuu commented Jan 31, 2024

I tested by pre-resampling the background image to FCI projection and euro4, neither of which helped and a second resampling step is still required when generate=False is used.

@pnuu
Copy link
Member Author

pnuu commented Jan 31, 2024

Above I forgot to also resample the land-sea mask. If both background image and LSM are resampled to the final image projection, the composite generation works with a single resample() call even with scn.load(..., generate=False). I can use this as a background, but it'd be great to find out what causes this behaviour.

@yukaribbba
Copy link
Contributor

Do you still have this problem with my #2696 ? I don't know if this is a compositor issue or something else...

@pnuu
Copy link
Member Author

pnuu commented Jan 31, 2024

@yukaribbba yes, with #2696 I still need to do douple resampling when using generate=False.

I think the problem is somewhere in the decision tree in a case where the IncompatibleAreas is raised by a sub-composite.

@djhoese
Copy link
Member

djhoese commented Jan 31, 2024

Good discovery @pnuu!

Ok, this is me debugging and brainstorming:

  1. StaticCompositor nodes shouldn't have any dependencies so they are treated like this:

    if not prereqs:
    # this composite has no required prerequisites
    prereq_nodes.append(self.empty_node)
    self.add_child(parent, self.empty_node)
    return prereq_nodes, unknown_datasets

    The idea is that even though the static compositor doesn't have any dependencies we need to add a fake empty node as a dependency. Otherwise this compositor will get treated like a reader-loaded dataset when we start loading things by doing dependency_tree.leaves() and dependency_tree.trunk(). That is, composites and modified datasets are trunk nodes, reader datasets are leaves.

  2. Trunk nodes aren't accessed anywhere in the Scene for loading except for here (ignore "available_composite_ids", etc):

    satpy/satpy/scene.py

    Lines 1532 to 1537 in f11db54

    def _generate_composites_from_loaded_datasets(self):
    """Compute all the composites contained in `requirements`."""
    trunk_nodes = self._dependency_tree.trunk(limit_nodes_to=self.missing_datasets,
    limit_children_to=self._datasets.keys())
    needed_comp_nodes = set(self._filter_loaded_datasets_from_trunk_nodes(trunk_nodes))
    return self._generate_composites_nodes_from_loaded_datasets(needed_comp_nodes)

    But this isn't called when generate=False.

  3. By saying generate=False, you're basically saying "don't look at composites at all" as an optimization essentially.

So the questions/choices are:

  • Can we fix this by making static compositor nodes (or composites with no dependencies) leaf nodes in the dependency tree and load them like reader datasets?
  • Does doing the above remove one of the expectations and intentions of users by using the generate=False keyword argument?
  • The above solution would also require changes to available_X methods as they assume trunk nodes are composites and leaf nodes are from the reader. Gets a little confusing to explain to users what we mean if we start mixing these up.
  • Alternatively, should we (the Scene) look for trunk nodes with no real dependencies even when generate=False?
  • Other solutions?

@pnuu
Copy link
Member Author

pnuu commented Feb 15, 2024

* Other solutions?

I think the easiest route, at least for my own worklfow, is to pre-resample the background images (black marble and land-sea mask) to my target projection and adjust the composite configs in $SATPY_CONFIG_PATH/composites/<instrument>.yaml to point url to the resampled versions. This means I'll need a separate variant for each target area, but for me that's ok since outside of testing I pretty much only ever use one or two areas.

@simonrp84
Copy link
Member

FWIW, I have encountered this problem just now with one of my own composites...BUT, the static image background is on the same projection / area as the rest of the data.

@pnuu
Copy link
Member Author

pnuu commented Feb 24, 2024

@simonrp84 so scn[chan].attrs['area'] == scn2['image'].attrs['area'] is True?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants