ODC feature request: env var for `skip_broken_datasets`? #1518

robbibt · 2023-12-05T05:05:56Z

Over the past few months we've been encountering intermittent GDAL access issues semi-regularly, e.g.:

CPLE_OpenFailedError: '/vsis3/dea-public-data/baseline/ga_s2bm_ard_3/52/LBK/2020/08/15/20200815T032931/ga_s2bm_nbart_3-2-1_52LBK_2020-08-15_final_band03.tif' not recognized as a supported file format.

This is a real pain, particularly in automated testing, where a single random failure can force us to re-run our entire slow test suite.

`datacube.load` has a handy `skip_broken_datasets` param that can be used to work around this issue. However, we don't really want to set this in every notebook/script, as it adds complexity and potentially makes workflows non-reproducible.

Thoughts on adding support for a global environment variable (e.g. `ODC_SKIP_BROKEN_DATASETS`) that could be set to force datacube to skip broken datasets, even if this isn't set in the Python code itself? This would let us set it in our tests, making them more robust to these issues without impacting user code.
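For illustration, here is a rough sketch of how this can be approximated on the caller side today. The helper name `skip_broken_from_env` and the set of accepted truthy strings are assumptions made up for this example; datacube itself does not read `ODC_SKIP_BROKEN_DATASETS` at this point, which is the whole reason for the request.

```python
import os

# Strings treated as "true"; this list is an assumption, not a datacube convention.
_TRUTHY = {"1", "true", "yes", "on"}


def skip_broken_from_env(var="ODC_SKIP_BROKEN_DATASETS", default=False):
    """Return True if the (hypothetical) env var is set to a truthy value."""
    raw = os.environ.get(var)
    if raw is None:
        # Variable not set: fall back to the default behaviour.
        return default
    return raw.strip().lower() in _TRUTHY
```

A test suite could then call `dc.load(..., skip_broken_datasets=skip_broken_from_env())`, but that still means touching every call site, which is exactly what a built-in env var would avoid.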


robbibt · 2023-12-05T05:06:38Z

@SpacemanPaul

SpacemanPaul · 2023-12-05T05:11:06Z

Probably best handled in the 1.9 branch after #1505 is merged.

Add `skip_broken_datasets` as a config option, defaulting to `False`. It will automatically be overridable per environment with e.g. `ODC_PROD_SKIP_BROKEN_DATASETS` and/or `ODC_DEV_SKIP_BROKEN_DATASETS`.
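A minimal sketch of the precedence described above, assuming the per-environment variable wins over the config file and the fallback is `False`. The function `resolve_skip_broken`, the dict-shaped `config`, and the accepted truthy strings are all hypothetical; this is not the 1.9 config code.

```python
import os


def resolve_skip_broken(env_name, config=None, environ=None):
    """Resolve skip_broken_datasets for one named environment (e.g. "prod")."""
    environ = os.environ if environ is None else environ
    config = config or {}

    # 1. A per-environment variable such as ODC_PROD_SKIP_BROKEN_DATASETS wins.
    var = f"ODC_{env_name.upper()}_SKIP_BROKEN_DATASETS"
    raw = environ.get(var)
    if raw is not None:
        return raw.strip().lower() in {"1", "true", "yes", "on"}

    # 2. Otherwise use the config entry for that environment, defaulting to False.
    return bool(config.get(env_name, {}).get("skip_broken_datasets", False))
```

Under these assumptions, a CI job could export `ODC_DEV_SKIP_BROKEN_DATASETS=true` for its test environment while user notebooks keep the default of `False`.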

robbibt added the enhancement and documentation labels Dec 5, 2023

SpacemanPaul self-assigned this Dec 5, 2023
