ODC feature request: env var for skip_broken_datasets? #1518

Open
robbibt opened this issue Dec 5, 2023 · 2 comments

Comments

@robbibt
Contributor

robbibt commented Dec 5, 2023

Over the past few months we've been encountering intermittent GDAL access issues semi-regularly, e.g.:

CPLE_OpenFailedError: '/vsis3/dea-public-data/baseline/ga_s2bm_ard_3/52/LBK/2020/08/15/20200815T032931/ga_s2bm_nbart_3-2-1_52LBK_2020-08-15_final_band03.tif' not recognized as a supported file format.

This is a real pain, particularly in automated testing, where a random failure can force us to re-run our entire slow test suite.

datacube.load has a handy skip_broken_datasets param that can be used to work around this issue. However, we don't really want to set this in every notebook/script, as it adds complexity and potentially makes workflows non-reproducible.

Thoughts on adding support for a global environment variable (e.g. ODC_SKIP_BROKEN_DATASETS) that could be set to force datacube to skip broken datasets, even if this wasn't set in the Python code itself? This would allow us to set this in our tests, making them more robust to these issues without impacting user code.
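
A minimal sketch of the requested behaviour, written as a user-side wrapper rather than a change to datacube itself. The variable name ODC_SKIP_BROKEN_DATASETS comes from the proposal above and is not something datacube reads today; the helper names `skip_broken_from_env` and `load_with_env_default` and the set of accepted truthy strings are assumptions made for illustration. The only real API relied on is the `skip_broken_datasets` keyword of `dc.load`.

```python
import os

# Hypothetical variable name from the proposal; datacube does not read it today.
ENV_VAR = "ODC_SKIP_BROKEN_DATASETS"
# Assumed set of strings treated as "enabled".
_TRUTHY = {"1", "true", "yes", "on"}


def skip_broken_from_env(environ=None):
    """Return True if the (hypothetical) env var holds a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get(ENV_VAR, "").strip().lower() in _TRUTHY


def load_with_env_default(dc, environ=None, **load_kwargs):
    """Call dc.load, taking skip_broken_datasets from the environment
    only when the caller has not passed it explicitly."""
    load_kwargs.setdefault("skip_broken_datasets", skip_broken_from_env(environ))
    return dc.load(**load_kwargs)
```

With this shape, a test runner could export the variable once (for example in CI configuration), while notebooks and scripts that never set it keep the existing default of not skipping. An explicit `skip_broken_datasets=...` in user code still takes precedence.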

@robbibt
Contributor Author

robbibt commented Dec 5, 2023

@SpacemanPaul

SpacemanPaul self-assigned this Dec 5, 2023
@SpacemanPaul
Contributor

Probably best handled in the 1.9 branch after #1505 is merged.

Add skip_broken_datasets as a config option, defaulting to False. It will then automatically be overridable per environment, e.g. with ODC_PROD_SKIP_BROKEN_DATASETS and/or ODC_DEV_SKIP_BROKEN_DATASETS.
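
To illustrate the lookup order this implies, here is a rough sketch and not the actual 1.9 config code. It assumes the per-environment variable is named ODC_<ENV>_SKIP_BROKEN_DATASETS, as in the examples above, and that config-file values are available as a plain nested dict. The function name `resolve_skip_broken` and the accepted boolean strings are hypothetical.

```python
import os

# Assumed spellings for boolean values in environment variables.
_TRUTHY = {"1", "true", "yes", "on"}
_FALSY = {"0", "false", "no", "off"}


def resolve_skip_broken(env_name, config=None, environ=None):
    """Resolve skip_broken_datasets for one ODC environment.

    Order: environment variable, then config-file value, then False.
    `config` is a hypothetical {env_name: {option: value}} mapping.
    """
    environ = os.environ if environ is None else environ
    config = config or {}
    var = f"ODC_{env_name.upper()}_SKIP_BROKEN_DATASETS"
    raw = environ.get(var)
    if raw is not None:
        value = raw.strip().lower()
        if value in _TRUTHY:
            return True
        if value in _FALSY:
            return False
        raise ValueError(f"{var}={raw!r} is not a recognised boolean")
    # No override set: fall back to the config value, defaulting to False.
    return bool(config.get(env_name, {}).get("skip_broken_datasets", False))
```

For example, with `{"prod": {"skip_broken_datasets": False}}` as the config and ODC_PROD_SKIP_BROKEN_DATASETS=true exported in the test environment, the sketch resolves to True for `prod` and stays False for `dev`.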
