Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unit tests for data getters #89

Open
kallewesterling opened this issue Nov 3, 2023 · 2 comments
Open

Add unit tests for data getters #89

kallewesterling opened this issue Nov 3, 2023 · 2 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@kallewesterling
Copy link
Collaborator

Currently, there are no tests for the data getters (in deepsensor.data.sources).

Suggestion: Add some tests with a very small download for each one source.

This is up for debate: The auxiliary files download a few gigabytes of data so a full unit test for each download is likely not doable. Should we just unit test the era5 and station ones? Or do some other kind of slicing of the downloads?

@kallewesterling kallewesterling added enhancement New feature or request good first issue Good for newcomers labels Nov 3, 2023
@tom-andersson
Copy link
Collaborator

This sounds reasonable, though I’m not 100% convinced about testing the download itself. Assuming we can write fast unit tests for the ERA5 and station getters, these tests would add a dependency on internet connection for the CI and would be flakey if those sources go down.

If we just want to test the interface of the getters and non-download logic then I’d prefer if we added a dry_run option to the getters that rather than running the download, instead generates empty data that looks the same.

@davidwilby any thoughts?

@davidwilby
Copy link
Collaborator

I think that in a test suite it's inadvisable to actually download anything in the routine running of tests (e.g. in CI, or in routine development) for the reasons you mention @tom-andersson but that it's definitely good to test both the downloader and also the source from which downloads are made, in some automated way.

The approach I've taken previously in similar situations is to:

  • include automated tests which use mocking to test the source code itself without actually downloading from the external data source. These tests can be included in the normal test suite.
  • also have a separate set of tests that check that the external data sources are functioning as expected. These tests should be excluded from the normal suite of tests, but can be run explicitly in CI on a schedule, to give us early warning about changes made to external data sources which can break deepsensor functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants