Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: test optional datasets on every commit #2045

Closed
wants to merge 19 commits into from

Conversation

adamjstewart
Copy link
Collaborator

@adamjstewart adamjstewart commented May 4, 2024

TorchGeo has a number of optional dependencies only needed by certain datasets. Until now, we've only ensured that all tests pass without these dependencies installed on release branches. However, this makes it difficult to get full coverage without relying on ugly import monkeypatching. It's simpler and easier to test both on every commit to get full coverage. We already do this for the minimum version of our dependencies.

Prerequisite for #2043

@adamjstewart adamjstewart added this to the 0.6.0 milestone May 4, 2024
@github-actions github-actions bot added the testing Continuous integration testing label May 4, 2024
@github-actions github-actions bot added the datasets Geospatial or benchmark datasets label May 4, 2024
@@ -14,7 +14,6 @@
class TestUSAVarsDataModule:
@pytest.fixture
def datamodule(self, request: SubRequest) -> USAVarsDataModule:
pytest.importorskip('pandas', minversion='1.1.3')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR, but this dep is now required

@@ -89,7 +89,7 @@ def test_trainer(
self, monkeypatch: MonkeyPatch, name: str, fast_dev_run: bool
) -> None:
if name.startswith('so2sat') or name == 'quakeset':
pytest.importorskip('h5py', minversion='3')
pytest.importorskip('h5py', minversion='3.6')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We forgot to update the minimum deps when dropping Python 3.9 support

monkeypatch.setattr(LandCoverAI, 'sha256', sha256)
match name:
case 'chabud':
pytest.importorskip('h5py', minversion='3.6')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was missing and caused the release branch tests to fail without this fix

"""
try:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to fail immediately upon instantiation, h5py is only needed for __getitem__

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, without h5py instantiation won't be useful? I'd rather find out immediately that my dataset won't work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does allow you to download/extract the dataset and plot images, just not load images. I get your point though. How long of a delay is there in realizing that the dataset doesn't work if we don't error until getitem?

@@ -182,13 +181,6 @@ def __init__(
self.num_classes = len(self.classes)
self._verify()

try:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't easy to test missing pyvista when laspy is also missing without this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here -- I think the dataset should fail to instantiate if the deps aren't installed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have an idea that will allow us to fail-fast and still get full coverage, will open another prereq PR with just that change so that this PR can be a test-only PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #2054

@@ -27,6 +27,7 @@ class TestQuakeSet:
def dataset(
self, monkeypatch: MonkeyPatch, tmp_path: Path, request: SubRequest
) -> QuakeSet:
pytest.importorskip('h5py', minversion='3.6')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was missing and caused the tests to fail on the release branch.

@@ -127,10 +116,6 @@ def download_url(url: str, root: str, *args: str) -> None:
shutil.copy(url, root)


def test_mock_missing_module(mock_missing_module: None) -> None:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit tests for our unit tests. I love it.

@adamjstewart adamjstewart marked this pull request as ready for review May 4, 2024 19:25
@adamjstewart adamjstewart marked this pull request as draft May 8, 2024 13:37
@adamjstewart adamjstewart removed this from the 0.6.0 milestone May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasets Geospatial or benchmark datasets testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants