Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete Product CLI Command #1583

Open
wants to merge 6 commits into
base: develop-1.9
Choose a base branch
from
Open

Delete Product CLI Command #1583

wants to merge 6 commits into from

Conversation

Ariana-B
Copy link
Contributor

@Ariana-B Ariana-B commented Apr 24, 2024

Reason for this pull request

Functionality to delete products via the CLI has long been requested.

Proposed changes

  • Add datacube product delete CLI command that deletes the specified products and all related datasets. By default, products with active datasets will not be permitted to be deleted, but this can be overwritten with the --force option.

  • Misc cleanup in surrounding code; mostly minor changes that should have no user impact

  • Mark search_unique_datasets as deprecated since without multiple locations, it doesn't seem to provide anything beyond what search_datasets or search_returning already do.

  • Closes Deleting Products from ODC Index #177

  • Tests added / passed

  • Fully documented, including docs/about/whats_new.rst for all changes

Copy link

codecov bot commented Apr 24, 2024

Codecov Report

Attention: Patch coverage is 95.95960% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 85.84%. Comparing base (773bf08) to head (ed1f931).

Files Patch % Lines
datacube/index/memory/_datasets.py 70.00% 3 Missing ⚠️
datacube/index/postgis/_datasets.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           develop-1.9    #1583      +/-   ##
===============================================
+ Coverage        85.80%   85.84%   +0.04%     
===============================================
  Files              140      140              
  Lines            15402    15469      +67     
===============================================
+ Hits             13215    13279      +64     
- Misses            2187     2190       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -641,6 +641,14 @@ def add_document(self, definition: JsonDict) -> Product:
type_ = self.from_doc(definition)
return self.add(type_)

@abstractmethod
def delete(self, product: Product):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a ODC-beginner, it's hard to figure out if delete is guaranteed to succeed. Is there a reason return types are frequently omitted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The omission of a return type was my mistake, I've updated it to be None, in line with the dataset purge method. Beyond that, is there a more specific return type you would suggest to be more beginner-friendly?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen some surprising types in my drive-bys of the pull requests in these repositories, but my main issue is missing type annotations. As a rule of thumb, input types help me understand what a function is working with/on, and return types generally help me understand if a function is expected to sometimes fail (None/bool) or produces/assembles some kind of data (string/dict/dataclass/etc).

I don't know if it makes sense for opendatacube, but the Ruff guy (Charlie Marsh) wrote about Mypy settings they used when he was working at Spring in a blog post: https://notes.crmarsh.com/using-mypy-in-production-at-spring

Even my own code, which is significantly smaller than opendatacube, doesn't pass with what he was running in large scale production, but if you want the pyproject.toml excerpt recommended without reading his post, here's what I believe the settings are:

[tool.mypy]
namespace_packages = true
disallow_untyped_defs = true
disallow_any_unimported = true
no_implicit_optional = true
check_untyped_defs = true
warn_return_any = true
show_error_codes = true
warn_unused_ignores = true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do run MyPy on this code, and it did pass (hmm). Thanks for the blog link, I'll play around with the settings and see what makes the most sense in a future PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're in the process of systematically typehinting a large complex codebase that has historically had major modules that were either not typehinted at all (or worse, were typehinted incorrectly).

We've come a long way in a relatively short period of time, but there's still a lot of work to be done in this space.

@pjonsson
Copy link

Definitely want this. Will this be backported to 1.8, or will datacube-explorer/datacube-ows work with 1.9 when that is released?

@Ariana-B
Copy link
Contributor Author

Ariana-B commented May 1, 2024

Definitely want this. Will this be backported to 1.8, or will datacube-explorer/datacube-ows work with 1.9 when that is released?

@pjonsson Based on previous gh issues and discussions, it seems like this is best kept to 1.9+ - @SpacemanPaul would probably have deeper insight into this.
Either way, work is currently being done in datacube-ows and datacube-explorer for 1.9 integration.

@SpacemanPaul
Copy link
Contributor

Definitely want this. Will this be backported to 1.8, or will datacube-explorer/datacube-ows work with 1.9 when that is released?

  1. Backporting product delete to 1.8 is not a priority for the core development team (we've been making do without for years and have scripts for it) but we would be happy to consider a PR from an outside contributor (hint, hint).
  2. OWS and Explorer will (eventually) have 1.9-compatible releases. OWS already has a develop-1.9 branch which is under active development. Work on Explorer for 1.9 has not started yet, but is definitely on the cards. However, I am expecting Explorer to be the most challenging ODC package to migrate to 1.9 and I can't offer an estimate of when that work will be complete yet.

Copy link
Contributor

@SpacemanPaul SpacemanPaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I think we should move the logic up from the CLI to the index API - as per comments.

datacube/drivers/postgis/_api.py Show resolved Hide resolved
datacube/drivers/postgres/_api.py Show resolved Hide resolved
datacube/drivers/postgres/_api.py Show resolved Hide resolved

:param product: Product to be deleted
"""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the signature should be:

delete(self, product: Product, delete_archived_only: bool = True)

raise an exception if there exist archived datasets and delete_archived_only is true.

(Edit: i.e. move this logic up from the CLI)

Add an exception to datacube.index.exceptions - and impose an IndexException superclass while you're there.

Also catch the low level and wrap it in an IndexException if deleting the product (or dataset) throws a database error - rather than just letting the low level exception bubble up as dataset.purge does currently.

datacube/scripts/dataset.py Show resolved Hide resolved
active_product = True

if active_product:
if not force:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Push this logic up to the index API via the delete_archived_only flag discussed above.

@@ -10,6 +10,7 @@ v1.9.next

- Fix typos in docs (:pull:`1577`)
- Merge in 1.8.x branch changes. (:pull:`1568`, :pull:`1579`)
- Add Product delete methods to API and command in CLI, plus misc cleanup of the surrounds (:pull:`1583`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to mention here that user requests for this feature have been around longer than the name of the project.

@pjonsson
Copy link

pjonsson commented May 1, 2024

Backporting product delete to 1.8 is not a priority for the core development team (we've been making do without for years and have scripts for it) but we would be happy to consider a PR from an outside contributor (hint, hint).

I understand your priorities, and I think they are the right ones. My need to delete things is still months/quarters away, so it's possible explorer for 1.9 will be really close and I can put some effort in there to help out, or you are already done with the porting effort and I will be happily using 1.9 by that time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants