Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement transaction action plugins #13842

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jaimergp
Copy link
Contributor

Description

Closes #13795

Checklist - did you ...

  • Add a file to the news directory (using the template) for the next release's release notes?
  • Add / update necessary tests?
  • Add / update outdated documentation?

@conda-bot conda-bot added the cla-signed [bot] added once the contributor has signed the CLA label Apr 24, 2024
@jaimergp
Copy link
Contributor Author

Currently deciding where to add hooks. I started by adding a single generic CondaTransactionAction, but that API assumes very little about when it will be called and with which parameters.

The meat happens in UnlinkLinkTransaction. Roughly we have something like:

  • Actions called per package record
  • Actions called per path individually (each file and directory)
  • Actions called for several paths in bulk
  • Actions called per environment

These are duplicated for unlink & link. So we have something like eight possible hooks:

  • Unlink package record hook
  • Unlink individual path hook
  • Unlink multiple paths hook
  • Unlink environment hook
  • Link package record hook
  • Link individual path hook
  • Link multiple paths hook
  • Link environment hook

The API is split between Actions (unaware of the transaction order) and ActionGroups (which are directly handled by the transaction).

I'll need to prototype a bit more to see how this looks like.

@jaimergp
Copy link
Contributor Author

The other alternative is to forget about the Action API and simply add a series of pre/post hooks in each part of the LinkUnlinkTransaction API:

  • pre_prepare, post_prepare
  • pre_verify, post_verify
  • pre_execute, post_execute
  • pre_reverse, post_reverse

Still eight hooks, which don't guarantee execution order, and would force plugins to inspect the UnlinkLinkTransaction instance to find the objects they need, which means more potential for API breakage.

@@ -332,3 +333,48 @@ def conda_settings():
aliases=("example_option_alias",),
)
"""

@_hookspec
def conda_transaction_action(self) -> Iterable[CondaTransactionAction]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per pluggy parlance hooks should be plural since they are implemented not as singular instances but always as a list, or in our case generators as we're using yield



@dataclass
class CondaTransactionAction:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

action action 💥

@jaimergp
Copy link
Contributor Author

jaimergp commented May 3, 2024

Ay further comments here @jezdez (or anyone from @conda/conda-core)? If not I think I'll proceed with the solution described at #13842 (comment).

@jezdez
Copy link
Member

jezdez commented May 3, 2024

Ay further comments here @jezdez (or anyone from @conda/conda-core)? If not I think I'll proceed with the solution described at #13842 (comment).

Still eight hooks, which don't guarantee execution order, and would force plugins to inspect the UnlinkLinkTransaction instance to find the objects they need, which means more potential for API breakage.

That's going to set in stone the un-API that UnlinkLinkTransaction is, which isn't really about just unlinking IIUC. UnlinkLinkTransaction is totally named wrong, it's in fact just the Transaction, capital T.

The tricky thing is that we're trying extend different API levels, that are stacked and have different constraints, some related to the transaction order, some not. That's going to make things a lot harder for plugin authors to understand how to build reasonable plugins without breaking the API.

We don't neccearily need a pre and post step for every action (3. below), e.g. calling a hook plugin for every file that is being handled sounds risky in terms of performance for example. While calling a plugin with the list of files that the action is about to be taken, that might not be as problematic.

There are three level all weirdly intertwined:

  1. We derive the first level from the values the transation is handling on a high level:
  • CondaTransaction(Pre|Post)Unlink (gets list of package records)
  • CondaTransaction(Pre|Post)Link (gets list of package records)
  • CondaTransaction(Pre|Post)Removal (gets list of matchspec)
  • CondaTransaction(Pre|Post)Update (gets list of matchspec)

Those are just for data preperation, I wonder if those could also be done with pre and post solver plugins?

The other alternative is to forget about the Action API and simply add a series of pre/post hooks in each part of the LinkUnlinkTransaction API:

  • pre_prepare, post_prepare
  • pre_verify, post_verify
  • pre_execute, post_execute
  • pre_reverse, post_reverse

Still eight hooks, which don't guarantee execution order, and would force plugins to inspect the UnlinkLinkTransaction instance to find the objects they need, which means more potential for API breakage.

  1. That's the second level, which will all get a transaction instance passed (previously called "LinkUnlinkTransaction".

  2. And then in addition we port the (Path)Action type to being a plugin based interface, we already have specific plugins built-in (that can't be overwritten): "unlink" , "remove_menus", "unregister", "link", "entry_point", "compile", "make_menus", "record", "register", each could live in a separate module (conda.transaction.actions.*), implement seperate phases of the transaction and plugin authors could add aditional actions.

@jaimergp
Copy link
Contributor Author

jaimergp commented May 9, 2024

Thanks for the detailed reply @jezdez! I need to think a little bit more about it, but for now:

Those are just for data preperation, I wonder if those could also be done with pre and post solver plugins?

The limitation I see there is that the post-solve hook is run before the IO work has started. For example, directories might not be in place yet, nor the artifacts have been downloaded, etc. For example, if I want to write some files, I could anticipate where to write it (but we don't have any info about the target prefix the solver was passed!). However, if for some reason the transaction fails (bad download, disk full, etc), there won't be any rollbacks for the post-solve actions. This hook is for me more useful to prevent the transaction from happening, before IO work begins.

So in a nutshell, what do you think it's the sweet spot here? I think we can forget about your level (1), focus on number (2), and then at some point tackle (3) in a refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed [bot] added once the contributor has signed the CLA
Projects
Status: 🆕 New
Development

Successfully merging this pull request may close these issues.

Plugin system: transaction action plugins
3 participants