Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ToDo: Refactor SyncedRepo class and its subclasses ModulesRepo and WorkflowRepo #2940

Open
MatthiasZepper opened this issue Apr 28, 2024 · 0 comments
Labels
enhancement infrastructure modules Related to tools for working with DSL2 modules subworkflows

Comments

@MatthiasZepper
Copy link
Member

MatthiasZepper commented Apr 28, 2024

This issue is a reminder that somebody (maybe at the Boston Hackathon?) should look into the three classes and check for unnecessary code duplications?

At the moment, we have two use cases of locally cached Git repositories that are cloned/updated/synced when needed:

  1. The Module repository for the nf-core modules and nf-core subworkflows functionalities.
  2. Workflow repositories if pipelines are downloaded with nf-core download as --platform downloads.

Conceptually, the idea was that most static and utility functions needed could be shared by the two and thus two subclasses ModulesRepo and WorkflowRepo were defined that inherit from their SyncedRepo superclass.

Because both subclasses define their own __init__() functions, it for example went entirely unnoticed that the __init__() function of the SyncedRepo calls self.setup_local_repo(), which is only defined for the subclasses.

Also, the SyncedRepo instances don't have a self.repo attribute, but several associated class methods use it nonetheless. This has not caused noticeable bugs yet, because the superclass is never initiated itself, but only the subclasses, which have a self.setup_local_repo() method that creates the self.repo attribute on the subclass instances.

Apart from making mypy unhappy when touching the old code and trying to commit, this is evidently far from ideal.

Hence, I think it would be good to look into those three classes. The bare minimum would be to sort out the __init__() function of the SyncedRepo class and the missing self.repo attribute, but likely this is only the tip of the iceberg.

So it would likely be required to comprehensively assess how many common and distinct methods each subclass has and either consolidate them better in the superclass or strip the superclass down in favor of the subclasses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement infrastructure modules Related to tools for working with DSL2 modules subworkflows
Projects
Status: To do
Development

No branches or pull requests

1 participant