Most of this is only a direction on paper right now; I'll try to add details here.
The idea is to move away from the current caching and execution model (instantiate every module in the persistent pipeline according to signatures, then recurse inside the Modules themselves) to an external model that drives the execution as needed (this would allow #1060).
Modules would become very dumb execution functions with metadata for the interpreter (we can make them functions instead of classes; I'm also thinking about versioning the package API for future-proofing).
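To make the "function plus metadata" idea concrete, here is a rough sketch. Nothing in it is existing API: `ModuleSpec`, the `module` decorator, and the `api_version` field are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class ModuleSpec:
    """Hypothetical metadata record that the interpreter would read."""
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    api_version: int
    compute: Callable[..., Dict[str, Any]]


def module(name, inputs, outputs, api_version=1):
    """Decorator that attaches metadata to a plain function (illustrative)."""
    def wrap(func):
        return ModuleSpec(name, tuple(inputs), tuple(outputs), api_version, func)
    return wrap


@module("Add", inputs=("a", "b"), outputs=("sum",))
def add(a, b):
    # No recursion into upstream modules: the function only sees the
    # input values handed to it and returns its named outputs.
    return {"sum": a + b}


result = add.compute(a=2, b=3)
```

The point of the sketch is that the function knows nothing about the pipeline; everything the interpreter needs to schedule it lives in the metadata.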
The multiple levels of interpreters used for groups and subworkflows would disappear: modules would simply be able to add more modules to the workflow during execution, which would unify with looping and should address #765. Streaming needs to be built into this, removing the need to pass generators around.
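A minimal sketch of what a single external driver could look like, assuming a workflow is a plain dict and that a module returns its value together with any modules it wants to add. The `run` function and the dict layout are made up for this example; it leaves out streaming and assumes the graph is acyclic with every upstream id present.

```python
from collections import deque


def run(workflow):
    """Drive execution from outside the modules.

    `workflow` maps a module id to (function, list of upstream ids).
    Each function returns (value, extra), where `extra` is a dict of new
    modules, in the same format, to add to the running workflow.
    """
    workflow = dict(workflow)  # do not mutate the caller's dict
    results = {}
    pending = deque(workflow)
    while pending:
        mod_id = pending.popleft()
        func, upstream = workflow[mod_id]
        if any(u not in results for u in upstream):
            pending.append(mod_id)  # inputs not ready yet, retry later
            continue
        value, extra = func(*[results[u] for u in upstream])
        results[mod_id] = value
        for new_id, spec in extra.items():
            workflow[new_id] = spec
            pending.append(new_id)
    return results


def source():
    return 3, {}


def expand(n):
    # Add one module per item at run time; a loop unfolds the same way.
    extra = {
        f"square_{i}": (lambda _n, i=i: (i * i, {}), ["expand"])
        for i in range(n)
    }
    return n, extra


results = run({"source": (source, []), "expand": (expand, ["source"])})
```

Here `expand` plays the role of a group or loop: there is no nested interpreter, just more entries in the same queue.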
The cache would be used during the execution itself, which would allow more possibilities than simply "cacheable" or "notcacheable". A module would output a cache signature along with its output, which would allow a module to be rechecked while its downstream results stay cached, or allow different modules to hit the same cache key (e.g. the DownloadFile module could just use a SHA-1 of the file as the cache key). Caching to disk could also be added here in time (#640).
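Roughly, the cache-signature idea could work like the sketch below. The helper names (`download_file`, `run_cached`, `fake_fetch`) and the URLs are placeholders rather than existing code, and the download is faked so that the example runs offline.

```python
import hashlib

cache = {}
calls = []


def download_file(url, fetch):
    """Always re-runs (is rechecked), but its signature is the content hash."""
    data = fetch(url)
    return data, hashlib.sha1(data).hexdigest()


def count_bytes(data):
    calls.append("count_bytes")  # record real executions for the demo
    return len(data)


def run_cached(name, func, upstream_value, upstream_signature):
    """Reuse a downstream result when the upstream signature is unchanged."""
    key = hashlib.sha1(f"{name}:{upstream_signature}".encode()).hexdigest()
    if key not in cache:
        cache[key] = func(upstream_value)
    return cache[key]


def fake_fetch(url):
    # Stands in for a real download; both URLs serve the same bytes.
    return b"same contents"


for url in ("http://example.org/a", "http://example.org/mirror-of-a"):
    data, signature = download_file(url, fake_fetch)
    size = run_cached("count_bytes", count_bytes, data, signature)
```

The download happens on both passes, but `count_bytes` only executes once, because the two URLs produce the same signature and therefore the same downstream cache key.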
TODO: more details, more code