Pipeline Collisions Occur When They Shouldn't #30

Open
epociask opened this issue Apr 17, 2023 · 0 comments
Labels: needs more info, type: bug


epociask commented Apr 17, 2023

Bug Description

The current DAG implementation adds edges between existing components whenever their ComponentID values match. As a result, a pipeline that performs a backfill can collide with one that is already live. The same applies to backtesting pipelines.

Example Scenario

Two pipelines are shown below (RP0, RP1), both of which use the same exact components, but require backfilling from some starting heights (x0, x1) where:

x in Z+ and x <= current_block_height

[Figure: Pessimism - Merging Pipelines (2)]

In this example, since RP0.component_set = RP1.component_set, the EtlManager will treat RP1 as a duplicate pipeline because their respective pipeline IDs are equal. RP1 will then power its invariant from whatever point in chain history RP0 has reached, so the invariant fails to backfill from its own starting height.
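The collision can be reproduced with a minimal sketch. Everything here is hypothetical and only models the behavior described above: the type names, the `pipelineIDFromComponents` helper, and the toy `manager` are illustrative stand-ins, not the project's actual EtlManager code. The key assumption is that the pipeline ID is derived from the component set alone, so the starting height never influences it.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ComponentID and PipelineID are hypothetical stand-ins for the real ID types.
type ComponentID string
type PipelineID string

// RegisterPipeline is a hypothetical registration request.
type RegisterPipeline struct {
	Components  []ComponentID
	StartHeight uint64 // x: the height to backfill from
}

// pipelineIDFromComponents derives an ID from the component set only.
// StartHeight is ignored, which is what allows the collision.
func pipelineIDFromComponents(rp RegisterPipeline) PipelineID {
	ids := make([]string, 0, len(rp.Components))
	for _, c := range rp.Components {
		ids = append(ids, string(c))
	}
	sort.Strings(ids)
	return PipelineID(strings.Join(ids, "|"))
}

// manager is a toy stand-in for the EtlManager.
type manager struct {
	pipelines map[PipelineID]RegisterPipeline
}

// register returns the pipeline ID and whether the request was
// treated as a duplicate of an already registered pipeline.
func (m *manager) register(rp RegisterPipeline) (PipelineID, bool) {
	id := pipelineIDFromComponents(rp)
	if _, ok := m.pipelines[id]; ok {
		return id, true
	}
	m.pipelines[id] = rp
	return id, false
}

func main() {
	m := &manager{pipelines: map[PipelineID]RegisterPipeline{}}
	comps := []ComponentID{"oracle", "pipe"}

	_, dup0 := m.register(RegisterPipeline{Components: comps, StartHeight: 100})
	_, dup1 := m.register(RegisterPipeline{Components: comps, StartHeight: 500})

	fmt.Println("RP0 duplicate:", dup0)
	fmt.Println("RP1 duplicate:", dup1)
}
```

In this sketch RP1 is reported as a duplicate even though its starting height differs from RP0's, so it would never backfill from its own x1.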

Note that in this scenario x0 may equal x1 but does not have to. The presence of some x on a registerPipeline request denotes that the pipeline has backfill requirements to run. Each pipeline is therefore in one of two states, where x advances monotonically as the pipeline syncs:
I. Syncing while x < current_block_height
II. Live once x >= current_block_height
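The two states can be expressed as a small function of x and the current block height. This is only a sketch of the rule stated above; `PipelineState`, `Syncing`, `Live`, and `stateAt` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// PipelineState is a hypothetical enum for the two states described above.
type PipelineState int

const (
	Syncing PipelineState = iota // x < current_block_height
	Live                         // x >= current_block_height
)

func (s PipelineState) String() string {
	if s == Syncing {
		return "syncing"
	}
	return "live"
}

// stateAt classifies a pipeline whose cursor is at height x,
// given the current chain height.
func stateAt(x, currentBlockHeight uint64) PipelineState {
	if x < currentBlockHeight {
		return Syncing
	}
	return Live
}

func main() {
	const currentBlockHeight = 100
	for _, x := range []uint64{90, 99, 100} {
		fmt.Printf("x=%d state=%s\n", x, stateAt(x, currentBlockHeight))
	}
}
```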

Problem Solution

Introduce access management logic that performs additional integrity checks before reusing an existing component. I.e., the presence of a sync flag on a pipeline makes it globally unique, so its components cannot be reused.

For identical pipelines (P0, P1) that share the same IDs but have backfilling toggled on, the pipelines would merge once their states have transitioned (syncing --> live).
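One possible shape for this access check is sketched below. It is not the project's implementation: the `registry` type and its `register` and `promote` methods are hypothetical, and the sketch assumes that a pipeline with the sync flag set is tracked separately from reusable live pipelines until it reaches the chain tip.

```go
package main

import "fmt"

// pipeline is a hypothetical, simplified pipeline record.
type pipeline struct {
	id      string // derived from the component set, as in the issue
	syncing bool   // true while the pipeline still has backfill work to do
}

// registry is a toy stand-in for the manager that owns pipelines.
type registry struct {
	live    map[string]*pipeline // reusable pipelines, keyed by ID
	syncing []*pipeline          // globally unique pipelines, never reused
}

func newRegistry() *registry {
	return &registry{live: make(map[string]*pipeline)}
}

// register returns the pipeline serving the request and whether an existing
// pipeline was reused. A request with the sync flag set never reuses anything.
func (r *registry) register(id string, needsSync bool) (*pipeline, bool) {
	if needsSync {
		p := &pipeline{id: id, syncing: true}
		r.syncing = append(r.syncing, p)
		return p, false
	}
	if p, ok := r.live[id]; ok {
		return p, true
	}
	p := &pipeline{id: id}
	r.live[id] = p
	return p, false
}

// promote handles the syncing --> live transition and returns the pipeline
// that survives: an existing live pipeline with the same ID if there is one
// (a merge), otherwise p itself.
func (r *registry) promote(p *pipeline) *pipeline {
	for i, q := range r.syncing {
		if q == p {
			r.syncing = append(r.syncing[:i], r.syncing[i+1:]...)
			break
		}
	}
	p.syncing = false
	if existing, ok := r.live[p.id]; ok {
		return existing
	}
	r.live[p.id] = p
	return p
}

func main() {
	r := newRegistry()
	p0, _ := r.register("same-id", true)
	p1, _ := r.register("same-id", true)
	fmt.Println("distinct while syncing:", p0 != p1)

	m0 := r.promote(p0)
	m1 := r.promote(p1)
	fmt.Println("merged once live:", m0 == m1)
}
```

Keeping syncing pipelines out of the ID-keyed map is one way to guarantee they can never be matched by a later request; a real implementation would also need to hand P1's invariants over to the surviving pipeline at merge time, which this sketch leaves out.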

epociask added the type: bug label Apr 17, 2023
epociask self-assigned this Apr 17, 2023