Live Replay #2740

Open
bavodaniels opened this issue Jun 8, 2023 · 6 comments
Labels
Status: Under Discussion Use to signal that the issue in question is being discussed. Type: Feature Use to signal an issue is completely new to the project.

Comments

@bavodaniels

Feature Description

Add another variant of an event processor that performs a replay while still accepting new events.

Current Behaviour

Currently, when you trigger a replay, the event processor goes back to event number 1 and processes events one by one until it reaches the tail. While this is happening, changes from new events aren't persisted into the query model (QM).

Wanted Behaviour

When a replay is triggered on the newly implemented TrackingEventProcessor variant, it processes old events as a normal replay would, but as soon as new events are added it switches to processing those and then returns to the replay. If the processor has two or more threads, it does both at the same time.
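The single-threaded variant of this behaviour can be sketched in plain Java. This is only an illustration under assumptions: `LiveFirstScheduler` and its methods are hypothetical names, not Axon Framework API, and events are reduced to strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these names are Axon Framework API.
final class LiveFirstScheduler {
    private final Deque<String> replayBacklog = new ArrayDeque<>();
    private final Deque<String> liveEvents = new ArrayDeque<>();

    void enqueueReplay(String event) { replayBacklog.addLast(event); }

    void enqueueLive(String event) { liveEvents.addLast(event); }

    // New events take priority; the replay resumes once none are pending.
    String next() {
        if (!liveEvents.isEmpty()) {
            return liveEvents.pollFirst();
        }
        return replayBacklog.pollFirst(); // null when both queues are empty
    }

    boolean hasNext() {
        return !liveEvents.isEmpty() || !replayBacklog.isEmpty();
    }

    // Small walk-through: a new event arrives while the replay is under way.
    static List<String> demo() {
        LiveFirstScheduler scheduler = new LiveFirstScheduler();
        scheduler.enqueueReplay("old-1");
        scheduler.enqueueReplay("old-2");
        scheduler.enqueueReplay("old-3");

        List<String> handled = new ArrayList<>();
        handled.add(scheduler.next());   // replay handles the first old event
        scheduler.enqueueLive("new-1");  // a new event is published mid-replay
        while (scheduler.hasNext()) {
            handled.add(scheduler.next());
        }
        return handled;
    }
}
```

In this sketch the new event is handled before the remaining old events, so the handling order differs from the publication order. That is the mixed state a real implementation would have to account for.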

Possible Workarounds

You could mimic the wanted behaviour by having two event processors fill the same QM, where the event handlers of one processor check whether they are replaying, so each new event isn't handled twice.

Let's say:

  • MyEventProcessor
    • a normal TEP
  • MyReplayingEventProcessor
    • has event handlers that check whether they are replaying; if not, they don't do anything

To trigger a replay, you trigger it on MyReplayingEventProcessor. The QM is then filled in again by the replay, and in the meantime new events are still accepted and persisted in the QM.
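A minimal sketch of the two handler styles in this workaround follows. Axon Framework can pass a `ReplayStatus` to event handler methods; the enum below is a local stand-in so the example compiles without Axon on the classpath. `SharedQueryModel`, `LiveProjection`, and `ReplayOnlyProjection` are hypothetical names, and the QM is reduced to an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for the replay flag; not the Axon Framework type itself.
enum ReplayStatus { REGULAR, REPLAY }

// Stand-in for the shared QM table: row id -> running total.
final class SharedQueryModel {
    final Map<String, Integer> rows = new HashMap<>();
}

// Counterpart of MyEventProcessor: handles every new event it receives.
final class LiveProjection {
    private final SharedQueryModel qm;

    LiveProjection(SharedQueryModel qm) { this.qm = qm; }

    void on(String rowId, int amount) {
        qm.rows.merge(rowId, amount, Integer::sum);
    }
}

// Counterpart of MyReplayingEventProcessor: does nothing unless replaying.
final class ReplayOnlyProjection {
    private final SharedQueryModel qm;

    ReplayOnlyProjection(SharedQueryModel qm) { this.qm = qm; }

    void on(String rowId, int amount, ReplayStatus status) {
        if (status != ReplayStatus.REPLAY) {
            return; // new events are left to the live projection
        }
        qm.rows.merge(rowId, amount, Integer::sum);
    }
}
```

Because the replay-only handler ignores events it sees outside a replay, an event that reaches both processors is applied to the QM once.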

Context for this feature

When we did a replay at 23 million events, it took us 3 months. We are now at 150 million events.
We duplicated the QM and TEP, did the replay, switched the tables around, and then removed the old table.
(This replay is on a table that is hard to optimize due to its nature; other tables replay much quicker.)

P.S. I might try to pick this up if you deem it a useful addition to Axon Framework.

@bavodaniels bavodaniels added the Type: Feature Use to signal an issue is completely new to the project. label Jun 8, 2023
@smcvb
Member

smcvb commented Jun 12, 2023

First, thanks for filing this feature suggestion with us.

I'm going to share some pointers to think about that may impact the design of the feature:

  • The TrackingToken should maintain both the current and live positions. Perhaps you even want to know the position when the replay started (as is the case with AF right now). That would thus require a TrackingToken implementation containing two to three tokens. Each of these tokens can be of differing types, as this needs to be supported for Gap Aware Tokens (RDBMS event stores), Mongo Tokens, and Global Tokens (Axon Server). Note that a level of nesting comes into play here too, where you perhaps have ReplayTokens with ReplayTokens in them. When building a nesting token containing several tokens, you can check how the ReplayToken and MergeToken fulfill this behavior.
  • Secondly, I assume the logic to switch between old and new events should come from the StreamableMessageSource. Doing so ensures the logic is in one place instead of dispersed over the TrackingEventProcessor and PooledStreamingEventProcessor.
  • Lastly (at least for now), it's good to also think about how to serve the tasks at hand that currently undergo such a "live replay" as you call it. How will queries react? Isn't there a chance users get incorrect data, as the Query Model is within a mixed state of some old events and all new events? In other words, I would like to ask you how your query models behave such that this mixture of events makes sense as query results in a production system. I believe knowing this holds value for others in understanding whether such a feature is valuable.
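To make the first point concrete, here is a rough sketch of a position holder that tracks the three values mentioned: the position when the replay started, the replay position, and the live position. It is an assumption-heavy simplification: `LiveReplayPosition` is a hypothetical name, and plain `long` indexes stand in for real TrackingToken implementations, which (as noted above) can be of differing types and can be nested.

```java
// Hypothetical sketch; long positions stand in for real TrackingToken types.
final class LiveReplayPosition {
    final long positionAtReset; // head of the stream when the replay was triggered
    final long replayPosition;  // index of the last old event handled, -1 if none
    final long livePosition;    // index of the last new event handled

    LiveReplayPosition(long positionAtReset, long replayPosition, long livePosition) {
        this.positionAtReset = positionAtReset;
        this.replayPosition = replayPosition;
        this.livePosition = livePosition;
    }

    // A replay triggered while the stream head is at the given index.
    static LiveReplayPosition startedAt(long head) {
        return new LiveReplayPosition(head, -1, head);
    }

    // One more old event has been handled.
    LiveReplayPosition advanceReplay() {
        return new LiveReplayPosition(positionAtReset, replayPosition + 1, livePosition);
    }

    // One more new event has been handled.
    LiveReplayPosition advanceLive() {
        return new LiveReplayPosition(positionAtReset, replayPosition, livePosition + 1);
    }

    // The replay is done once it has reached the position recorded at reset.
    boolean isReplaying() {
        return replayPosition < positionAtReset;
    }
}
```

The instances are immutable, so each advance yields a new value that could be stored atomically. A gap-aware or nested token would need richer comparison logic than the `<` used here.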

@smcvb smcvb added the Status: Under Discussion Use to signal that the issue in question is being discussed. label Jun 12, 2023
@bavodaniels
Author

On the last point.

What I've also been thinking about is the following:
What if you could use a different SequencingPolicy, one where all events for a given row in the QM come in order without any other event in between? The processor would then update the whole row at once, and there would be a near-zero chance of a user querying the QM and getting incorrect data.

I'm aware this might open another can of worms and there will probably be a performance impact, but some people might choose this.
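The ordering described here can be illustrated with a small plain-Java sketch. It is not an Axon SequencingPolicy implementation; `RowSequencing` and `contiguousPerRow` are hypothetical names, and the function that maps an event to its QM row id is assumed to be supplied by the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: reorders a backlog so each row's events are contiguous.
final class RowSequencing {

    // Groups events per row, keeping the original order inside each row.
    static <E> List<E> contiguousPerRow(List<E> events, Function<E, String> rowIdOf) {
        Map<String, List<E>> perRow = new LinkedHashMap<>();
        for (E event : events) {
            perRow.computeIfAbsent(rowIdOf.apply(event), key -> new ArrayList<>())
                  .add(event);
        }
        List<E> ordered = new ArrayList<>();
        for (List<E> rowEvents : perRow.values()) {
            ordered.addAll(rowEvents); // a whole row is rebuilt in one go
        }
        return ordered;
    }
}
```

For example, the backlog `a:1, b:1, a:2, b:2` with the row id taken from the first character becomes `a:1, a:2, b:1, b:2`. Note that this changes the relative order of events across rows, so it only suits query models whose rows are independent of each other.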

I'm just trying to solve the hassle we have had in the past of

  1. duplicating an event handler and its QM in the database
  2. having it finish the replay in the copied QM
  3. removing the old event handler and its QM
  4. switching the naming over to the new event handler and QM

This takes us several releases.

@smcvb
Member

smcvb commented Jun 13, 2023

I'm just trying to solve the hassle we have had in the past of

Understandable, @bavodaniels!
Hence why I value the communication on this issue.

Just a thought, but would a notification that your processors have finished replaying allow for a setup that automatically switches the database from the old to the new QM store?
I've developed an (Axon Framework 2) application in the past that used the following process in a production system to update a QM:

  1. The replay is triggered.
  2. The trigger causes the construction of a new table with a unique name. Querying would proceed on the old table, by means of an alias.
  3. Replay would proceed for x amount of time.
  4. We'd be notified when the replay is done and the QM is caught up sufficiently with the new events. On this notification, the alias switched from the old table to the new table. Once successful, the old table was dropped.
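The alias switch in these steps can be sketched in plain Java, with in-memory maps standing in for database tables. This is only an illustration of the idea, not the original application's code: `AliasedQueryModel` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: maps stand in for the old and new QM tables.
final class AliasedQueryModel {
    // The "alias": queries always read whichever table it currently points to.
    private final AtomicReference<Map<String, String>> alias;

    AliasedQueryModel(Map<String, String> initialTable) {
        this.alias = new AtomicReference<>(initialTable);
    }

    // Queries go through the alias and never name a table directly.
    String query(String rowId) {
        return alias.get().get(rowId);
    }

    // Step 2: a fresh table is created next to the one being queried.
    Map<String, String> newTable() {
        return new ConcurrentHashMap<>();
    }

    // Step 4: repoint the alias; the returned old table can then be dropped.
    Map<String, String> switchTo(Map<String, String> rebuiltTable) {
        return alias.getAndSet(rebuiltTable);
    }
}
```

Until `switchTo` is called, queries keep seeing the old table in full, so readers never observe a partially replayed QM.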

@bavodaniels
Author

That would work as well.

@bavodaniels
Author

For my part, you can close this feature request unless you see added value in keeping it.

@smcvb
Member

smcvb commented Jun 16, 2023

I think it's still worth discussing this with the team, as perhaps we can do something here.
Whether that's your suggested solution or the one I highlighted, both would merit adjustments or additional support to make them easier.
As such, I'll leave the issue open.
Nonetheless, thanks for the nudge, @bavodaniels!
