Add latest offset and timestamp to activity stream events #334

Open
Jmgr opened this issue Mar 16, 2021 · 4 comments

Comments

@Jmgr
Contributor

Jmgr commented Mar 16, 2021

We would like to add the latest partition message offset and timestamp to pause/resume and readonly activity stream events, and were wondering how to best implement that.

raftNode.applyOperation returns an ApplyFuture that has a Response() that could be used to return this information. The partition leader, when applyPauseStream is called, could return the latest message offset and timestamp. The metadata leader publishes activity stream messages though, so this information has to get to it somehow.

Does this seem like a good way to implement this feature, or is there a simpler solution?

@tylertreat
Member

This is actually a bit tricky I think. Apply is only called by the metadata leader to apply a Raft operation. The ApplyFuture returned by Apply returns the response from the metadata leader's Raft FSM. The issue is the metadata leader may not be the partition leader (or even a follower), so it does not have the information needed to return in the future. Like you say, this information has to get to the metadata leader somehow to publish the activity message, but I'm not seeing a good way to do that.
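The limitation described above can be illustrated with a small sketch. Everything in it is hypothetical and simplified for illustration: `node`, `logPosition`, and `applyPause` are not Liftbridge types or functions. It only shows that a handler running on a node without a replica of the partition has nothing to put into the future's response.

```go
package main

import "fmt"

// logPosition is a hypothetical pair describing the newest message in a partition.
type logPosition struct {
	Offset    int64
	Timestamp int64 // Unix nanoseconds
}

// node is a simplified stand-in for a server. It only knows the log
// position of partitions for which it hosts a replica.
type node struct {
	name  string
	local map[string]logPosition // keyed by "stream/partition"
}

// applyPause mimics an FSM handler for a pause operation. The boolean
// reports whether this node can supply the latest log position at all.
func (n *node) applyPause(partition string) (logPosition, bool) {
	pos, ok := n.local[partition]
	return pos, ok
}

func main() {
	partitionLeader := &node{
		name:  "b",
		local: map[string]logPosition{"foo/0": {Offset: 41, Timestamp: 1615852800000000000}},
	}
	// The metadata leader hosts no replica of foo/0 in this scenario.
	metadataLeader := &node{name: "a", local: map[string]logPosition{}}

	_, ok := metadataLeader.applyPause("foo/0")
	fmt.Println("metadata leader knows position:", ok)

	pos, ok := partitionLeader.applyPause("foo/0")
	fmt.Println("partition leader knows position:", ok, "offset:", pos.Offset)
}
```

Since the response of the applied operation comes from the metadata leader's FSM, only the first case is visible to the code that publishes the activity event.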

@Jmgr
Contributor Author

Jmgr commented Mar 24, 2021

What if we changed the behavior of Pause and SetReadonly API calls to be directed to a partition leader, like it is done for FetchPartitionMetadata? Each partition leader would then be able to a) pause/set readonly and b) add the latest offset and timestamp to the request before c) sending it to the metadata leader using NATS. The metadata leader would then follow the same process as it is doing now.

I suppose this would make the Pause and SetReadonly calls a bit slower than they are now, and it would generate one Raft event per partition instead of a single event with a list of partitions. This behavior could be made optional when making these calls.
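A minimal sketch of the proposed flow, under stated assumptions: `pauseRequest`, `partitionLog`, `memLog`, and `handlePause` are hypothetical names, and the `forward` callback stands in for sending the request to the metadata leader over NATS. The partition leader fills in the latest offset and timestamp, then forwards one request per partition.

```go
package main

import (
	"errors"
	"fmt"
)

// pauseRequest is a hypothetical, simplified request for a single partition.
type pauseRequest struct {
	Stream          string
	Partition       int32
	LatestOffset    int64 // -1 when the partition is empty
	LatestTimestamp int64 // Unix nanoseconds, 0 when the partition is empty
}

// partitionLog is an assumed view of a partition's commit log.
type partitionLog interface {
	// Latest returns the newest message's offset and timestamp,
	// or ok=false when the log is empty.
	Latest() (offset, timestamp int64, ok bool)
}

// memLog is an in-memory stand-in used only for this example.
type memLog struct {
	offset, timestamp int64
	empty             bool
}

func (l memLog) Latest() (int64, int64, bool) {
	return l.offset, l.timestamp, !l.empty
}

var errNotLeader = errors.New("not the partition leader")

// handlePause runs on the partition leader: it enriches the request with
// the latest log position and hands it to forward, which represents the
// hop to the metadata leader.
func handlePause(req pauseRequest, isLeader bool, log partitionLog, forward func(pauseRequest) error) error {
	if !isLeader {
		return errNotLeader
	}
	req.LatestOffset, req.LatestTimestamp = -1, 0
	if off, ts, ok := log.Latest(); ok {
		req.LatestOffset, req.LatestTimestamp = off, ts
	}
	return forward(req)
}

func main() {
	forward := func(r pauseRequest) error {
		fmt.Printf("forwarding pause for %s/%d at offset %d\n", r.Stream, r.Partition, r.LatestOffset)
		return nil
	}
	req := pauseRequest{Stream: "foo", Partition: 0}
	if err := handlePause(req, true, memLog{offset: 41, timestamp: 1615852800000000000}, forward); err != nil {
		fmt.Println("pause failed:", err)
	}
}
```

Because each forwarded request covers one partition, the metadata leader would apply one Raft operation per partition in this design, which is the cost mentioned above.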

@tylertreat
Member

> What if we changed the behavior of Pause and SetReadonly API calls to be directed to a partition leader, like it is done for FetchPartitionMetadata? Each partition leader would then be able to a) pause/set readonly and b) add the latest offset and timestamp to the request before c) sending it to the metadata leader using NATS. The metadata leader would then follow the same process as it is doing now.

This could work, though the Pause/SetReadonly behavior needs to occur as a result of the Raft FSM anyway, so the only real reason to direct the request to the partition leader would be so it could send the latest offset/timestamp to the metadata leader. The biggest issue I am seeing with this is that Pause/SetReadonly operate on multiple partitions, so it's not really feasible to direct the request to the partition leader since there may be multiple leaders.

Another option would be to have the metadata leader request this information from the partition leaders, but this introduces failure cases. For example, what do we do if the RPC times out? There would be no guarantee that the information is present in the activity event.

@Jmgr
Contributor Author

Jmgr commented Mar 29, 2021

> The biggest issue I am seeing with this is that Pause/SetReadonly operate on multiple partitions, so it's not really feasible to direct the request to the partition leader since there may be multiple leaders.

Couldn't the client send the request to all partition leaders? We could also have a PausePartition and SetPartitionReadonly that only operates on one partition.
