Not able to measure the time it take for log from being dispatched to being committed and applied to the FSM #510

dhiaayachi · 2022-06-14T18:22:22Z

Currently raft have multiple metrics to measure the time it take for a log to go through different stages:

raft.commitTime: measure the time it take for log to be written to disk and replicated
raft.fsm.enqueue: measure the time it take for a log to be enqueued to the queue of logs that are to be applied to the FSM
raft.fsm.apply: measure the time it take for a log to be applied to the FSM

Those metrics measure separately different stages of the log in raft but don't account for the time that a log spend in the FSM queue waiting to be applied.

Also, the replication time is aggregated with the disk write time, which make it harder to deduce the reason of a high raft.commitTime. That said, a high raft.commitTime in conjunction with a high raft.LastContact can be safely attributed to the replication taking too long.

I suggest to add 3 metrics:

raft.replicationTime: measure the time it take to replicate the log to enough nodes
raft.applyTime: measure the time from a log being dispatched to after it's applied to the FSM
raft.fsm.queueWait: measure the time from the log get enqueued to the fsm queue to being picked up to be applied to the fsm

The text was updated successfully, but these errors were encountered:

banks · 2022-06-14T19:16:35Z

Also, the replication time is aggregated with the disk write time
We also have raft.rpc.appendEntries.storeLogs which more directly measures disk IO timing and since Consul 1.11 we have raft.boltdb.storeLogs which I think comes from the BoltDB store directly too.

Just adding this for completeness if others come across this.

Overall I like the suggestions here though I have a few questions:

raft.replicationTime: measure the time it take to replicate the log to enough nodes

I'm not quite clear how this is different to the existing commitTime? Is it not including the write to disk on the leader? but that does still include the time taken to write to disk on the followers so it's maybe a little hard to explain the value of that 🤔 . We already have timing for the appendEntries RPC on the followers which is useful for debugging follower slowness and general timing around remote writes. I guess this is a bit different in that it includes time spent queuing on the leader waiting for replication and it would show actual timing it took to get a quorum of responses etc. Not against it as such just wondering how useful and how we could document it so that the semantics are useful to operators?

raft.applyTime: measure the time from a log being dispatched to after it's applied to the FSM

Names are hard... but it seems confusing to have two timers raft.applyTime and raft.fsm.apply 😄 I get that this is a timer for the total time spent blocked in the raft.Apply call by the caller and sits alongside the raft.apply counter... do you think raft.applyTotalTime is any better?

dhiaayachi · 2022-08-08T18:52:56Z

I'm not quite clear how this is different to the existing commitTime? Is it not including the write to disk on the leader? but that does still include the time taken to write to disk on the followers so it's maybe a little hard to explain the value of that 🤔 . We already have timing for the appendEntries RPC on the followers which is useful for debugging follower slowness and general timing around remote writes. I guess this is a bit different in that it includes time spent queuing on the leader waiting for replication and it would show actual timing it took to get a quorum of responses etc. Not against it as such just wondering how useful and how we could document it so that the semantics are useful to operators?

The perspective I had when writing this is to be able to pin point the source of slowness in raft. By adding raft.replicationTime it will help steer the debugging to one of the followers, then we can review the specific metrics for each followers to have a better understanding. Otherwise if this metric is fine that mean that we need to focus on the leader. I would explain it from the leader perspective. Basically, this is the time it take to get a log replicated in enough followers before getting committed.

Names are hard... but it seems confusing to have two timers raft.applyTime and raft.fsm.apply 😄 I get that this is a timer for the total time spent blocked in the raft.Apply call by the caller and sits alongside the raft.apply counter... do you think raft.applyTotalTime is any better?

yes I agree, raft.applyTotalTime is definitely a better name for this. Naming things is hard ...and I'm not good in it 😆

banks · 2022-08-15T10:06:40Z

Thanks Dhia. Happy to approve with that changed name.

Happy to take a look at the follow up PR for Consul's telemetry page where we explain the new metrics too!

ncabatoff added the enhancement label Feb 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Not able to measure the time it take for log from being dispatched to being committed and applied to the FSM #510

Not able to measure the time it take for log from being dispatched to being committed and applied to the FSM #510

dhiaayachi commented Jun 14, 2022

banks commented Jun 14, 2022

dhiaayachi commented Aug 8, 2022

banks commented Aug 15, 2022

Not able to measure the time it take for log from being dispatched to being committed and applied to the FSM #510

Not able to measure the time it take for log from being dispatched to being committed and applied to the FSM #510

Comments

dhiaayachi commented Jun 14, 2022

banks commented Jun 14, 2022

dhiaayachi commented Aug 8, 2022

banks commented Aug 15, 2022