[WIP] Add cluster benchmark #315

Fullstop000 · 2019-11-01T04:41:48Z

Part of #109

This PR tries to add a benchmark for a real Raft node cluster. Nodes communicate through mpsc::channel, which adds some overhead.
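To get a rough feel for the channel overhead mentioned above, here is a minimal sketch that times round trips through a pair of `std::sync::mpsc` channels and an echo thread. It is not the code in this PR: `channel_round_trips` and the echo thread are hypothetical names used only for illustration, and the numbers it prints depend entirely on the machine.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Sends `rounds` payloads of `size` bytes to an echo thread and waits for
/// each one to come back. Returns the total wall-clock time.
/// Hypothetical helper, not part of raft-rs or of this PR.
fn channel_round_trips(rounds: usize, size: usize) -> Duration {
    let (to_peer, peer_rx) = mpsc::channel::<Vec<u8>>();
    let (to_origin, origin_rx) = mpsc::channel::<Vec<u8>>();

    // The "peer" simply echoes every message back until the sender is dropped.
    let peer = thread::spawn(move || {
        for msg in peer_rx {
            if to_origin.send(msg).is_err() {
                break;
            }
        }
    });

    let start = Instant::now();
    for _ in 0..rounds {
        to_peer.send(vec![0u8; size]).unwrap();
        let echoed = origin_rx.recv().unwrap();
        assert_eq!(echoed.len(), size);
    }
    let elapsed = start.elapsed();

    // Dropping the sender ends the peer's receive loop so it can be joined.
    drop(to_peer);
    peer.join().unwrap();
    elapsed
}

fn main() {
    let rounds = 1_000;
    let elapsed = channel_round_trips(rounds, 128);
    println!(
        "{} round trips: {:?} total, {:?} each",
        rounds,
        elapsed,
        elapsed / rounds as u32
    );
}
```

Comparing the per-round-trip time against the per-iteration time of the cluster benchmark gives a first hint of how much of the measurement is channel and thread wake-up cost.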

@Hoverbear @hicqu

Problems

- ~~Criterion complains about that benchmark time will be too long. Consider using raw benching.~~ Official #[bench] is an unstable feature.
- Reduce the overhead in the benchmark

Signed-off-by: Fullstop000 <fullstop1005@gmail.com>

Hoverbear · 2019-11-02T14:40:20Z

Sample output:

PS C:\Users\Hoverbear\Git\raft-rs> cargo bench Raft::cluster
   Compiling raft v0.6.0-alpha (C:\Users\Hoverbear\Git\raft-rs)
    Finished release [optimized] target(s) in 11.61s
     Running target\release\deps\raft-18f353cfeaa7c747.exe

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 50 filtered out

     Running target\release\deps\benches-fd626a4f421dffb5.exe
Gnuplot not found, disabling plotting
Benchmarking Raft::cluster/1: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 56.1s or reduce sample count to 10
Raft::cluster/1         time:   [32.257 ms 33.791 ms 35.351 ms]
                        thrpt:  [28.288   B/s 29.594   B/s 31.001   B/s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low severe
  1 (10.00%) high mild
Benchmarking Raft::cluster/32: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 57.9s or reduce sample count to 10
Raft::cluster/32        time:   [30.280 ms 31.254 ms 32.484 ms]
                        thrpt:  [985.09   B/s 1023.9   B/s 1.0320 KiB/s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low severe
  1 (10.00%) high severe
Benchmarking Raft::cluster/128: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 56.6s or reduce sample count to 10
Raft::cluster/128       time:   [30.699 ms 32.509 ms 34.393 ms]
                        thrpt:  [3.6344 KiB/s 3.8451 KiB/s 4.0718 KiB/s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low mild
  1 (10.00%) high severe
Benchmarking Raft::cluster/512: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 58.3s or reduce sample count to 10
Raft::cluster/512       time:   [30.405 ms 32.548 ms 34.221 ms]
                        thrpt:  [14.611 KiB/s 15.362 KiB/s 16.445 KiB/s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low severe
  1 (10.00%) low mild
Benchmarking Raft::cluster/1024: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 56.0s or reduce sample count to 10
Raft::cluster/1024      time:   [28.667 ms 29.879 ms 31.049 ms]
                        thrpt:  [32.207 KiB/s 33.468 KiB/s 34.883 KiB/s]
Benchmarking Raft::cluster/4096: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 56.3s or reduce sample count to 10
Raft::cluster/4096      time:   [29.035 ms 30.146 ms 31.429 ms]
                        thrpt:  [127.27 KiB/s 132.69 KiB/s 137.76 KiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
Benchmarking Raft::cluster/32768: Warming up for 500.00 ms
Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 56.7s or reduce sample count to 10
Raft::cluster/32768     time:   [28.394 ms 29.335 ms 30.127 ms]
                        thrpt:  [1.0373 MiB/s 1.0653 MiB/s 1.1006 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
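The throughput figures above follow directly from the median times, assuming the benchmark parameter (1, 32, ..., 32768) is the payload size in bytes, which the B/s units suggest. The sketch below redoes that arithmetic; `throughput_bps` is a hypothetical helper, and the sample values are copied from the output.

```rust
/// Throughput in bytes per second for `bytes` processed in `millis` milliseconds.
/// Hypothetical helper used only to re-derive the numbers in the output above.
fn throughput_bps(bytes: u64, millis: f64) -> f64 {
    bytes as f64 / (millis / 1000.0)
}

fn main() {
    // (payload size in bytes, median time in ms), copied from the sample output.
    let samples = [
        (1u64, 33.791),
        (32, 31.254),
        (128, 32.509),
        (512, 32.548),
        (1024, 29.879),
        (4096, 30.146),
        (32768, 29.335),
    ];
    for (bytes, ms) in samples.iter() {
        println!(
            "{:>6} B  {:>7.3} ms  {:>12.1} B/s",
            bytes,
            ms,
            throughput_bps(*bytes, *ms)
        );
    }
}
```

The median time stays around 30 ms across all payload sizes, so throughput grows almost linearly with the payload. That pattern is what one would expect when a fixed per-iteration cost dominates the measurement.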

Hoverbear · 2019-11-02T14:41:06Z

> Criterion complains about that benchmark time will be too long. Consider using raw benching.

Is this the error?

> Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 56.6s or reduce sample count to 10

Fullstop000 · 2019-11-02T16:11:24Z

Yes, that's the problem. For now I limit both the sample count and the measurement time to keep the benchmark's running time in an acceptable range. But the results don't look very good. Maybe the implementation adds too much overhead through channels and mutexes.
Do you have any suggestions? :) @Hoverbear

Hoverbear · 2019-11-04T18:32:13Z

Yeah, I think at this point we're mostly benchmarking channels and overheads of the benchmark itself. :(

That is kind of the realistic case though: most Raft clusters run over networks and thus have much more network overhead than CPU time.

I wonder if it might be more valuable to try to capture the time it takes for a node (the leader and probably separately a follower) to process a received message from a client or other node, and then respond.

This way we could perhaps avoid measuring the channel/mutex times?

Fullstop000 · 2019-11-18T08:38:56Z

I'm starting to believe that the channel-based cluster is hard to benchmark with Criterion.rs. After trying many different approaches based on it, it still can't handle long-running benchmarks very well for now (as described in the referenced issue) :(.

So I reconsidered the problem: the goal is to measure the speed of committing logs, which means that for each proposal we need to measure the duration between proposing and consuming the corresponding committed entry. Therefore, maybe we don't need a cluster with async-style communication, since we can simulate the steps described above in the leader node alone.
The whole process in the leader, from a proposal to a committed entry:

1. Step a proposal.
2. Send MsgAppends and get MsgAppendResps. I think this step can be simplified to a dynamic time cost that depends on the message size, but the rule might be hard to define.
3. Step all MsgAppendResps.
4. Consume a Ready.

Now we have a fully synchronous process in the leader node and can measure the duration easily without any overhead.
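The four steps above can be sketched as a single synchronous loop. This is only an illustration under stated assumptions: `MockLeader` and its methods are hypothetical stand-ins, not the raft-rs API, and the follower round trip (step 2) is mocked as an immediate acknowledgement from every follower.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a leader node. This is NOT the raft-rs API.
struct MockLeader {
    peers: usize,      // number of followers
    log: Vec<Vec<u8>>, // proposed entries
    acks: Vec<usize>,  // acks[i] = followers that acknowledged entry i
    committed: usize,  // number of committed entries
    applied: usize,    // number of entries already handed out
}

impl MockLeader {
    fn new(peers: usize) -> Self {
        MockLeader { peers, log: Vec::new(), acks: Vec::new(), committed: 0, applied: 0 }
    }

    /// Step 1: append a proposal to the log and return its index.
    fn propose(&mut self, data: Vec<u8>) -> usize {
        self.log.push(data);
        self.acks.push(0);
        self.log.len() - 1
    }

    /// Step 3: record one follower's acknowledgement, then commit entries in
    /// order once a majority of the cluster (leader included) holds them.
    fn step_append_resp(&mut self, index: usize) {
        self.acks[index] += 1;
        let cluster = self.peers + 1;
        while self.committed < self.log.len()
            && (self.acks[self.committed] + 1) * 2 > cluster
        {
            self.committed += 1;
        }
    }

    /// Step 4: hand out newly committed entries, loosely like consuming a Ready.
    fn take_committed(&mut self) -> Vec<Vec<u8>> {
        let out = self.log[self.applied..self.committed].to_vec();
        self.applied = self.committed;
        out
    }
}

/// Runs one proposal through all four steps and returns the number of
/// entries committed together with the elapsed time.
fn commit_one(leader: &mut MockLeader, payload: Vec<u8>) -> (usize, Duration) {
    let start = Instant::now();
    let index = leader.propose(payload);
    // Step 2 is mocked: every follower acknowledges immediately.
    for _ in 0..leader.peers {
        leader.step_append_resp(index);
    }
    let committed = leader.take_committed();
    (committed.len(), start.elapsed())
}

fn main() {
    let mut leader = MockLeader::new(2);
    for size in [1usize, 128, 4096].iter() {
        let (n, took) = commit_one(&mut leader, vec![0u8; *size]);
        println!("{} B payload: {} entry committed in {:?}", size, n, took);
    }
}
```

Because everything runs on one thread, the measured duration contains no channel or mutex cost, only the work done by the mocked steps.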

What do you think about this idea? @Hoverbear @hicqu

Hoverbear · 2019-11-18T20:46:25Z

@Fullstop000 So you'd mock the non-leader actions?

Fullstop000 · 2019-11-19T02:44:39Z

@Hoverbear Yes, though I still want a benchmark over the channel-based cluster (maybe create a specialized harness for this?).

And I realize that with the no-overhead approach, the whole routine is composed of several function calls and an RTT: Leader step(proposal) -> Send entry -> Follower step(MsgAppend) -> Send resp (nearly static time) -> Leader step(MsgAppendResp) -> Leader consume a Ready. The RTT part can be mocked easily, so it seems we can just benchmark the remaining steps individually.
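One way to combine individually benchmarked steps with a mocked RTT is a simple additive model, sketched below. Everything in it is an assumption for illustration: `RttModel`, `StepCosts`, the "base latency plus transfer time" rule, and the sample numbers are hypothetical, and only the request payload is charged for transfer time.

```rust
use std::time::Duration;

/// Hypothetical network model: a fixed base latency plus the time needed to
/// transfer the request payload at a given bandwidth.
struct RttModel {
    base: Duration,
    bytes_per_sec: u64,
}

impl RttModel {
    fn rtt(&self, msg_bytes: u64) -> Duration {
        // Integer arithmetic in nanoseconds; fine for the payload sizes
        // discussed here, but it would overflow for multi-gigabyte messages.
        let transfer_nanos = msg_bytes * 1_000_000_000 / self.bytes_per_sec;
        self.base + Duration::from_nanos(transfer_nanos)
    }
}

/// Per-step costs, as they might come out of separate micro-benchmarks.
struct StepCosts {
    leader_propose: Duration,
    follower_append: Duration,
    leader_append_resp: Duration,
    leader_ready: Duration,
}

/// Estimated proposal-to-commit latency: the sum of the step costs plus one
/// mocked round trip.
fn commit_latency(steps: &StepCosts, net: &RttModel, msg_bytes: u64) -> Duration {
    steps.leader_propose
        + steps.follower_append
        + steps.leader_append_resp
        + steps.leader_ready
        + net.rtt(msg_bytes)
}

fn main() {
    // Placeholder numbers, not measurements.
    let steps = StepCosts {
        leader_propose: Duration::from_micros(10),
        follower_append: Duration::from_micros(10),
        leader_append_resp: Duration::from_micros(10),
        leader_ready: Duration::from_micros(10),
    };
    let net = RttModel { base: Duration::from_millis(1), bytes_per_sec: 1_000_000 };
    for size in [1u64, 1024, 32768].iter() {
        println!("{} B: {:?}", size, commit_latency(&steps, &net, *size));
    }
}
```

The design choice here is that the network part stays a pure function of message size, so the step benchmarks can run without any sleeping or channel traffic.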

Hoverbear · 2019-11-22T22:03:19Z

Seems to make sense to me :)

add cluster benchmark

934ece5

Signed-off-by: Fullstop000 <fullstop1005@gmail.com>

Fullstop000 changed the title ~~Add cluster benchmark~~ [WIP] Add cluster benchmark Nov 7, 2019

Merge branch 'master' into cluster-benchmark

ce4717a

Hoverbear assigned Fullstop000 Nov 7, 2019

Hoverbear added Tooling/Testing (CI, benchmarking, and testing infrastructure.) and Work In Progress (A work in progress.) labels Nov 7, 2019

Hoverbear requested review from Hoverbear and hicqu November 7, 2019 18:46

Merge branch 'master' into cluster-benchmark

ddd1fae

Fullstop000 mentioned this pull request Nov 15, 2019

The suggested measurement time is somewhat too long bheisler/criterion.rs#351

Closed

jopemachine mentioned this pull request May 8, 2023

Write performance measurement tests lablup/rraft-py#16

Open

BusyJay removed the request for review from Hoverbear April 17, 2024 14:22
