
[WIP] plaintext benchmark #9

Open: jangko wants to merge 13 commits into base master

Conversation

@jangko (Contributor) commented Aug 22, 2018

  • finishing bench bot implementation
  • add participant: rust actix
  • add participant: go-lang fasthttp
  • add participant: c libreactor
  • polishing benchmark report
  • completing thread/nothread stuff
  • rewrite response section code
  • add nodocker script

@cheatfate (Collaborator)

I'm sorry @jangko, there is no reason to benchmark multi-threaded apps against single-threaded ones. Could you please limit the number of processes/threads used by each chosen framework?

@jangko (Contributor, Author) commented Aug 28, 2018

> Could you please limit the number of processes/threads used by each chosen framework?

Sure, I will try to make the benchmark as fair as possible for each participant.

@cheatfate (Collaborator) commented Aug 28, 2018

From my tests on a VM with only 2 processors available, mofuw is not that performant, and it also produces errors rather than successful responses:

Running 10s test @ http://127.0.0.1:34500
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.02ms    4.57ms  61.20ms   97.04%
    Req/Sec    17.78k     6.53k   32.87k    59.20%
  355475 requests in 10.10s, 44.75MB read
  Non-2xx or 3xx responses: 355475
Requests/sec:  35197.01
Transfer/sec:      4.43MB
./wrk http://127.0.0.1:34500  1.02s user 4.18s system 51% cpu 10.105 total
cheatfate@phantom ~/wrk (git)-[master] % ./wrk http://127.0.0.1:34500
Running 10s test @ http://127.0.0.1:34500
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.12ms   14.53ms 151.12ms   95.69%
    Req/Sec    17.17k     2.89k   20.92k    82.67%
  345100 requests in 10.10s, 43.44MB read
  Non-2xx or 3xx responses: 345100
Requests/sec:  34168.76
Transfer/sec:      4.30MB
./wrk http://127.0.0.1:34500  1.13s user 4.13s system 51% cpu 10.110 total
cheatfate@phantom ~/wrk (git)-[master] % ./wrk http://127.0.0.1:34500
Running 10s test @ http://127.0.0.1:34500
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.55ms   22.19ms 220.08ms   95.90%
    Req/Sec    16.85k     2.76k   31.98k    82.59%
  337011 requests in 10.10s, 42.42MB read
  Non-2xx or 3xx responses: 337011
Requests/sec:  33366.97
Transfer/sec:      4.20MB
./wrk http://127.0.0.1:34500  1.23s user 3.78s system 49% cpu 10.145 total

@cheatfate (Collaborator) commented Aug 28, 2018

On the same VM, the asyncdispatch2 benchmark produces this output:

cheatfate@phantom ~/wrk (git)-[master] % ./wrk http://127.0.0.1:8885
Running 10s test @ http://127.0.0.1:8885
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   208.75us  184.01us  10.27ms   98.49%
    Req/Sec    24.09k     3.90k   28.10k    77.23%
  484088 requests in 10.10s, 24.93MB read
Requests/sec:  47929.24
Transfer/sec:      2.47MB
./wrk http://127.0.0.1:8885  1.06s user 5.18s system 61% cpu 10.104 total
cheatfate@phantom ~/wrk (git)-[master] % ./wrk http://127.0.0.1:8885
Running 10s test @ http://127.0.0.1:8885
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   182.87us  107.18us   6.93ms   97.22%
    Req/Sec    26.52k     1.07k   29.06k    59.00%
  527788 requests in 10.01s, 27.18MB read
Requests/sec:  52746.72
Transfer/sec:      2.72MB
./wrk http://127.0.0.1:8885  1.48s user 5.84s system 73% cpu 10.009 total
cheatfate@phantom ~/wrk (git)-[master] % ./wrk http://127.0.0.1:8885
Running 10s test @ http://127.0.0.1:8885
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   218.96us  240.94us  10.37ms   98.65%
    Req/Sec    23.38k     4.37k   28.07k    70.79%
  469746 requests in 10.10s, 24.19MB read
Requests/sec:  46510.70
Transfer/sec:      2.40MB
./wrk http://127.0.0.1:8885  1.05s user 4.96s system 59% cpu 10.103 total

As you can see, there are no "Non-2xx or 3xx responses" lines here, so wrk received normal HTTP responses.

@jangko (Contributor, Author) commented Aug 29, 2018

mofuw needs the /plaintext URI to avoid Non-2xx or 3xx responses.

The performance difference between your benchmark and mine is because I ran with the pipeline switch turned on. With pipelining enabled in wrk, mofuw's performance will be higher than ad2's.

@cheatfate (Collaborator)

@jangko, from what I see, the asyncdispatch and asyncdispatch2 benchmarks do not support pipelined messages. So why are you testing it?

@jangko (Contributor, Author) commented Aug 29, 2018

Most of the performant TechEmpower benchmark participants are designed to handle pipelined messages. This benchmark, on the other hand, does not take that into account.
While testing those frameworks, I realized their performance can vary significantly with and without pipeline mode, so I think it is important to keep this information.
The final result of this benchmark will include both pipeline and no-pipeline modes for comparison, or pipelining will become a switchable bench-bot feature.
What we can do now is make them all run in single-threaded mode. Then we can decide what to do with pipelining.
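For context, a minimal sketch (in Go, with assumed names; this is not taken from any of the benchmark implementations) of what pipeline support means on the server side: a single socket read may carry several back-to-back requests, and the server must detect and answer each of them, not just the first:

```go
package main

import (
	"fmt"
	"strings"
)

// countPipelined counts the complete, body-less HTTP requests packed
// into one read buffer. A pipelining-aware server must loop over the
// buffer and answer every request it finds.
func countPipelined(buf string) int {
	// Each request without a body ends with an empty line (CRLF CRLF).
	return strings.Count(buf, "\r\n\r\n")
}

func main() {
	req := "GET /plaintext HTTP/1.1\r\nHost: bench\r\n\r\n"
	// A pipelining load generator writes several requests per send.
	fmt.Println(countPipelined(req + req + req)) // prints 3
}
```

A server that parses only one request per read would answer a third of the pipelined load in this example, which is why pipeline-aware frameworks score so differently.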

@cheatfate (Collaborator)

But you can adjust the benchmark source to support pipelining for both asyncdispatch and asyncdispatch2.

@jangko (Contributor, Author) commented Aug 29, 2018

> But you can adjust the benchmark source to support pipelining for both asyncdispatch and asyncdispatch2.

Agreed.

@jangko (Contributor, Author) commented Sep 3, 2018

ad2 is fast but then suffers a massive slowdown, hmm. Interesting.

.travis.yml Outdated
@@ -37,3 +37,5 @@ install:
script:
- nimble install -y
- nimble test
- nimble benchmark
Member

What's the purpose of running the benchmark on a CI setup? Neither Travis nor AppVeyor provides stable hardware, so the numbers will be meaningless, and benchmarks tend to take a while, so they will slow down every PR/build roundtrip.

@jangko (Contributor, Author) commented Sep 11, 2018

> and benchmarks tend to take a while so it will slow down every PR / build roundtrip..

That's right, it takes a significant amount of time. I have already removed it from CI.


Summary

  • mofuw: uses asyncdispatch, so its expected performance should not exceed that of asyncdispatch itself.
  • asyncdispatch: although slower than asyncdispatch2, it handles high concurrency quite well.
  • asyncdispatch2: at high concurrency it tends to slow down significantly, but surprisingly it is the only framework in this test that handles non-pipelined requests faster than the others, despite using almost identical request/response handling code to asyncdispatch.
  • actix-raw: very fast when multi-threaded, not so when single-threaded.
  • fasthttp: very fast when multi-threaded, not so when single-threaded.
  • libreactor: still very fast even in single-threaded mode.

Conclusion

  • asyncdispatch2 could be a good candidate to replace asyncdispatch.
  • It still has room for improvement, especially when handling high connection counts.

Sorry I cannot work faster because of some circumstances, but I think this one is ready for review.

@dm1try commented Sep 15, 2018

It looks like the asyncdispatch2 benchmark has a broken implementation, at least on macOS: it generates ~10x responses for the same request (the provided results show a similar correlation).

wrk goes crazy like this:

wrk -c 30 -d 15s -t 4 http://localhost:8080/
Running 15s test @ http://localhost:8080/
  4 threads and 30 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   204.63us  122.29us   3.45ms   66.65%
    Req/Sec   329.58k    22.97k  376.98k    63.91%
  19802436 requests in 15.10s, 2.56GB read
Requests/sec: 1311431.24
Transfer/sec:    173.84MB

The test programs are supposed to check for the 'USE_THREADS'
environment variable to decide whether threads should be used.
So far, this has been implemented in the Go and Rust benchmarks.

The bot will set this variable by default.

Other changes:

* The Rust implementation has been updated with the latest
  code from the TechEmpower benchmark and the incremental
  Docker build process has been greatly improved
@zah (Member) commented Sep 21, 2018

@jangko, I've pushed to your branch a commit adding a command-line option for deciding whether threads should be used. To support it, the test programs need a minor modification - they must check whether the environment variable USE_THREADS is set. You can see an example here:

4fa3b6e#diff-a700604e55a2b00d28959045bdda5b09R26

I've added this to the Rust and Go programs, but can we also add it to the rest of the examples?

The asyncdispatch2 program that you've prepared violates the rules of the competition, which are given here:
https://www.techempower.com/benchmarks/#section=code

In particular, this rule:

> This test is not intended to exercise the allocation of memory or instantiation of objects. Therefore it is acceptable but not required to re-use a single buffer for the response text (Hello, World). However, the response must be fully composed from the response text and response headers within the scope of each request and it is not acceptable to store the entire payload of the response, or an unnaturally large subset of the response, headers inclusive, as a pre-rendered buffer. "Buffer" here refers to a byte array, byte buffer, character array, character buffer, string, or string-like data structure. The spirit of the test is to require the construction of the HTTP response as is typically done by a framework or platform via concatenation of strings or similar. For example, pre-rendering a buffer with "HTTP/1.1 200 OK", "Content-length: 15", "Server: Example" would not be acceptable.

So, you must break up the strings being written as a response a bit. I think you can avoid some of the allocations and concatenations as well; @cheatfate may have hints on the most efficient way to build the response piece by piece.

bung87 pushed a commit to bung87/nim-chronos that referenced this pull request Nov 17, 2020
Add tests for status-im#9.
Temporary disable some tests in testaddress.nim.