Increase plugin performance #2

Open
acouvreur opened this issue Jan 3, 2022 · 13 comments

@acouvreur
Owner

Benchmark using https://github.com/codesenberg/bombardier

Traefik access without WAF

bombardier http://localhost:8000/no-waf
Bombarding http://localhost:8000/no-waf for 10s using 125 connection(s)
[===============================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec     14430.33    1895.67   17523.04
  Latency        8.66ms     6.73ms   183.34ms
  HTTP codes:
    1xx - 0, 2xx - 144272, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:     7.25MB/s

Traefik access with WAF

bombardier http://localhost:8000/website2
Bombarding http://localhost:8000/website2 for 10s using 125 connection(s)
[===============================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec       748.37     276.34    2325.02
  Latency      166.17ms   140.42ms      1.11s
  HTTP codes:
    1xx - 0, 2xx - 7565, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   386.44KB/s

The WAF makes the throughput almost 20x lower.

  • Is it from the Go plugin?
  • Is it from the extra request?
  • Is it from the WAF engine?

Or all of them combined?

This is the test for direct access through the WAF, without Traefik:

bombardier http://localhost:8001/direct-waf
Bombarding http://localhost:8001/direct-waf for 10s using 125 connection(s)
[===============================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec      1156.55     319.56    2558.76
  Latency      107.97ms   101.67ms      0.87s
  HTTP codes:
    1xx - 0, 2xx - 11632, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   634.93KB/s

So the plugin might have room for improvement.

@acouvreur added the enhancement and help wanted labels Jan 3, 2022
@acouvreur self-assigned this Jan 3, 2022
@ftoppi

ftoppi commented Jan 22, 2022

Hello @acouvreur , did you try with containous/whoami or nginx (with hardcoded return 200) as a WAF backend?

How does Go handle connections towards the WAF? Does it reuse them or establish a new one for each request?

@acouvreur
Owner Author

Hello @acouvreur , did you try with containous/whoami or nginx (with hardcoded return 200) as a WAF backend?

I used it with containous/whoami as the WAF backend.

How does Go handle connections towards the WAF? Does it reuse them or establish a new one for each request?

It actually creates a new request that it forwards to the WAF. Then, depending on the response, the plugin sends the initial request on to the subsequent middlewares / service.
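
For illustration, here is a minimal sketch of that flow as a plain net/http middleware. The type, field names and URL handling are hypothetical and only mirror the behaviour described above; this is not the plugin's actual code:

package waf

import (
	"bytes"
	"io"
	"net/http"
)

// Hypothetical middleware type mirroring the flow described above.
type wafMiddleware struct {
	next   http.Handler
	client *http.Client // shared client, so connections can be pooled
	wafURL string       // e.g. "http://waf:8080"
}

func (m *wafMiddleware) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
	// Buffer the body so it can be sent to the WAF and still be replayed downstream.
	body, err := io.ReadAll(req.Body)
	if err != nil {
		http.Error(rw, "", http.StatusBadGateway)
		return
	}
	req.Body = io.NopCloser(bytes.NewReader(body))

	// Build a separate request carrying a copy of the original body to the WAF.
	proxyReq, err := http.NewRequest(req.Method, m.wafURL+req.RequestURI, bytes.NewReader(body))
	if err != nil {
		http.Error(rw, "", http.StatusBadGateway)
		return
	}
	proxyReq.Header = req.Header.Clone()

	resp, err := m.client.Do(proxyReq)
	if err != nil {
		http.Error(rw, "", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// If the WAF rejects the request, relay its verdict and stop here.
	if resp.StatusCode >= 400 {
		rw.WriteHeader(resp.StatusCode)
		io.Copy(rw, resp.Body)
		return
	}

	// Otherwise hand the original request to the next middleware / service.
	m.next.ServeHTTP(rw, req)
}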

How can I reuse the request?

@ftoppi

ftoppi commented Jan 22, 2022

Unfortunately I don't know Go at all 😅 I just saw many SYN/FIN exchanges between the Traefik and WAF containers.

I found this article stating that the response body should be fully read before being closed in order to reuse the connection; not sure if it's accurate or feasible: https://golang.cafe/blog/how-to-reuse-http-connections-in-go.html

@zoltan-fedor

zoltan-fedor commented Feb 16, 2022

Hi @acouvreur
I too am very interested in a WAF middleware for Traefik, so I was happy to see this repo. It is looking very promising, great job!

Unfortunately I am a Python dev, not a Go one (Go is on my list of things to learn this year), so I am not quite sure what causes the additional ~60ms of average latency when calling ModSecurity through the Traefik plugin versus calling it directly.

Even without Go knowledge, my first guess would be HTTP (TCP) connection reuse, or rather the lack of it.

I saw this line, where you are deferring the closing of the body, but reading this article the author recommends fully reading and closing the body before issuing a new HTTP request (rather than deferring it):

    // 1st request
    req, err := http.NewRequestWithContext(traceCtx, http.MethodGet, "http://example.com", nil)
    if err != nil {
        log.Fatal(err)
    }
    res, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatal(err)
    }
    if _, err := io.Copy(ioutil.Discard, res.Body); err != nil {
        log.Fatal(err)
    }
    res.Body.Close()  // <= NOTE THIS, NO USE OF "defer"
    // 2nd request
    req, err = http.NewRequestWithContext(traceCtx, http.MethodGet, "http://example.com", nil)
    if err != nil {
        log.Fatal(err)
    }

Would you mind testing whether that helps?

That said, it seems other middlewares also use defer with Body.Close(), though I am not sure whether those were tested for performance:
https://github.com/thomseddon/traefik-forward-auth/blob/4ffb6593d569801cf7c4571542c0ff03afcb1a0f/internal/provider/google.go#L89

@sh4sh

sh4sh commented May 4, 2022

+1 excited about a WAF middleware for Traefik! Thanks for your work @acouvreur, this is awesome.

My understanding is that defer resp.Body.Close() will only close the body when the function exits, so it would impact connection reuse if more requests are made before the function returns. Is it possible that the connection is not being released before a.next.ServeHTTP(rw, req) starts a new request?

defer resp.Body.Close()
if resp.StatusCode >= 400 {
	forwardResponse(resp, rw)
	return
}
a.next.ServeHTTP(rw, req)
}

@acouvreur
Owner Author

acouvreur commented May 5, 2022

Thanks all for your interest. I just tested it without the defer, and it seems it doesn't affect the throughput much.

Before

Bombarding http://localhost:8000/website for 10s using 125 connection(s)
[============================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec       904.17     330.29    3101.30
  Latency      137.67ms   122.62ms      1.14s
  HTTP codes:
    1xx - 0, 2xx - 9120, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   466.71KB/s

After

- defer resp.Body.Close() 
+ resp.Body.Close() 
Bombarding http://localhost:8000/website for 10s using 125 connection(s)
[============================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec       929.24     321.40    2767.64
  Latency      134.33ms   120.04ms      0.94s
  HTTP codes:
    1xx - 0, 2xx - 9370, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   477.05KB/s

@psavva

psavva commented May 7, 2022

I'm also very interested in using this plugin, so I'm following along and hopefully will be able to contribute when time allows.

@Enrico204
Contributor

Enrico204 commented Aug 4, 2022

Does Traefik allow CGo? In that case the middleware could load and use the native mod-security module, eliminating the HTTP round-trip (and possibly a large number of issues, like buffering). I have no time to dig into this at the moment; I hope to in the future :-)

EDIT: Traefik uses Yaegi, and unfortunately Yaegi does not support CGo :-(

@Enrico204
Contributor

I made some modifications, just to be sure that Go is reusing HTTP client connections:

diff --git a/modsecurity.go b/modsecurity.go
index 42b24b2..59fb335 100644
--- a/modsecurity.go
+++ b/modsecurity.go
@@ -94,12 +94,14 @@ func (a *Modsecurity) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
                http.Error(rw, "", http.StatusBadGateway)
                return
        }
-       defer resp.Body.Close()
 
        if resp.StatusCode >= 400 {
                forwardResponse(resp, rw)
+               resp.Body.Close()
                return
        }
+       io.Copy(ioutil.Discard, resp.Body)
+       resp.Body.Close()
 
        a.next.ServeHTTP(rw, req)
 }

I have these values now:

Website with WAF:

$ bombardier http://localhost:8000/website
Bombarding http://localhost:8000/website for 10s using 125 connection(s)
[==========================================================================================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec      1403.55     324.75    3202.44
  Latency       88.94ms   100.71ms      1.13s
  HTTP codes:
    1xx - 0, 2xx - 14110, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   720.81KB/s

Website without WAF:

$ bombardier http://localhost:8000/no-waf
Bombarding http://localhost:8000/no-waf for 10s using 125 connection(s)
[==========================================================================================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec      7771.16    2846.54   13136.86
  Latency       16.10ms    18.20ms   324.91ms
  HTTP codes:
    1xx - 0, 2xx - 77655, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:     3.90MB/s

Benchmarking the WAF directly:

$ bombardier http://localhost:8001
Bombarding http://localhost:8001 for 10s using 125 connection(s)
[==========================================================================================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec      2094.39     365.61    3259.46
  Latency       59.67ms    67.18ms   819.28ms
  HTTP codes:
    1xx - 0, 2xx - 20978, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:     1.08MB/s

So I was able to lower the "ratio of slowness" from 12.2x (see my previous values below) to ~5.5x. I also tried to avoid DNS lookups for the WAF endpoint, but it didn't help; the remaining ~13ms (in my case) comes from the Go HTTP client itself.
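
For what it's worth, one related knob (not something tested in this thread, so purely an assumption): Go's default Transport only keeps 2 idle connections per host, so under bombardier's 125 concurrent connections a dedicated client with a larger idle pool might reuse connections more often. A minimal sketch with illustrative numbers:

package waf

import (
	"net/http"
	"time"
)

// newWAFClient returns an http.Client whose Transport keeps more idle
// connections per host than the default of 2, so concurrent requests to
// the WAF are less likely to open brand-new TCP connections.
// The values are illustrative, not taken from the plugin.
func newWAFClient() *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			MaxIdleConns:        128,
			MaxIdleConnsPerHost: 128,
			IdleConnTimeout:     90 * time.Second,
		},
	}
}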

Also, I tried switching to fasthttp, but Yaegi (the interpreter Traefik uses for plugins) does not support some of the code used in that library :-(


Previous values for Website + WAF:

$ bombardier http://localhost:8000/website
Bombarding http://localhost:8000/website for 10s using 125 connection(s)
[==========================================================================================================================================================================] 10s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec       641.51     492.43    2103.32
  Latency      196.44ms   276.24ms      1.31s
  HTTP codes:
    1xx - 0, 2xx - 6528, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   323.35KB/s

@zoltan-fedor

That is some great progress in performance!!!

I am glad to see that making Go reuse the HTTP connections (switching defer resp.Body.Close() to resp.Body.Close()), as was suspected earlier, has in fact helped.

@Enrico204
Contributor

That is some great progress in performance!!!

I am glad to see that making Go reuse the HTTP connections (switching defer resp.Body.Close() to resp.Body.Close()), as was suspected earlier, has in fact helped.

Yup, it has helped a lot. That line, plus io.Copy(ioutil.Discard, resp.Body), is what switches the http.Client to keep-alive mode :-)
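
In case it's useful for verifying this, a small hypothetical helper based on net/http/httptrace can report whether a given request actually picked up a kept-alive connection (the function name and output are illustrative):

package waf

import (
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"net/http/httptrace"
)

// checkReuse performs one GET against url and prints whether the client
// reused an already-established (kept-alive) connection for it.
func checkReuse(client *http.Client, url string) error {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Printf("connection reused: %v (was idle: %v)\n", info.Reused, info.WasIdle)
		},
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	// Drain and close so this connection itself goes back to the idle pool.
	io.Copy(ioutil.Discard, resp.Body)
	return resp.Body.Close()
}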

@connordwoods

Hi @acouvreur and @Enrico204

I am interested in using this middleware, but I am not yet a Go developer (learning now). It is good to see progress being made regarding performance.

I see that @Enrico204 you tried to use fasthttp. I was going to suggest this as well, but did not know it was unsupported. I did some digging, and it seems the Traefik team realizes there is demand for fasthttp support and is considering replacing net/http with fasthttp in the next major version. This could mean a significant performance increase is on the horizon.

@Enrico204
Contributor

I see that @Enrico204 you tried to use fasthttp. I was going to suggest this as well, but did not know it was unsupported. I did some digging, and it seems the Traefik team realizes there is demand for fasthttp support and is considering replacing net/http with fasthttp in the next major version. This could mean a significant performance increase is on the horizon.

Unfortunately the problem is that fasthttp is not compatible with the plugin system (Yaegi), so inside plugins we'll be limited to net/http. Plugins and Traefik don't share the same runtime; plugins are executed in a sort of interpreter.
