
performance: loxilb starts consuming 100% CPU only after a few seconds #499

Open
luisgerhorst opened this issue Jan 19, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@luisgerhorst
Contributor

Describe the bug

When I run one wrk2 worker against one nginx worker through loxilb, as configured by cicd/tcpsctpperf, at first only wrk2 and nginx consume CPU (about 70 and 80 percent respectively), but after a few seconds loxilb also starts consuming 100% CPU (in kernel mode). I have pinned wrk2/nginx/loxilb to separate cores.
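A minimal sketch of that kind of pinning (core IDs and the exact nginx invocation are placeholders, not necessarily the setup used here):

```bash
# Placeholder core IDs; each of the three processes gets its own CPU
# so that their usage shows up separately in htop.
taskset -c 0 ./loxilb &
taskset -c 1 nginx -g 'daemon off;' &   # nginx workers inherit the affinity
# ...and the wrk2 client is started under taskset -c 2 in the same way.
```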

Does anyone know why this happens?

I am not sure whether htop attributes the BPF program runtime to loxilb or whether only the user-space process is included.
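One way to separate the two, assuming a kernel that supports BPF runtime statistics (5.1 or newer): let the kernel account eBPF program run time itself and compare that against what htop shows for the loxilb process.

```bash
# Enable per-program run-time accounting in the kernel (adds a small overhead).
sudo sysctl -w kernel.bpf_stats_enabled=1

# Every loaded program now reports run_time_ns and run_cnt in this listing,
# independent of which user-space process htop charges for that time.
sudo bpftool prog show

# Turn the accounting off again after the measurement.
sudo sysctl -w kernel.bpf_stats_enabled=0
```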

Expected behavior

I simply find it odd that the CPU load seems to only pick up after a few seconds. Is this maybe some logging that should be disabled for performance tests?

luisgerhorst added the bug (Something isn't working) label Jan 19, 2024
@TrekkieCoder
Collaborator

Thanks for bringing this to our notice. It seems strange, but we will have a look and update soon.

@nik-netlox
Collaborator

nik-netlox commented Jan 22, 2024

Hi @luisgerhorst, we have tried to reproduce this with the latest loxilb Docker image, but in our test we couldn't find this issue; nginx and wrk seem to be taking only about 10% CPU. We used the validation-wrk script to test this. If you are using some other config/steps, please share them with us and we will try them as well. You may also join our Slack channel, where we will be able to assist you better.

@luisgerhorst
Contributor Author

luisgerhorst commented Jan 24, 2024

I'm sorry for the incomplete description. I have been running wrk2 at a much higher rate than the merged version of the script uses (12.5k RPS, roughly 80% of the maximum on my machine).
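For reference, the underlying wrk2 invocation ends up looking roughly like this (thread/connection counts and the target address are placeholders; only the -R rate corresponds to the 12.5k RPS above):

```bash
# Placeholder threads, connections and target; -R sets the constant request rate.
wrk -t1 -c100 -d60s -R12500 --latency http://127.0.0.1:8080/
```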

I run https://github.com/luisgerhorst/loxilb/blob/ccf029a1f6cf8b914b23909d4cc922a4c32662d0/cicd/tcpsctpperf/validation-wrk against loxilb v0.9 as OSE_PERF_STAT="perf stat" OSE_LOXILB_SERVERS=1 OSE_LATENCY_PAYLOAD_SIZE=1024 ./validation-wrk 2 60 $(pwd)/ 100 | tee v.log. Using parca I was able to record a CPU trace of the behaviour I observed:

[Screenshot from 2024-01-24 18-27-06]

The screenshot's x axis is divided into 6 segments. The dark purple line is loxilb, which is mostly idle in segments 3 and 4 and ramps up in segments 2, 5, and 6. The red/green lines in segments 3-6 are wrk2 and nginx (red/orange in 1-2). The light purple line is parca, which periodically processes the collected CPU samples.

Here's the CPU profile while loxilb is in its idle phase: https://pprof.me/2d2a1527503cfa10ae0a46890b2cb3a0

And here's the CPU profile while loxilb is in its busy phase: https://pprof.me/414cf26812ea9d7d3b973563bec491ed

Interestingly, loxilb seems to consume 100% here even though the benchmark is not at its limit (I can achieve 15.5k RPS using this same setup). Therefore, maybe this behaviour is not actually limiting performance (at least not directly).

@UltraInstinct14
Contributor

UltraInstinct14 commented Feb 1, 2024

Loxilb has a garbage collector which monitors its connection-tracking (CT) entries. If a connection goes through its normal life cycle - e.g. init, init-ack, est, fin - the eBPF module itself cleans up the CT entries. But for half-cooked connections, the garbage collector comes into play. Currently it is set to an aggressive GC schedule. One potential solution is to trigger GC only when there is space pressure in its CT map.
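As a rough, illustrative way to gauge that space pressure from user space (the conntrack map's exact name and id depend on the loxilb build, so this is only a sketch):

```bash
# Find the conntrack map's id and max_entries in the listing
# (the map's exact name depends on the loxilb build).
sudo bpftool map show

# Count the map's current entries and compare against its max_entries.
sudo bpftool -j map dump id <MAP_ID> | jq length
```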

UltraInstinct14 added a commit that referenced this issue Feb 2, 2024
PR : gh-499 relaxed garbage collector characteristics