Later benchmarks are notoriously slower, possibly due to CPU throttling #954
I don't see this behavior (Linux), e.g. 4 different bcrypt benches:
which platform are you on?
Non-linear execution is planned, but I'm not sure when I'll get round to it as it's a pretty large refactoring.
Hi
Interesting. Typical output for me:
Linux, PHP 7.4.
Cool. Although this only solves relative differences within the same run, not differences between runs.
Already using it. Perhaps I need to warm up more aggressively.
This is interesting. At least on Linux, CPU frequency scaling can perhaps be controlled: https://askubuntu.com/questions/523640/how-i-can-disable-cpu-frequency-scaling-and-set-the-system-to-performance For me, I also note that my CPUs are running at different frequencies.
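As a quick sanity check before benchmarking, you can inspect what governor each CPU is currently using. A minimal sketch (Python here purely for illustration; it assumes the Linux sysfs cpufreq interface, which not every kernel or driver exposes):

```python
# Sketch: report each CPU's frequency-scaling governor on Linux.
# Assumes the sysfs cpufreq interface; paths vary by kernel/driver,
# and the function simply returns an empty dict where it is absent.
import glob

def read_governors():
    """Return {sysfs_path: governor_name} for every CPU exposing cpufreq."""
    governors = {}
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor"
    for path in glob.glob(pattern):
        with open(path) as f:
            governors[path] = f.read().strip()
    return governors

if __name__ == "__main__":
    govs = read_governors()
    if not govs:
        print("cpufreq interface not available on this system")
    for path, gov in sorted(govs.items()):
        print(path, "->", gov)
```

If the governor is "powersave" or "ondemand", switching it to "performance" (as the linked askubuntu answer describes) should reduce frequency drift between iterations.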
Problem
I have a test case like this:
When running all of them, the later (copied) version is noticeably slower than the earlier version. E.g. A2 will be slower than A1, and B2 will be slower than B1. I saw differences higher than 10%, and this was reproducible.
I assume this is because the processor gets heated and will throttle when it hits the later benchmark cases.
This can lead to misleading results when comparing the performance of different algorithms.
Solution 1: Sleep
We can use the --sleep parameter or the @Sleep annotation, hoping that the processor will cool down. I tried with --revs=1 --iterations=50 --sleep=10000; it seems to improve stability sometimes, but the problem does not fully go away. Perhaps I need to use higher sleep values?
One problem with this is that a user won't see whether the sleep value was "high enough", because usually there are no 1:1 copied benchmark methods to compare against.
Solution 2: Mix it up
In the past, I created a custom benchmark tool where I would break the strict ordering of benchmark cases.
E.g. I would run A, B, A, B, A, B instead of A, A, A, B, B, B.
While this makes a more "fair" comparison of A vs B in the current run, it does not help with differences to earlier runs, where the CPU might have been less heated.
Solution 3: Pre-stress the processor
Heat up the processor so that all the benchmark cases suffer from the same throttling.
I don't know if a processor will reach a fixed throttling level, or if it will throttle more and more.
I assume it depends on many factors.
Solution 4: Measure current raw CPU speed?
A reference operation could be used to measure the current CPU speed.
A report could show benchmark times divided by the duration for the reference operation.
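A minimal sketch of this normalization idea (Python for illustration; the busy-loop reference operation and all names here are assumptions, not anything PHPBench provides):

```python
# Sketch: express a benchmark time in multiples of a reference
# operation timed in the same run, so scores from runs taken at
# different CPU speeds become roughly comparable.
import time

def reference_duration(loops=100_000):
    """Time a fixed busy-loop as a crude proxy for current raw CPU speed."""
    start = time.perf_counter()
    acc = 0
    for i in range(loops):
        acc += i
    return time.perf_counter() - start

def normalized(raw_seconds, ref_seconds):
    """Benchmark time expressed in units of the reference operation."""
    return raw_seconds / ref_seconds
```

A report would then divide each measured benchmark time by the reference duration taken closest to it, rather than showing absolute seconds.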
Problems with this: