IBD performance regression in 27.0rc1 on Windows #29785
Comments
Does it also happen on another type of operating system?
I haven't tested it with a different operating system. I can try WSL.
Yes, that'd be useful to check, so that it is easier to tell whether it is a Windows build system bug or caused by something else.
Can you also try a self-built Windows executable?
I forgot to mention that optimization was enabled in commit 41e378a. So for a fair comparison, one would have to backport 41e378a to 26.0. I presume that benchmarking IBD is expensive. So maybe comparing the micro-benchmarks can provide a hint at which code is slower?
It's not clear to me if there is still an issue here or not. Or is there still a regression when comparing our 26.x and 27.x Windows release binaries? If the only issue is in relation to self-compiled MSVC binaries, then this isn't a blocker for 27.x.
Speaking of MSVC builds, it's worth noting that they still don't have a hardware-accelerated SHA256 implementation (see #24773).
The original regression is also in variance, which doesn't appear with the MSVC binary, so I don't think it's necessary to test the backport.
The regression is in the pre-built 27.0rc1 binary for Windows (which isn't MSVC), see original post. Pre-built binary for Linux tested on WSL and self-built MSVC binary for Windows don't have the regression, see later posts.
I forgot to mention that benchmarks are disabled in guix. So something like
Ok. It would be good if someone else on Windows can confirm this. Can you also let us know if you're using any particular config options etc.
I can try and recreate this. @vostrnad do you have a script that records / plots the graphs? I'm also interested in how you are able to perform so many runs in such a short amount of time. Do you have access to a lot of compute?
Running IBD in a loop, will report back tomorrow.
Beyond the bare minimum (
I'm not doing full IBD, just to block 120,000. As much as I'd like to benchmark hundreds of full IBD runs, I don't have that kind of compute.
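For anyone wanting to reproduce this kind of partial-IBD loop, here is a rough sketch of how alternating runs could be timed. It is not the script used in this thread. The binary paths, data directory and peer address are hypothetical, and the flag set (`-datadir`, `-connect`, `-stopatheight`) is an assumption about a reasonable minimal setup, not the configuration reported here. Wall-clock timing like this also includes headers sync, unlike the measurements in the original post.

```python
import subprocess
import time
from typing import Callable, List


def build_command(binary: str, datadir: str, peer: str, stop_height: int) -> List[str]:
    """Build a bitcoind command line that syncs from one peer and exits at a height."""
    return [
        binary,
        f"-datadir={datadir}",
        f"-connect={peer}",              # sync only from this (local) node
        f"-stopatheight={stop_height}",  # shut down once this height is reached
    ]


def alternating_schedule(versions: List[str], runs_per_version: int) -> List[str]:
    """Interleave versions (A, B, A, B, ...) so drift over time affects both equally."""
    return [v for _ in range(runs_per_version) for v in versions]


def time_run(cmd: List[str], runner: Callable = subprocess.run) -> float:
    """Run one sync and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    runner(cmd, check=True)
    return time.monotonic() - start


# Hypothetical usage (paths and peer address are placeholders):
# binaries = {"26.0": r"C:\btc\26.0\bitcoind.exe", "27.0rc1": r"C:\btc\27.0rc1\bitcoind.exe"}
# for version in alternating_schedule(list(binaries), 100):
#     cmd = build_command(binaries[version], r"C:\btc\bench-datadir", "127.0.0.1", 120000)
#     print(version, time_run(cmd))
```

The data directory would need to be wiped (or restored from a clean copy) between runs so that every run starts from the same state.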
Does it also appear with
Or even with
I've set up a benchmark that performs a partial IBD and then runs
I was also going to block 120,000 but with public nodes. I re-read your initial post and see you are talking about syncing from a local node, which I assume means the x axis of these charts is in seconds, not minutes. I'll re-run my test connecting to a local node.
Ran the original benchmark again (pre-built binaries on Windows) with only these configuration options:
This time 27.0rc1 was about 5% slower on average than 26.0, still with a much higher variance. Measured around 250 runs of each.
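To make "slower on average" and "higher variance" comparable between people reporting results, the per-version run times can be summarized the same way each time. A minimal sketch follows. The sample values below are made-up placeholders, not measurements from this issue, and the function names are only illustrative.

```python
import statistics
from typing import List, Tuple


def summarize(samples: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of run times in seconds."""
    return statistics.mean(samples), statistics.stdev(samples)


def relative_slowdown(baseline_mean: float, candidate_mean: float) -> float:
    """Fraction by which the candidate is slower than the baseline (0.05 means 5%)."""
    return (candidate_mean - baseline_mean) / baseline_mean


# Placeholder data only; replace with real per-run timings.
runs_26 = [30.0, 31.0, 29.0]
runs_27 = [33.0, 30.0, 36.0]

mean_26, sd_26 = summarize(runs_26)
mean_27, sd_27 = summarize(runs_27)
print(f"26.0:    mean={mean_26:.2f}s sd={sd_26:.2f}s")
print(f"27.0rc1: mean={mean_27:.2f}s sd={sd_27:.2f}s")
print(f"slowdown: {relative_slowdown(mean_26, mean_27):.1%}")
```

Reporting the standard deviation next to the mean makes it easier to tell a shift in the average from a wider spread with a few slow outliers.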
I can partially replicate this regression using pre-built binaries on Windows 11. For me:
26.0 Mean: 35.21 seconds
27.0rc1 Mean: 40.77 seconds
Method
Machine: 14th Gen i9 processor, 96 GB RAM, 2 TB NVMe storage, Windows 11
I have a copy of the debug.log from all 200 runs, so I can make charts that show progress over time if we think that's helpful, or choose different start and end times.
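As a sketch of how saved logs could be turned into timings with a chosen start and end point: the snippet below measures the time between the first `UpdateTip` line and the first one at or above a target height. It assumes one run per debug.log, the default ISO 8601 UTC timestamp prefix, and that starting the clock at the first `UpdateTip` is an acceptable way to leave headers sync out. The function names are illustrative.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Matches e.g. "2024-01-01T00:00:00Z UpdateTip: new best=... height=123 ..."
UPDATE_TIP = re.compile(r"^(\S+Z)\s.*UpdateTip: .*\bheight=(\d+)\b")


def parse_ts(ts: str) -> datetime:
    """Parse a log timestamp such as 2024-01-01T00:00:00Z (optionally with microseconds)."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def seconds_to_height(lines: Iterable[str], target: int) -> Optional[float]:
    """Seconds from the first UpdateTip line to the first one with height >= target."""
    first = None
    for line in lines:
        match = UPDATE_TIP.match(line)
        if not match:
            continue  # not a tip update; ignore
        when = parse_ts(match.group(1))
        if first is None:
            first = when  # clock starts at the first connected block
        if int(match.group(2)) >= target:
            return (when - first).total_seconds()
    return None  # target height never reached in this log


# Hypothetical usage:
# with open("debug.log", encoding="utf-8") as f:
#     print(seconds_to_height(f, 120000))
```

Collecting the per-height timestamps instead of returning early would give the progress-over-time data for charts.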
Without knowing the cause, there is little that can be done. Also, instead of IBD, the benchmarks can be used, if they differ enough. Though, they'll need to be enabled:
@maflcko Just to clarify, what didn't regress was the release binary for Linux (which I assume is built with guix, not WSL) running in WSL.
All of the release binaries are built using Guix. Windows is produced in Guix using GCC+Mingw-w64. We don't produce any release binaries using WSL or MSVC.
I'm not quite seeing the same issue on my Windows machine, although IBD is generally a lot slower (50 runs each, sync to 120,000)
Considering that it is rather difficult to figure out what is wrong, and that it might not be happening for everyone, I think it would be okay for us to deal with this later and move forward with the 27.0 final release.
Did someone with Windows try to enable the benchmarks in guix and compile them, and run them?
I can do
Yes, I did that. I could not find any benchmark that had a significant difference.
Thanks for confirming. If the benchmarks cannot show a difference for someone who could reproduce, bisecting guix builds may be the only option left to debug this, but that will probably take some time.
I did get a bit of variation, but it's hard to know what's important.
While investigating the variance in IBD when synchronizing from network nodes and comparing it to synchronizing from a local node, I noticed a significant slowdown when I switched the synchronizing node to 27.0rc1. When connected to a local node (also 27.0rc1), it reaches block 120,000 about 10% slower on average than 26.0, with a much higher variance. I measured around 100 runs of each, alternating between the versions every time. Headers (pre-)sync is not included in the measurement. OS is Windows 10, I'm using pre-built binaries for both versions.
26.0:
27.0rc1:
EDIT: The issue seems to be only with the pre-built release binary for Windows. Tests with a pre-built Linux binary on WSL and a self-built MSVC binary don't show the regression (see later posts).