You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
CreateProcess on Windows is slow at the best of times, and Chrome developers often find themselves not in the best of times. Some process creation/destruction issues that have been dealt with over the last few years in the Chrome build include:
Chrome builds hit UserCrit lock contention during process destruction causing build slowdowns and significant mouse stuttering during goma builds due to a Windows regression. Fixed in Windows, fix eventually propagated to all relevant Windows 10 builds (https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/)
Chrome builds hit UserCrit lock contention during process destruction causing build slowdowns and significant mouse stuttering during goma builds due to gomacc.exe indirectly pulling in gdi32.dll. Fixed in gomacc by avoiding gdi32.dll.
Anti-malware software caused an ~11 ms regression in CreateProcess making it impossible for a full Chrome build to be faster than ~ten minutes, and slowing all builds. Fixed by disabling this anti-malware software.
At least two variations on problem number 3 including other anti-malware software that is either not correctly disabled or still has overhead (MsMpEng.exe) even when disabled.
Other process creation/destruction issues that didn't affect Chrome builds include:
Chromium's large test binaries hit an O(n^2) CFG performance bug hit in Windows which caused unit_tests.exe to take approximately 5x as long as it should on Windows 10. Fixed by disabling CFG in test binaries (https://randomascii.wordpress.com/2019/04/21/on2-in-createprocess/)
clang-cl unit tests hit UserCrit lock contention during process destruction causing the tests to take 5x as long as they should on Windows 10 and causing significant mouse stuttering. Fixed by avoiding pulling in gdi32.dll.
Currently all of these issues are mitigated, with the exception of MsMpEng.exe causing non-zero overhead even when everything is in excluded folders. However there is ongoing effort needed to keep all Chrome developers' machines properly configured, and some loss of security because of the many exclusions needed to maintain decent build performance. There is also ongoing effort to detect and investigate these issues.
At some point it is no longer worth tilting at the CreateProcess windmill and instead avoid calling CreateProcess so frequently. ninja.exe could be taught how to use IPC to communicate with a local server process that would manage a pool of compile processes (goma and/or clang-cl) in order to avoid 99% of CreateProcess calls during a build.
Doing this would reduce the CPU cost of CreateProcess, would remove CreateProcess from the serialized critical path in ninja, would allow more security monitoring to be enabled with fewer exclusions, and would save us from the costs (investigation time and build time) of future regressions.
Alternately ninja could multi-thread CreateProcess but this is not as complete a solution.
Switching from CreateProcess to IPC in ninja is not clearly the right thing to do, but it is worth discussing. A prototype has been created so test results should be shared here.
The text was updated successfully, but these errors were encountered:
Not surprisingly, I strongly disagree with this. Ninja is a simple build system that assumes that the build environment is sane. There are other build systems with different designs that might assume hostile build environments, but ninja is not that system.
It sounds like ninja's current design help identify lots of bugs that were worth fixing. I consider this a feature.
My understanding is that a colleague has a prototype for ninja-with-IPC.
My main concern about closing this without more discussion is that it misses an opportunity to discuss the security tradeoffs. Most of the ways that we have mitigated the CreateProcess costs/serialization are by disabling various security checks. I'm not an expert on the value of those checks but it could be instructive to have those who are experts weigh in on the benefits of being able to enable them.
CreateProcess on Windows is slow at the best of times, and Chrome developers often find themselves not in the best of times. Some process creation/destruction issues that have been dealt with over the last few years in the Chrome build include:
Other process creation/destruction issues that didn't affect Chrome builds include:
Currently all of these issues are mitigated, with the exception of MsMpEng.exe causing non-zero overhead even when everything is in excluded folders. However there is ongoing effort needed to keep all Chrome developers' machines properly configured, and some loss of security because of the many exclusions needed to maintain decent build performance. There is also ongoing effort to detect and investigate these issues.
At some point it is no longer worth tilting at the CreateProcess windmill and instead avoid calling CreateProcess so frequently. ninja.exe could be taught how to use IPC to communicate with a local server process that would manage a pool of compile processes (goma and/or clang-cl) in order to avoid 99% of CreateProcess calls during a build.
Doing this would reduce the CPU cost of CreateProcess, would remove CreateProcess from the serialized critical path in ninja, would allow more security monitoring to be enabled with fewer exclusions, and would save us from the costs (investigation time and build time) of future regressions.
Alternately ninja could multi-thread CreateProcess but this is not as complete a solution.
Switching from CreateProcess to IPC in ninja is not clearly the right thing to do, but it is worth discussing. A prototype has been created so test results should be shared here.
The text was updated successfully, but these errors were encountered: