Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rush] When using cobuilds, no-op operations should not be treated as uncacheable #4660

Merged

Conversation

aramissennyeydd
Copy link
Contributor

Summary

This PR does 2 things,

  1. it adds logging for the cobuild build plan. It's currently pretty difficult to debug why cobuilds aren't performing as expected and some visibility into the clustering logic would be a good step in that direction.
  2. it fixes a bug where operations that are no-ops, like ignoreMissingScript or missingScriptBehavior: silent are treated as disabling the build cache causing all operations that depend on that operation to cluster. This causes some pretty extreme slow down for projects that use central dependencies that don't have build/test steps.

Details

No-op operations now no longer play into build cache cluster calculations. I'm also adding some visibility to the output for cobuilds so users can understand when this is happening. This solves the initial issue I was having, but additional operations may still have problems. While this is a bug fix, there may be projects that are incorrectly set up to build with cobuilds. This change may expose that and cause additional work by repo owners. However, I expect that they thought they were cobuilding all along. The bug fix should improve performance, the extra logging may decrease improvement and may be best to set behind verbose logging.

How it was tested

Tested a few different ways,

  1. In the cobuild sandbox repo, using rm -rf common/temp/build-cache && RUSH_COBUILD_CONTEXT_ID=foo REDIS_PASS=redis123 RUSH_COBUILD_RUNNER_ID=runner1 node ../../lib/runRush.js cobuild -p 10 and checking the output plan.
  2. In [rush] add support for sharding phases #4652's sharded-repo cobuild sandbox, simulating 2 runners and viewing the output. There was significantly less resource contention as the number of clusters went from 7 to 227.
  3. In our internal repo, where number of clusters went from 3 to 127.
  4. I also verified that adding disableBuildCacheForProject: true to rush-project.json caused the expected drop in clusters, adding it to the e project in the sharded-repo project dropped the number of clusters from 227 to 127 as expected.

@aramissennyeydd
Copy link
Contributor Author

Example output with --debug:

Build Plan Depth (deepest dependency tree): 5
Build Plan Width (maximum parallelism): 3
Number of Nodes per Depth: 1, 1, 2, 2, 3
Plan @ Depth 4 has 3 nodes and 0 dependents:
- f (build)
- g (build)
- e (build)
Plan @ Depth 3 has 2 nodes and 3 dependents:
- f (pre-build)
- g (pre-build)
Plan @ Depth 2 has 2 nodes and 5 dependents:
- d (build)
- a (build)
Plan @ Depth 1 has 1 nodes and 7 dependents:
- c (build)
Plan @ Depth 0 has 1 nodes and 8 dependents:
- b (build)
##################################################
        a (build): (4)
        b (build): (3)
        c (build): -(6)
 f (pre-build): -(9)
 g (pre-build): -(10)
        d (build): --(8)
         f (build): --(9)
        g (build): --(10)
        e (build): ---(12)
##################################################
Cluster 0:
- Dependencies: none
- Clustered by: 
  - none
- Operations: a (pre-build) [SKIPPED]
--------------------------------------------------
Cluster 1:
- Dependencies: none
- Clustered by: 
  - none
- Operations: b (pre-build) [SKIPPED]
--------------------------------------------------
Cluster 2:
- Dependencies: b (_phase:build)
- Clustered by: 
  - none
- Operations: c (pre-build) [SKIPPED]
--------------------------------------------------
Cluster 3:
- Dependencies: b (_phase:pre-build)
- Clustered by: 
  - none
- Operations: b (build)
--------------------------------------------------
Cluster 4:
- Dependencies: a (_phase:pre-build)
- Clustered by: 
  - none
- Operations: a (build)
--------------------------------------------------
Cluster 5:
- Dependencies: b (_phase:build), c (_phase:build)
- Clustered by: 
  - none
- Operations: d (pre-build) [SKIPPED]
--------------------------------------------------
Cluster 6:
- Dependencies: c (_phase:pre-build), b (_phase:build)
- Clustered by: 
  - none
- Operations: c (build)
--------------------------------------------------
Cluster 7:
- Dependencies: b (_phase:build), d (_phase:build)
- Clustered by: 
  - none
- Operations: e (pre-build) [SKIPPED]
--------------------------------------------------
Cluster 8:
- Dependencies: d (_phase:pre-build), b (_phase:build), c (_phase:build)
- Clustered by: 
  - none
- Operations: d (build)
--------------------------------------------------
Cluster 9:
- Dependencies: b (_phase:build)
- Clustered by: 
  - (f (pre-build)) "Caching has been disabled for this project."
- Operations: f (pre-build), f (build)
--------------------------------------------------
Cluster 10:
- Dependencies: b (_phase:build)
- Clustered by: 
  - (g (pre-build)) "Project does not have a rush-project.json configuration file, or one provided by a rig, so it does not support caching."
- Operations: g (pre-build), g (build)
--------------------------------------------------
Cluster 11:
- Dependencies: a (_phase:build)
- Clustered by: 
  - none
- Operations: h (pre-build) [SKIPPED]
--------------------------------------------------
Cluster 12:
- Dependencies: e (_phase:pre-build), b (_phase:build), d (_phase:build)
- Clustered by: 
  - none
- Operations: e (build)
--------------------------------------------------
Cluster 13:
- Dependencies: h (_phase:pre-build), a (_phase:build)
- Clustered by: 
  - none
- Operations: h (build) [SKIPPED]

Copy link
Contributor

@dmichon-msft dmichon-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mechanically this seems fine, though it will result in the Rush build cache engine doing considerably more work for noops than it used to, unless we still have logic to skip the actual cache reads/writes for noops.

@aramissennyeydd
Copy link
Contributor Author

@dmichon-msft I think no-ops should still be skipped bc the null operation runner has cacheable = false and https://github.com/aramissennyeydd/rushstack/blob/da87eea7b88dc81e28bafb5abecd0e375986dbd6/libraries/rush-lib/src/logic/operations/CacheableOperationPlugin.ts#L243-L245 should exit early.

@aramissennyeydd aramissennyeydd force-pushed the fix-caching-behavior-undefined branch 2 times, most recently from b9a70cf to 3ecc6cc Compare May 10, 2024 14:00
@octogonz
Copy link
Collaborator

@iclanton @dmichon-msft Are we ready to merge this? This PR has been open for nearly a month.

@octogonz octogonz changed the title fix(cobuilds): no-op operations should not be treated as uncacheable [rush] When using cobuilds, no-op operations should not be treated as uncacheable May 14, 2024
aramissennyeydd and others added 5 commits May 24, 2024 20:56
Co-authored-by: Ian Clanton-Thuon <iclanton@users.noreply.github.com>
…est.ts

Co-authored-by: Ian Clanton-Thuon <iclanton@users.noreply.github.com>
@iclanton iclanton force-pushed the fix-caching-behavior-undefined branch from 7f2f698 to 399bd21 Compare May 25, 2024 02:15
@iclanton iclanton force-pushed the fix-caching-behavior-undefined branch from 399bd21 to ffb6427 Compare May 25, 2024 02:16
@iclanton iclanton enabled auto-merge May 25, 2024 02:18
@iclanton iclanton merged commit 4e2688e into microsoft:main May 25, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Closed
Development

Successfully merging this pull request may close these issues.

None yet

5 participants