Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use flatware to parallelize CI specs #27663

Closed
wants to merge 1 commit into from

Conversation

mjankowski
Copy link
Contributor

In addition to #27660 - here's another idea for maintaining some parallelization setup, while not paying the cost of all the setup times.

Opening this initially because I want to see if the CI setup/speedup is as easy as it was locally. Will review the actual numbers and post about that after it completes...

@mjankowski
Copy link
Contributor Author

It's challening to get good numbers here because it's so variable on each run. Some recent actions runs (all from after merging in the matrix reduction):

PR 3.0 3.1 3.2
27660 ~9m ~12m ~8:30
26559 ~14:30 11m ~9m
8214 9 12 8
Flatware PR 4 7 7

Those are all looking just at the narrow rspec run portion of the larger job. I think the discrepancy in the flatware times is that the ruby 3.0 job is using 4 cores and the others are using 2 each.

Another benefit here might be to cache the rspec status persistence file between runs -- if it's present flatware can use it to split tests between cores based on exected times (it defaults to just a numeric split w/out that file).

I think this probably makes sense to apply, but I can also wait until we have more runtime data post-matrix-removal and see what things actually look like.

Gemfile Outdated Show resolved Hide resolved
@mjankowski mjankowski added testing Automated lint and test suites performance Runtime performance labels Nov 7, 2023
Copy link
Contributor

github-actions bot commented Nov 9, 2023

This pull request has merge conflicts that must be resolved before it can be merged.

Copy link
Contributor

github-actions bot commented Nov 9, 2023

This pull request has resolved merge conflicts and is ready for review.

Copy link

codecov bot commented Dec 11, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (86500e3) to head (5125503).
Report is 258 commits behind head on main.

❗ Current head 5125503 differs from pull request most recent head 64e7116. Consider uploading reports for the commit 64e7116 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##             main   #27663       +/-   ##
===========================================
- Coverage   85.01%    0.00%   -85.02%     
===========================================
  Files        1059     1060        +1     
  Lines       28277    41653    +13376     
  Branches     4538        0     -4538     
===========================================
- Hits        24040        0    -24040     
- Misses       3074    41653    +38579     
+ Partials     1163        0     -1163     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mjankowski
Copy link
Contributor Author

Rebased again. These run times remain variable due to differences in underlying hardware/env, but recently the full runs of "Ruby Testing" task for successful runs is in the ~13-15 minute zone. This PRs most recent rebased run was just over ~8 mins.

The narrow rspec run portion of that (for each ruby version) tends to be in the 8-12 min zone (again, quite variable) on the non-parallel runs. This PRs most recent was ~4 mins. I suspect part of the variability is in how many cores happen to be allocated for a given runner ... so ~15 mins might be a worst-case on the non-parallel runs and just under ~4 mins might be expected on parallel runs when the runner gets 4 cores.

Maybe as importantly - none of these runs have shown any intermittent failures or other parallelism issues; so I think this is probably safe if we want a CI speedup.

Copy link
Contributor

github-actions bot commented Jan 9, 2024

This pull request has merge conflicts that must be resolved before it can be merged.

Copy link
Contributor

github-actions bot commented Jan 9, 2024

This pull request has resolved merge conflicts and is ready for review.

@mjankowski
Copy link
Contributor Author

With the sidekiq fake PR merged, this one is probably the next most impactful (in terms of CI performance).

Rebased this again to include the sidekiq improvements here.

When the flatware runners get 4 cores on CI, looks like the most recent runs were just over ~3 mins each for the full rspec run portion of the CI suite (vs 6-7 mins for other recent post-sidekiq-merge runs from other PRs).

@mjankowski mjankowski force-pushed the flatware-parallel-specs branch 4 times, most recently from cf9d2b8 to 1c400c6 Compare April 22, 2024 15:31
@mjankowski
Copy link
Contributor Author

There was a configuration interaction issue here briefly ... since simplecov is only loaded/run when the ruby matrix version matches ruby-version on CI, the other versions (currently 3.1 and 3.3) were failing/timing-out their CI runs. Pushed an update to only set up the flatware fork config when we want simplecov active. I think that will resolve it ... but will monitor and publish another status summary after this CI run.

@mjankowski
Copy link
Contributor Author

Looks like that last push resolved the CI issues, and we are now re-combining simplecov output locally correctly.

Looks like codecov is not combining though, and needs more configuration to do that. Will attempt that next.

@mjankowski mjankowski force-pushed the flatware-parallel-specs branch 4 times, most recently from e08e80a to 3ffbc03 Compare April 29, 2024 12:24
@mjankowski
Copy link
Contributor Author

The codecov integration is the last remaining portion of this to sort out.

The first few obvious seeming things did not work. Before I spend more time on that ... is the codecov integration useful? I rarely look at it, but when I do it seems confusing. Like right now its comparing recent open PRs to a commit from ~3 months ago? I don't know if the service only sort of works, or if we broke the integration somehow, or if I'm interpreting it wrong?

@briandunn
Copy link

@mjankowski simplecov support is a pretty new flatware feature. If that's a blocker here I'd love to help. Would be cool to see this merge.

@mjankowski
Copy link
Contributor Author

@briandunn thanks for the offer (and for flatware, it's great!). I think at this point...

  • The basic integration here was easy to do and is all set up and working correctly. We see expected speedup both locally and on CI.
  • The interaction between rspec/flatware/simplecov is set up according to your integration docs, and I believe working correctly - results from each parallel run seem to be combined and report the full coverage
  • The part of the CI process which uploads the coverage result data to the hosted codecov service is not fully updated yet ... because a) I'm not 100% sure on what needs to change to make it work, b) I'm waiting for an answer on whether anyone actually looks at this integration (and want to just disable it if no one cares), c) the codecov/GH feedback loop is pretty slow and sort of confusing and thus I have not spent a ton of time on it.

I found one load order oddity (not a bug, just mistake I made) when I was working on how to configure the flatware/simplecov integration ... maybe I'll open an issue or documentation update for you on flatware repo.

bin/flatware Show resolved Hide resolved
@mjankowski mjankowski force-pushed the flatware-parallel-specs branch 2 times, most recently from 0494cb3 to 34c71f9 Compare May 13, 2024 15:35
@mjankowski
Copy link
Contributor Author

Closing this in favor of #30284 (basically same thing, just new PR).

@mjankowski mjankowski closed this May 13, 2024
@mjankowski mjankowski deleted the flatware-parallel-specs branch May 13, 2024 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Runtime performance testing Automated lint and test suites
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants