Proposal for improved UR sync process #13368

aarongreig · 2024-04-11T17:06:52Z

Is your feature request related to a problem? Please describe

The process we have in place for updating the unified runtime tag is very slow, and it has a high overhead for maintainers and contributors. Right now if you want to change the UR adapters the process looks like this:

open a PR in the UR repo with your changes
open a PR in this repo updating the UR tag to point at your UR fork with the changes
UR maintainer merges PR in the UR repo
the PR in this repo is updated so that the UR tag points to a commit in UR main rather than a fork
assuming everything CI/review wise is fine ping the gatekeepers to merge the PR here

There's a lot of room for friction and delays in this process, and the result is that if things run very smoothly we get around 1 PR merged per day, commonly ranging down to one or two per week. This translates into an average wait time from opening a UR PR to getting it merged of 2-3 weeks unless your change is urgent (and thus gets to skip the queue).

Describe the solution you would like

What I'm proposing to improve this is to optimize for the common case that a PR here only updates the UR tag and doesn't make any changes to the sycl runtime. Rather than a 1:1 merging of UR PR to LLVM PR, I'd like to move to one LLVM PR every two weeks that pulls in all the UR changes from that period. This PR would update the UR tag to the latest stable commit in UR, and it would bring with it the sycl runtime changes necessary to support the new UR commit.

To make sure all changes are still tested to the same extent, UR PRs will initially still need a PR in this repo just to run the pre-commit testing with their change (later maybe we can move to running the e2e tests in the UR repo instead). We would also like to have a way to regularly run the post-commit testing with UR tip, probably as a new nightly job in this repo. That way we get advance warning of any regressions there and we can fix/revert before opening the tag update PR at the end of a cycle.

So the new process for getting an adapter-only change merged to UR would be:

open a PR in the UR repo with your changes
open a PR in this repo updating the UR tag to point at your fork with the changes, for testing purposes only
once the tests are green and the reviews are in a UR maintainer merges the UR PR

Then, a maximum of two weeks (but assuming a fully random distribution probably an average of around one week) later your change gets pulled into sycl with the regular tag bump without you having done anything further, and with the UR maintainer having shepherded through one PR in this repo for many UR PRs, instead of 1:1.

Alright, so what about adapter changes that do need a sycl runtime update to accompany them so they don't break stuff upstream? To answer that we need to consider implementation details.

I propose that a staging branch is maintained in this repo. At the start of each two week cycle the staging branch is synced with the sycl branch and a commit is added to set its UR tag to UR main (not the latest commit hash, the actual branch name). This staging branch is what the nightly CI job will target, so any UR adapter changes that get merged to main on a given day will go through post-commit testing that night via the workflow run on the staging branch.

If you have an adapter change that needs an accompanying sycl runtime change then your PR will need to go through the same process that we currently have with the two synchronized PRs, except instead of targetting the sycl branch with your PR here you target the staging branch. PRs targetting the staging branch should go through the same pre- and post- commit testing as a regular PR targetting sycl does. On the last day of the cycle¹ a final commit is pushed to the staging branch which updates the UR tag, replacing main with our latest stable commit. At this point we open a pull request to merge the staging branch into sycl which proceeds like a regular PR and results in the last two week's worth of UR-related changes getting pulled in.

I'm not married to this exact approach to the implementation, but I do think that replacing the granular updates we have now with this kind of regular bulk pulldown would be beneficial to UR contributors and maintainers, so I'd like to start the conversation about how we might go about transitioning to this kind of system.

¹ I would propose that this is actually a few days before the end of the cycle, e.g. if our start/end day is a monday then we stop merging to the staging branch and push the final commit on the thursday before to make sure what we have really is stable before the merge.

Describe alternatives you have considered

No response

Additional context

No response

The text was updated successfully, but these errors were encountered:

aarongreig · 2024-04-11T17:07:02Z

@steffenlarsen @aelovikov-intel @stdale-intel

aelovikov-intel · 2024-04-11T17:27:34Z

the PR in this repo is updated so that the UR tag points to a commit in UR main rather than a fork
assuming everything CI/review wise is fine ping the gatekeepers to merge the PR here

that is optional even now, from this repo point of view. In fact, any PR/testing in this repo is UR folks' desire and not our requirement.

aarongreig · 2024-04-12T08:57:54Z

It's useful for us, especially with how the process works now, because without the assurance that a given UR change will pass pre-commit here before merging we could end up with the situation where we merge something and then it turns out it needs further changes on the UR side before it can pass CI here. That would cause further delays and the need to rush through another fix PR, which is well worth the extra step to avoid.

aelovikov-intel · 2024-04-12T15:01:02Z

It's useful for us

As long as it's not in origin/sycl, you can setup processes whatever you want. It's the time when a PR is created against sycl branch that the change should be small and atomic (i.e., no merges of multiple independent changes at once).

aarongreig · 2024-04-12T15:25:55Z

Ok, I can see this working in such a way that our bulk UR pulldowns only include one sycl runtime change at a time. I'll look at implementing the new nightly job to support a trial run like that soon.

aarongreig added the enhancement New feature or request label Apr 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal for improved UR sync process #13368

Proposal for improved UR sync process #13368

aarongreig commented Apr 11, 2024

aarongreig commented Apr 11, 2024

aelovikov-intel commented Apr 11, 2024

aarongreig commented Apr 12, 2024

aelovikov-intel commented Apr 12, 2024 •

edited

aarongreig commented Apr 12, 2024

Proposal for improved UR sync process #13368

Proposal for improved UR sync process #13368

Comments

aarongreig commented Apr 11, 2024

Is your feature request related to a problem? Please describe

Describe the solution you would like

Describe alternatives you have considered

Additional context

aarongreig commented Apr 11, 2024

aelovikov-intel commented Apr 11, 2024

aarongreig commented Apr 12, 2024

aelovikov-intel commented Apr 12, 2024 • edited

aarongreig commented Apr 12, 2024

aelovikov-intel commented Apr 12, 2024 •

edited