Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial backports automation #21

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Initial backports automation #21

wants to merge 3 commits into from

Conversation

squidadm
Copy link
Collaborator

No description provided.

@yadij
Copy link
Contributor

yadij commented Mar 14, 2023

This is the script which generated squid-cache/squid PR #1308.

Mark PR with "backport-to-vN" labels to have this script attempt porting automatically from v(N+1) to vN.
It will skip all PRs which have backport conflicts.

Backport requires that the PR be merged to v(N+1) branch first (with "master" being ultimate N+1). That means multiple labels can be assigned at once, and during review process before final merge into Squid is approved.

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that reviews of this imported "as is" shell script are not welcomed, so I am not offering one. The comments below are about the backporting procedure rather than its specific implementation.

Please note that bundling multiple backports into one PR means that either that PR will have to be merged manually (to preserve individual commits) or we will need to enhance Anubis to support merge commits (in addition to the currently supported squashed ones). In the past, such manual merges precluded proper CI testing. Perhaps GitHub offers enough features now to properly test PRs without Anubis help; I have not investigated all the details, but I doubt GitHub has reached that milestone, especially when it comes to third party CI tools like our Jenkins.

I am also concerned that labeling a PR for backport now effectively implies an automatic attempt to backport, which is sometimes known to be a bad idea even though no merge conflicts are expected. Perhaps there should be an additional GitHub label to mark a PR for manual backporting.

@yadij
Copy link
Contributor

yadij commented Mar 15, 2023

I assume that reviews of this imported "as is" shell script are not welcomed, so I am not offering one. The comments below are about the backporting procedure rather than its specific implementation.

In this case the script is a full re-write, so implementation bugs are fair game. The goal/scope here is just to get this robotic action working again to reduce maintainer time.

FTR;

  • this is a helper script leading to a manual process stage (PR review) before anything gets merged.
  • the v4-next-backports PR is merged manually when approved by maintainer.
  • the v4-next-branch can (and is) manually adjusted as needed to;
    • prune bad backport attempts (seen in v6 Next Backports squid#1308 already)
    • prune backports that fail in testing stages (shift to manual backport)
  • there is a regular code-maintenance email to NOC reporting backport attempts that failed and/or pending manual action.

Please note that bundling multiple backports into one PR means that either that PR will have to be merged manually (to preserve individual commits)

Noted. Maintainer has to manually review and approve anyway. So this is not a major issue, the grouping is done primarily to reduce maintainer work/time (and incidentally reduce CI load, but that is minor and at the cost you mention).

I am also concerned that labeling a PR for backport now effectively implies an automatic attempt to backport, which is sometimes known to be a bad idea even though no merge conflicts are expected. Perhaps there should be an additional GitHub label to mark a PR for manual backporting.

FYI, this is just automatic staging backports for CI. It is not intended to go further without manual attention.

If someone wants a backport and knows that there are conflicts, or any other changes needed. They should do it manually and submit a PR against the vN branch as per normal Squid Project process. Not use the label for automatic merge.

@rousskov
Copy link
Contributor

I am also concerned that labeling a PR for backport now effectively implies an automatic attempt to backport, which is sometimes known to be a bad idea even though no merge conflicts are expected. Perhaps there should be an additional GitHub label to mark a PR for manual backporting.

FYI, this is just automatic staging backports for CI. It is not intended to go further without manual attention.

I do not recommend such (ab)use of the primary Project repository/space; at the very least, an official PR that is "not intended to go further without manual attention" should be opened as a draft PR, but I am not going to object either way. I will just pray that nobody (including me) accidentally clears such a PR for merging (and the PR gets merged) before that secretly missing "manual attention" is actually given.

If someone wants a backport and knows that there are conflicts, or any other changes needed. They should do it manually and submit a PR against the vN branch as per normal Squid Project process. Not use the label for automatic merge.

Yes, that was one of my points: We used to have a "maintainer has approved these changes for vN backporting" label. Now we will have a "maintainer has approved these changes for automatic vN backporting" label. The label name has not changed, but its meaning has. We lost a useful (IMHO) label. Hence the suggestion.

gh pr list --repo squid-cache/squid --state closed --label backport-to-v$version | while read prnum text; do
# find a commit in $srcbranch with " (#$prnum)" subject suffix
git log --oneline github/$srcbranch --grep=" (#$prnum)\$" | while read sha rest; do
git cherry-pick $sha && doneCherryPick $sha $prnum $version || cherry-pick --abort 2>/dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please strip (#prnum) sequences from cherry-picked commit titles and add original commit SHA references to backported commit message footers instead. See squid-cache/squid#1308 (comment) for an illustration of the problem this change request is meant to solve.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would break use of the gh CLI tool. We cannot do this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would break use of the gh CLI tool.

What makes you think that? I believe my change request can be addressed while still using gh. In fact, I would not be surprised if it can be addressed without changing how gh is used by this script. This change request is about the areas of the script that do not use gh AFAICT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would break use of the gh CLI tool.

What makes you think that?

The gh pr list used to locate which commits are labeled for backport does not have any SHA information. The only unique value available is PR number - which needs to be cross-referenced to the title field of the branch log(s) to find the SHA appropriate to the src-branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would break use of the gh CLI tool.

What makes you think that?

The gh pr list used to locate which commits are labeled for backport does not have any SHA information.

As a starting point, let's assume that the gh command stays the same while addressing this change request. This change request is not about that gh command. It is about git commands that create a new/backport commit. The problem is not in labeled PR search -- that search can stay the same as far as this change request is concerned -- the problem is in the commit message created by this script using git commands.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would break use of the gh CLI tool.

What makes you think that?

The gh pr list used to locate which commits are labeled for backport does not have any SHA information.

As a starting point, let's assume that the gh command stays the same while addressing this change request.

Assume that I do fully understand what you are asking for.

The problem with your request is that enacting it on backports from vN to vN-1 impacts the results of the gh command used to backport from vN-1 to vN-2. Making it impossible to achieve the fundamental purpose of this script: backport a change from v7->v6->v5.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with your request is enacting it on backports from vN to vN-1 impacts the results of the gh command used to backport from vN-1 to vN-2. Making it impossible to achieve the fundamental purpose of this script: backport a change from v7->v6->v5.

I do not understand why addressing this change request would make it impossible to backport a change from v7 to v6 to v5. AFAICT, as long as we always label PRs that need an automated backport, the script can (continue to) use the gh pr list command to find PRs labeled for backporting. Once a labeled PR is found, the script can (continue to) use that PR number to find the corresponding commit SHA that needs backporting. The script can then cherry-pick that commit and post the corresponding PR to merge the result. All that without copying the original PR number to commits corresponding to other PRs (i.e. satisfying this change request)! Where do you see a problem that is impossible to solve?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with your request is enacting it on backports from vN to vN-1 impacts the results of the gh command used to backport from vN-1 to vN-2. Making it impossible to achieve the fundamental purpose of this script: backport a change from v7->v6->v5.

I do not understand why addressing this change request would make it impossible to backport a change from v7 to v6 to v5. AFAICT, as long as we always label PRs that need an automated backport, the script can (continue to) use the gh pr list command to find PRs labeled for backporting.

... IF and only IF we do not erase the PR ID number from commit titles (i.e. what you are requesting).

Once a labeled PR is found, the script can (continue to) use that PR number to find the corresponding commit SHA that needs backporting.

Right now: Yes, SHA found.
After enacting the requested change: No, SHA not found.

The script can then cherry-pick that commit and post the corresponding PR to merge the result.

The git cherry-pick preserves the title including origin PR ID.

All that without copying the original PR number to commits corresponding to other PRs (i.e. satisfying this change request)! Where do you see a problem that is impossible to solve?

I have already explained that thrice. I do not see any benefit in continuing if you still do not understand.

You can replicate the problem by simulating a backport of the changes in PR 715 from v5 branch to v4 branch.

  • The PR has already been merged from v6 to v5 branch.
  • It was merged with a title change equivalent to what you are requesting we always do.

Use git log to find which SHA of v5 branch to cherry-pick.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex: Once a labeled PR is found, the script can (continue to) use that PR number to find the corresponding commit SHA that needs backporting.

Amos: After enacting the requested change: No, SHA not found.

Why not? If one labels a GitHub PR for backporting, and that GitHub PR is officially merged, then I see no reason why the script cannot find that PR commit in the official repository. The commit is obviously there, we know its title, and that title is unique because both manual GitHub PR commits and automated Anubis PR commits contain a unique PR ID in their title.

If you are thinking of backporting commits that do not have dedicated PRs, then they do not seem to be applicable to this discussion because those commits cannot be labeled for automatic backporting. If really needed, we can support them as well, but that should be a subject of a very different discussion!

The git cherry-pick preserves the title including origin PR ID.

The result of that command depends on how that command is used. The script can use that command and make sure that the unique PR ID is not copied to the new/resulting commit title. The script fully controls what goes into the backporting PR! We are not being forced to push anything specific into that PR. We decide what (metadata) goes in and what does not.

You can replicate the problem by simulating a backport of the changes in PR 715 from v5 branch to v4 branch.

PR 715 is not a valid test case because it is not a PR against a v5 branch: That PR can be labeled to be backported to v4, but it would not be a "v5 to v4" backport. For a change to be automatically backported (through a properly labeled PR -- the subject of this script and this change request!) from vX to vY, that change needs a (labeled) vX PR. In your example, X is 5, but PR 715 is not a v5 PR.

However, the corresponding PR 717 (v5 commit f4ade36 with 5.0.4 (#717) title) could have been automatically backported from v5 into v4 (after its PR was labeled for such backporting). I see no problem with that.

The PR has already been merged from v6 to v5 branch. It was merged with a title change equivalent to what you are requesting we always do.

Our definitions of "equivalent" differ: PR 715 title is Release Prep for 5.0.4 and 4.13. If that change were to be automatically backported into v5 after addressing my change request, the corresponding v5 commit title would have been Release Prep for 5.0.4 and 4.13 (#X) (where X is the corresponding new/backporting PR number). There are no v5 commits with titles that start with Release Prep for 5.

You are probably hinting at v5 PR 717 (commit f4ade36 with a very different 5.0.4 (#717) title). I agree that PR 717 could have been labeled for automated v4 backporting (as discussed above).


I have already explained that thrice.

AFAICT, the previous responses contain repeated assertions (that X is impossible) rather than explanations (of why X is impossible). That lack of cooperation hinders progress. Since I believe that the "impossible" or "cannot do that" assertion itself is false, I see only two ways to make progress: Find the underlying mistake that led to the wrong conclusion or implement the requested changes myself. The former requires detailing/explaining the assertion, triggering my questions. The latter takes even more time and may require adjustments that may create even more conflicts, so I am avoiding that option.

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Mar 17, 2023
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found more places that need adjustments and also updated an old unaddressed (and critical) change request.

gh pr list --repo squid-cache/squid --state closed --label backport-to-v$version | while read prnum text; do
# find a commit in $srcbranch with " (#$prnum)" subject suffix
git log --oneline github/$srcbranch --grep=" (#$prnum)\$" | while read sha rest; do
git cherry-pick $sha && doneCherryPick $sha $prnum $version || cherry-pick --abort 2>/dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change would break use of the gh CLI tool.

What makes you think that? I believe my change request can be addressed while still using gh. In fact, I would not be surprised if it can be addressed without changing how gh is used by this script. This change request is about the areas of the script that do not use gh AFAICT.

maintenance/code-backports.sh Outdated Show resolved Hide resolved
version="$3"

git push $beQuiet >/dev/null
# TODO automatically remove label from PRs on successful commit+push to vN-next-backports
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I would consider leaving some PR label1 after merging the backported commit, as a clear sign of what has happened to the original PR -- an easily accessible documentation of where it has been backported. Yes, if the backported commit refers to the original PR by that PR number, then GitHub will link the two, but those links are often buried in GitHub page noise, and we should not really use GitHub PR numbers for such references anyway (we should use commit SHAs instead, but SHAs that do not generate PR links).

Footnotes

  1. Labeling the original PR means either leaving the original PR label in place or, better, replacing that with a new/dedicated label. For example, replacing backport-to-v5 with backported-to-v5 or something like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sans an agreed label to add. Using an automatic comment instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current "queued for backport to vN" comment does not address this change request because it reflects the desire to backport, not the fact of a successful backport. FWIW, a comment is also much more difficult to find, especially in a noisy PR or when searching across multiple PRs. I do support adding comments, but comments solve a different set of problems than labels do.

Obviously, we can find a suitable name for the new label. I am not aware of any relevant failures, and I do not even recall any serious attempts to do that beyond my backported-to-vN suggestion in this change request, but we can revisit them if I have forgotten about them.

maintenance/code-backports.sh Outdated Show resolved Hide resolved
maintenance/code-backports.sh Outdated Show resolved Hide resolved
Comment on lines +12 to +33
# Dev branch does not receive backports
test -f .BASE && exit 0

# clean the current workspace
gitCleanWorkspace ()
{
git clean $beQuiet -xdf --exclude="\.BASE"
git checkout $beQuiet -- .
}

if ! test -d .git ; then
echo "ERROR: missing git repository"
exit 1;
fi
if ! test -d ../squid-$vnext ; then
echo "ERROR: missing git repository to backport from"
exit 1;
fi
if test -e ../squid-$vnext/.BASE ; then
srcbranch=`cat ../squid-$vnext/.BASE`
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this convoluted .BASE hack and related manipulation, I suggest making srcbranch a script parameter with master (or the main official repository branch name if you want to compute that to avoid hard-coding master) as its default value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you propose that script parameter comes from? The script running this script only has the "/path/to/repo" and "vN" branch name available. The requirement to use "master" for development prohibits using git branch to determine if there is a newer vN branch existing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exact answer depends on how this script is used in its environment. If I were to design this, and I wanted the script to run unattended by default, then

  • If I were looking for a quick 80% solution, then I would probably hard-code (in the script where this new script is started from) the names of the first and the last vN branch. There are just a few of them, and the list does not change often... For example, today, those hard-coded names would probably be v5 and v6 since it is very unlikely that we will automatically backport something from v4 and earlier branches. The script will then iterate all versions between those two hard-coded points plus master.
  • If I were looking for a comprehensive solution, then I would probably compute the list of all vN branches (plus the primary branch) from the list of all official branches in GitHub squid-cache/squid repository.

However, I am not implying that something like the above is the best solution, or even that this PR should become hostage to changing the environment to make something like that possible. I am just answering your question, with the rather limited information that I have about the existing environment and, more importantly, about environmental changes you are willing to adopt. Ideally, the whole environment should be designed with certain maintenance automation goals/principles in mind, and I doubt we share those today.

maintenance/code-backports.sh Outdated Show resolved Hide resolved
maintenance/code-backports.sh Outdated Show resolved Hide resolved
@yadij yadij requested a review from rousskov July 22, 2023 02:13
@yadij yadij added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-author author action is expected (and usually required) labels Jul 22, 2023
@yadij
Copy link
Contributor

yadij commented Jul 22, 2023

@rousskov, can we merge this and fix the remaining issues as followup PRs as time permits?

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we merge this and fix the remaining issues as followup PRs as time permits?

The code-backports.sh script in question has insignificant affect on the master branch and major Squid releases AFAICT. The primary change request has not been addressed for many months. The ongoing backporting commits continue to follow the same approach that the change request is asking to adjust, despite my repeated complaints. I have spent significant time on reviewing this PR in general and on the still-unaddressed change requests in particular. I see no signs of serious attempts to address those concerns.

Given the above, I cannot think of a better option than to stop wasting resources on trying to achieve consensus regarding the maintenance/code-backports.sh script. Instead, maintainers should commit pretty much whatever script changes they happen to be using in production, without my review. Future vN maintainers may reuse this script and/or change backporting automation to whatever they want -- the script does not represent Project consensus that we have to protect. It just reflects whatever vN maintainers are using to maintain vN branch.

Does this approach work for you?

@rousskov rousskov added S-waiting-for-author author action is expected (and usually required) and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Jul 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-for-author author action is expected (and usually required)
Projects
None yet
3 participants