Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of scratch deltas with existing commits #2005

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dbnicholson
Copy link
Member

This fixes 2 bugs with handling of scratch deltas. First, if the existing commit is partial, use the scratch delta instead of the assumption that an object pull would be better. This fixes flatpak/flatpak#3412 because it initially does a metadata only fetch to figure out some things about extra data flatpaks. Second, if the caller specifies require-static-deltas, honor it for a scratch delta.

The tests were a little tricky to put together because ostree internally will fall back to an object pull if the delta doesn't exist. I had to corrupt the superblock to make it error instead of falling back.

Closes: #2004

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dbnicholson
To complete the pull request process, please assign cgwalters
You can assign the PR to them by writing /assign @cgwalters in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alexlarsson
Copy link
Member

As I said here: #2004 (comment)
I don't think this is quite right. The point of the original check was that if the ref is available at all locally its likely we have a previous version and thus a from-scratch-pull is likely inefficient.

With this change if you had a previous version of the ref and with flatpak downloading a COMMIT_ONLY of the latest version of the ref then the is-partial check will always be true, even if we have some other version of the ref so using the from-scratch is not a good idea.

@dbnicholson
Copy link
Member Author

After sleeping on it, I agree. Even though we don't build scratch deltas for our OS, we'd never be able to use them with our updater since it also does a metadata only pull first.

A slightly more involved but still pretty simple check (as you're suggesting). If the current revision of partial but the parent exists and is not partial, then do an object pull as well. Actually, I could see cases where the parent is also partial because you didn't complete the previous pull and now the origin has published a new commit. It would be better to walk the ref until you find a normal commit.

So, then if there was a normal commit reachable from the ref, use objects. Otherwise, use the scratch delta. Does that make sense?

@alexlarsson
Copy link
Member

That seems fine to me. Although I'm gonna do the workaround in flatpak, so its not a huge hurry for me for this change.

The rest of the logic for what delta to choose is in
`get_best_static_delta_start_for`, so move the logic for skipping
scratch deltas there, too. This makes the handling in `initiate_request`
more straightforward by returning no match when it should be skipped so
that it goes directly into the object pull path.
@dbnicholson dbnicholson force-pushed the scratch-deltas-on-partial-commits branch from 3c0fadc to aeaca9b Compare February 13, 2020 18:01
@dbnicholson
Copy link
Member Author

Apologies if anyone was reviewing this. I restarted after the above discussion and force pushed over the previous series.

@alexlarsson
Copy link
Member

Actually I don't know if this is right either. The walk over the commits parent will only succeed if the direct parent of the new commit object is available locally. If you jump multiple revisions when you do the update then it will not find the intermediate parents and thus not the grandparent.

@dbnicholson
Copy link
Member Author

Ugh, you're right. I added a test where the parent is partial and the grandparent is normal, but it's entirely common that you don't have the parent at all.

I'm not sure I can think of a great way to keep this heuristic on the ostree side besides having metadata only pulls not update refs, but I'm sure that would break programs.

I think this can only be reliably handled is if the program does not update the ref with a metadata only pull. I.e., pull by revision instead of by ref. Then the heuristic could still be applied correctly - what's on the ref when you do the real pull will accurately represent the state you want.

@dbnicholson
Copy link
Member Author

Another thing that might help is a flag or option that says not to update the ref. So you can fetch the commit to your repo but not update your local ref.

@dbnicholson
Copy link
Member Author

@alexlarsson did you see my comment at flatpak/flatpak#3413 (comment)?

@jlebon
Copy link
Member

jlebon commented Feb 14, 2020

Ahh that's a tricky corner case. It's not at all obvious that doing a commit-only pull only would negatively affect static deltas. I guess a reflog would have been useful here. Hmm, couldn't we also use the ref-binding metadata too?

@cgwalters
Copy link
Member

cgwalters commented Feb 14, 2020

We could probably extend the "partial" state to be:

  • not partial
  • partial metadata
  • partial content

The "use object fetch" path came about for things like fsck --delete I think - that's partial content. In a partial metadata case we'd prefer scratch pulls particularly if the client requested them?

@dbnicholson
Copy link
Member Author

Ahh that's a tricky corner case. It's not at all obvious that doing a commit-only pull only would negatively affect static deltas. I guess a reflog would have been useful here. Hmm, couldn't we also use the ref-binding metadata too?

A git style reflog did come to mind, but I'd forgotten about the ref-binding metadata. That's a great idea.

@dbnicholson
Copy link
Member Author

We could probably extend the "partial" state to be:

  • not partial
  • partial metadata
  • partial content

The "use object fetch" path came about for things like fsck --delete I think - that's partial content. In a partial metadata case we'd prefer scratch pulls particularly if the client requested them?

I think that would be useful, but I think you'd still be in the same situation for partial metadata if you had a previously not partial commit on the ref but now you can't find it because of a non-linear history. I think the only way to have the partial metadata state not throw off the heuristic is:

  1. Add an update-refs boolean pull option that defaults to TRUE which says to do the fetch but not update the refs. I could imagine several ways this could be useful (like pre-seeding a repo), but in this case it would be particularly useful since you could fetch the commit metadata and inspect it without affecting the state of the local refs. The downside is that it requires the caller to opt-in to get the desired scratch delta behavior and it's not at all obvious externally why this would make a difference. Oh, actually I don't think that would work because ostree_repo_pull_with_options doesn't provide the fetched revisions back to the caller. I guess you could stuff them into the progress variant.

  2. Dig through all the commits looking at the ref bindings as @jlebon says. I like that, although to be sure you're getting the ref corresponding to the remote you're pulling from right now you'd also have to look at the collection binding and the collection ID of the remote. Not terrible, but collection IDs aren't universal.

In e7305bb a heuristic was added that
if a scratch delta exists but you already have a commit on that ref that
an object pull should be preferred. The idea is that if you already have
something on the ref, then it's likely that an object pull require less
data to be fetched than pulling the entire new commit.

However, this doesn't take into account the commit state. If the
existing commit is partial, then it might be better to fetch the scratch
delta. On the other hand, if the existing commit is partial only because
a metadata only pull has been done, then the parent commit might be
normal and then you'd want to prefer an object pull again. Instead of
considering just the HEAD local commit, walk the parents and check if
any of them are normal commits. Only then use an object pull.
If a scratch delta exists and the caller has required static deltas, use
it instead of the heuristic about existing commits.
If a normal commit cannot be found in the history of the current ref,
try to find one by looking at the collection and ref bindings in all the
commits in the repo. This will only work if the remote is using
collection IDs since otherwise just looking at the ref bindings in the
local commits would be ambiguous about what remote the commit came from.
If a normal commit with matching bindings is found, prefer an object
pull.

With this change it's possible to have a partial local HEAD commit (from
a previous metadata only pull, for example) and broken history to a
normal ancestor and correctly use the heuristic that an object pull
should be preferred. This is a common case since user repos are likely
to have missed intervening commits on the same remote ref.
@dbnicholson dbnicholson force-pushed the scratch-deltas-on-partial-commits branch from b6a5c4e to 04a55cb Compare February 20, 2020 21:42
@dbnicholson
Copy link
Member Author

Here's another rework that includes looking for matching commit bindings as a fallback if the history search is unsuccessful. The added tests indicate it's doing the right thing, but I'm not sure how comfortable I am potentially scanning for and loading all the commits in the repo just for the benefit of this heuristic.

@openshift-ci-robot
Copy link
Collaborator

@dbnicholson: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cgwalters cgwalters removed the size/L label May 2, 2023
@openshift-ci
Copy link

openshift-ci bot commented Jun 29, 2023

@dbnicholson: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/sanity 04a55cb link true /test sanity
ci/prow/images 04a55cb link true /test images
ci/prow/fcos-e2e 04a55cb link true /test fcos-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants