git: allow sparse checkouts #4646

Draft
wants to merge 1 commit into
base: master


tonistiigi
Member

No description provided.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi
Member Author

@AkihiroSuda @jedevc Any thoughts on this? It does speed things up for certain repositories (e.g. by at least 50% for the aports repo). I'm not 100% confident that there are no side effects, though, so it may need more testing.

I'm also wondering if we could add .SparseCheckout(path1, path2) to llb.Git so this can be used for better filtering, and also for cases where one just wants to read a Dockerfile or a bake file from the repository. If the server does not support sparse checkouts, BuildKit would run a full clone and filter afterwards.
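
To make the proposed API shape concrete, here is a minimal sketch in the functional-option style. Everything in it is a hypothetical stand-in: `gitInfo`, `GitOption`, `Git`, and `SparseCheckout` are local illustrations and not BuildKit's actual types or signatures, and the repository URL is a placeholder.

```go
package main

import "fmt"

// gitInfo is a hypothetical stand-in for the state a Git source collects.
// It is NOT BuildKit's real type.
type gitInfo struct {
	remote      string
	ref         string
	sparsePaths []string
}

// GitOption is an illustrative functional option (assumed shape).
type GitOption func(*gitInfo)

// SparseCheckout sketches the proposed option: it records the paths
// (directories or files) that should be kept after checkout.
func SparseCheckout(paths ...string) GitOption {
	return func(gi *gitInfo) {
		gi.sparsePaths = append(gi.sparsePaths, paths...)
	}
}

// Git is a simplified stand-in for llb.Git that only applies options.
func Git(remote, ref string, opts ...GitOption) gitInfo {
	gi := gitInfo{remote: remote, ref: ref}
	for _, o := range opts {
		o(&gi)
	}
	return gi
}

func main() {
	// Only the Dockerfile and the bake file are requested.
	gi := Git("https://example.com/repo.git", "main",
		SparseCheckout("Dockerfile", "docker-bake.hcl"))
	fmt.Println(gi.sparsePaths)
}
```

How the recorded paths would be turned into git commands, and how the full-clone fallback would be detected, is left open here.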

@jedevc
Member

jedevc commented Feb 15, 2024

Aha, I've actually been really wanting this (I know @sipsma has as well).

I think I prefer this being an explicit option instead of always doing it by default. I think the repo will end up in a different state after that, so the cache key would at least need to change. An explicit option also means we don't need to worry about exact 1-for-1 compat with the existing code.

If it was an explicit option, how would we handle the trailing directory arg as part of the URL? How would that interact with this?

@tonistiigi
Member Author

> so the cache key would at least need to change

The current PR should not have any user-visible changes so the cache key does not change. Only performance is different.

> If it was an explicit option,

You mean the proposed SparseCheckout or something different? For subdir support, if both sparse and non-sparse checkouts were supported, different configurations couldn't share the same files the way different branches can at the moment, because for sparse checkouts the promisor mode needs to be enabled for the shared repository.

The difference between SparseCheckout and subdir is that subdir is a single directory that becomes the new root, while SparseCheckout could be multiple paths that act as a filter and can also be file paths. You could also use both together.
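
The distinction can be illustrated on a plain list of file paths. This is only a sketch: `filterSparse` and `reroot` are hypothetical helpers, the matching is a simple prefix check rather than git's real sparse-checkout pattern rules, and resolving sparse paths relative to the new root when both are combined is an assumption, not something decided in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// filterSparse keeps every file that equals one of the sparse paths or
// lies underneath it. Paths stay relative to the original root.
func filterSparse(files, sparse []string) []string {
	var out []string
	for _, f := range files {
		for _, p := range sparse {
			p = strings.TrimSuffix(p, "/")
			if f == p || strings.HasPrefix(f, p+"/") {
				out = append(out, f)
				break
			}
		}
	}
	return out
}

// reroot keeps only files under subdir and strips the prefix, so that
// subdir becomes the new root.
func reroot(files []string, subdir string) []string {
	prefix := strings.TrimSuffix(subdir, "/") + "/"
	var out []string
	for _, f := range files {
		if strings.HasPrefix(f, prefix) {
			out = append(out, strings.TrimPrefix(f, prefix))
		}
	}
	return out
}

func main() {
	files := []string{"Dockerfile", "cmd/app/main.go", "cmd/app/go.mod", "docs/README.md"}

	// Sparse filter: several paths, files or directories, root unchanged.
	fmt.Println(filterSparse(files, []string{"Dockerfile", "docs"})) // [Dockerfile docs/README.md]

	// Subdir: one directory becomes the new root.
	fmt.Println(reroot(files, "cmd/app")) // [main.go go.mod]

	// Both together (assumed order: re-root first, then filter).
	fmt.Println(filterSparse(reroot(files, "cmd"), []string{"app/go.mod"})) // [app/go.mod]
}
```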


I have another future use case for the promisor in mind. If we want to be able to resolve git tag metadata automatically in the future (e.g. via a new sourcemetaresolver), like https://github.com/docker/metadata-action?tab=readme-ov-file#global-expressions does, then we can't use --depth 1, as that would not know the tags for parent commits. What we could do is something like --depth 100 (configurable) with a promisor filter tree:0 like in this PR, so that we get the commits for the parents but don't waste any time pulling down the files/trees for the parent commits. cc @crazy-max
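
As a rough sketch of that idea, the fetch arguments could be assembled like this. `fetchArgs` is a hypothetical helper and the remote and ref values are placeholders; the sketch only builds the argument list for `git fetch` with the `--depth` and `--filter` options and does not reflect how this PR actually invokes git.

```go
package main

import (
	"fmt"
	"strconv"
)

// fetchArgs builds arguments for a shallow, filtered `git fetch`.
// A depth of 0 or an empty filter omits the corresponding flag.
func fetchArgs(remote, ref string, depth int, filter string) []string {
	args := []string{"fetch"}
	if depth > 0 {
		args = append(args, "--depth="+strconv.Itoa(depth))
	}
	if filter != "" {
		args = append(args, "--filter="+filter)
	}
	return append(args, remote, ref)
}

func main() {
	// Up to 100 parent commits, but no trees or blobs for them.
	fmt.Println(fetchArgs("origin", "refs/heads/master", 100, "tree:0"))
	// [fetch --depth=100 --filter=tree:0 origin refs/heads/master]
}
```

Keeping the depth as a parameter matches the "configurable" note above; whether 100 commits is enough to reach the nearest tag depends on the repository.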
