Don't pull compose images #88

Open · wants to merge 1 commit into main
Conversation

@byroot (Member) commented Feb 9, 2024

That pull takes 30-40s per job, which is quite insane when these images rarely if ever change.

Of course, if they significantly changed it would cause problems, but then we should probably find another way to deal with this, like running a specific tag that is committed in this repo and that we manually update once in a while.

Ref: #73 (comment)

cc @zzak

@byroot (Member, Author) commented Feb 9, 2024

Doesn't seem to work that well, as now compose is pulling as part of up.

The doc suggests there is an option to disable that:

> --pull policy    Pull image before running ("always"|"missing"|"never")

I suspect we need missing here, but it's unclear what the default is... I also need to figure out how to hook that into the pipeline generation DSL.
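For reference, a minimal sketch of how that flag would be passed, assuming a Compose CLI that supports the up --pull option quoted above (the default service name is taken from the command shown later in the thread):

# Only pull when the tag isn't already present in the local image cache
docker compose -f .buildkite/docker-compose.yml up -d --pull missing default

# Or never touch the registry at all (fails if the image isn't cached locally)
docker compose -f .buildkite/docker-compose.yml up -d --pull never default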

@byroot (Member, Author) commented Feb 9, 2024

> I suspect we need missing here,

Arf, https://docs.docker.com/compose/compose-file/05-services/#pull_policy suggests it's the default behavior. It's really weird, I'd expect all nodes to already have these images in cache and not need to pull.
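For context, pull_policy is the per-service setting that page describes; this is a hypothetical sketch of how it would be set in a compose file, not the contents of the actual .buildkite/docker-compose.yml:

services:
  postgres:
    image: postgres:15
    pull_policy: missing  # only pull if the tag is not already in the local cache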

@byroot (Member, Author) commented Feb 9, 2024

Hum, wait:

docker-compose -f .buildkite/docker-compose.yml -p buildkite018d8da4653043ae9bbd12e7145077ce up -d --scale default=0 default
Creating network "buildkite018d8da4653043ae9bbd12e7145077ce_default" with the default driver
Pulling default (973266071021.dkr.ecr.us-east-1.amazonaws.com/builds:ruby-3-3-018d8da2-6d6b-4eb1-815f-4ce77a885048)...
ruby-3-3-018d8da2-6d6b-4eb1-815f-4ce77a885048: Pulling from builds

So each build has a different image for backing services? That seems wrong.

@matthewd (Member) commented Feb 9, 2024

> So each build has a different image

Yes, that's the image created by the ruby:3.3 (etc) job, containing the current repo contents, the installed gems, and the built JS packages.

Jobs will not-infrequently be the first thing running on an instance, meaning they do need to pull everything, but the rest of the time they'll be building atop an already-cached set of layers and just pulling the final ref-specific layer(s).

@byroot (Member, Author) commented Feb 9, 2024

Right, I figured that out a bit later, but it confused me because I wasn't expecting compose to deal with that (but again, I'm super unfamiliar with compose).

It seems like my PR works as intended, just maybe not quite as effective as I hoped?

What do you think?

@matthewd (Member) commented Feb 9, 2024

IIRC the explicit pull just separates the pull of the "build image" from the dependencies (postgres or whatever) into separate groups in the log output, so their time gets accounted for separately. I don't think it adds time, so it's just a small cost (an extra bit of config) for an even smaller benefit (an extra log line) -- I'm not attached to it.

@byroot (Member, Author) commented Feb 11, 2024

> I don't think it adds time

It depends. When you docker run foo, if the foo tag exists locally, there's no pull. If you docker pull foo, it will query the registry and potentially update the local version of foo.

But agreed that in most cases it shouldn't take that long, unless the image changed. I guess I overlooked that this pull includes not only the "services" images but also the main "ruby" image, which changes way more frequently.

To really optimize this, we'd need to make these images change less often (e.g. use a digest of the Gemfile.lock instead of the build id as the tag).

That's quite a bit more involved.
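To illustrate the idea, a rough sketch of a content-addressed tag; the registry path and repository name are placeholders, not the actual CI setup:

# Tag the image by the content of Gemfile.lock instead of the build id
TAG="gems-$(sha256sum Gemfile.lock | cut -c1-12)"
IMAGE="REGISTRY/builds:$TAG"   # REGISTRY is a placeholder

# Only build and push when that tag doesn't already exist in the registry
if ! docker manifest inspect "$IMAGE" > /dev/null 2>&1; then
  docker build -t "$IMAGE" .
  docker push "$IMAGE"
fi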

@matthewd (Member) commented:

> To really optimize this, we'd need to make these images change less often (e.g. use a digest of the Gemfile.lock instead of the build id as the tag).

We do already use a lock-only layer. If we stopped the image there, we'd have to clone the repo on the workers, and last time I tried that it seemed to come out slower (given we're semi-often building on a clean machine).
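For readers unfamiliar with the setup, this is the general shape of a lock-only layer; a hedged sketch, not the repo's actual Dockerfile:

FROM ruby:3.3

WORKDIR /app

# Lock-only layer: cached across builds as long as the lockfiles don't change
COPY Gemfile Gemfile.lock ./
RUN bundle install

# Ref-specific layers: the repo contents, rebuilt (and re-pulled) for every build
COPY . .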

@byroot (Member, Author) commented Feb 12, 2024

Yeah, you're right that it depends a lot on many factors, such as this one.

Maybe it was just my builds at some point, but they all seemed to pull for 30-40s; I'm pretty sure we could clone faster than that. But I was more in search of a quick win rather than willing to rearchitect the whole thing.

@matthewd (Member) commented:
Yeah, the docker pull definitely takes a lot longer than I'd expect... but IIRC so does the git clone 😕

(I do, as is my wont, have vague long-term plans to improve things here.)

@zzak (Member) commented Feb 28, 2024

FYI: buildkite-plugins/docker-compose-buildkite-plugin#419

> images will not be pulled by the plugin (docker should take care of that)
