Proposal: add support for multiple (named) build-contexts #37129

Closed
thaJeztah opened this issue May 23, 2018 · 25 comments
Labels
area/builder kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@thaJeztah
Member

thaJeztah commented May 23, 2018

Add support for multiple (named) build-contexts

Related issues:

Problem statement

Take the following directory structure for a project:

project
 ├── .dockerignore
 ├── Dockerfile
 ├── ginormous
 │   ├── big-file-1
 │   ├── big-file-2
 │   └── big-file-xx
 ├── Makefile
 ├── service1
 │   ├── Dockerfile
 │   └── src
 │       ├── Makefile
 │       ├── source-file-1
 │       ├── source-file-2
 │       └── source-file-xx
 ├── service2
 │   ├── Dockerfile
 │   └── src
 │       ├── Makefile
 │       ├── source-file-1
 │       ├── source-file-2
 │       └── source-file-xx
 ├── common
 │   └── src
 │       ├── 0-various
 │       ├── 1-files
 │       ├── 2-used-by
 │       ├── 3-service-1-and-2
 └── common-2
     └── src
         ├── 0-various
         ├── 1-files
         ├── 2-used-by
         ├── 3-service-1-and-2

In the above:

  • the Dockerfile at the root of the project uses ginormous and the shared directories
  • service1 has a Dockerfile, and the source files used to build the service are in service1/src
  • service2 has a Dockerfile, and the source files used to build the service are in service2/src
  • service1 and service2 share some code/resources, located in common and common-2
  • neither service1 nor service2 uses ginormous (a big directory)

Challenges with this example

Building service1 and service2 is a challenge:

  • When building a Dockerfile, all files used have to be within the build-context. This means that in the project structure above, the only context that can be used is the root project directory. Doing so results in the entire project, including ginormous, being sent to the daemon (even though it's not used at all). Relative paths outside of the build-context cannot be used (also see #2745, "Add with relative path to parent directory fails with 'Forbidden path'").
  • Similarly, when constructing/sending the build-context, docker does not resolve symlinks but copies them as-is, so putting symlinks to common and common-2 inside service1 and service2 does not solve this problem.
  • Only a single .dockerignore is supported, so it's not possible to "conditionally" exclude files (e.g. when building service1, exclude the ginormous and service2 directories, and vice-versa)
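The "forbidden path" restriction in the first bullet can be illustrated with a small Python sketch. This is not how the docker client or daemon implements the check; the function name is_within_context is hypothetical, and the sketch only shows why ../common/src cannot be referenced when ./service1 is the build-context.

```python
import os


def is_within_context(context_dir, source):
    """Return True if `source` (relative to the context) stays inside it.

    Hypothetical helper for illustration only; symlinks are not resolved.
    """
    root = os.path.abspath(context_dir)
    candidate = os.path.abspath(os.path.join(root, source))
    # commonpath() returns the longest shared path prefix; the candidate is
    # inside the context only if that prefix is the context root itself.
    return os.path.commonpath([root, candidate]) == root


if __name__ == "__main__":
    print(is_within_context("/project/service1", "src/Makefile"))   # inside
    print(is_within_context("/project/service1", "../common/src"))  # outside
```

With /project/service1 as the context, anything under ../common falls outside the root, which is why the only usable single context today is the project root.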

Proposal: allow multiple (named) build-contexts

I propose to add support for multiple build contexts, implemented as a --context flag on COPY and ADD, and a --context <name>=<path> option on the docker image build subcommand.

For example, to build service1, the Dockerfile could look like this:

FROM baseimage

# no --context option set: use the default build-context
COPY . /build/service1/src/

# use the build-context named "common"
COPY --context=common   . /build/common/src/

# use the build-context named "common-2"
ADD --context=common-2 . /build/common-2/src/

RUN cd /build && make && make install

When building the Dockerfile, docker expects two named build-contexts to be provided, in addition to the default (positional) build-context:

From within the project directory:

docker build \
  -f ./service1/Dockerfile \
  --context common=./common/src \
  --context common-2=./common-2/src \
  ./service1

In the above:

  • -f ./service1/Dockerfile is the Dockerfile used to build the image
  • the ./common/src directory is used as build-context "common"
  • the ./common-2/src directory is used as build-context "common-2"
  • the ./service1 directory is used as default build-context

Only the common/src, common-2/src and service1 directories are uploaded to the daemon. All other directories are not part of the build-context, so won't be uploaded.

Similarly, when building from within the project/service1 directory:

docker build \
  --context common=../common/src \
  --context common-2=../common-2/src \
  .

  • No -f is provided, so the Dockerfile in the current directory is used to build the image.
  • the ../common/src directory is used as build-context "common"
  • the ../common-2/src directory is used as build-context "common-2"
  • the current (.) directory is used as default build-context

Both relative and absolute paths can be used to specify the location of a build-context, so all of these are valid:

--context foo=~/go/src/github.com/foobar/foo
--context bar=/dev/sdb/share/bar/
--context bar=./some/dir
--context baz=../../../foobar
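As a rough illustration of how a client could turn these <name>=<path> values into absolute directories, here is a Python sketch. The function name parse_context_flag is hypothetical, and expanding "~" in the client (rather than relying on the shell) is an assumption, not something the proposal specifies.

```python
import os


def parse_context_flag(value, cwd):
    """Split a `name=path` value and resolve the path against `cwd`.

    Hypothetical helper; returns a (name, absolute_path) tuple.
    """
    name, sep, path = value.partition("=")
    if not sep or not name or not path:
        raise ValueError("expected <name>=<path>, got %r" % value)
    path = os.path.expanduser(path)  # assumption: the client expands "~"
    if not os.path.isabs(path):
        path = os.path.join(cwd, path)
    # normpath collapses "./", "../" and trailing slashes
    return name, os.path.normpath(path)


if __name__ == "__main__":
    for flag in ("bar=./some/dir", "baz=../../../foobar"):
        print(parse_context_flag(flag, "/a/b/c/d"))
```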

Validation

Validation: missing build-contexts

Before building (and sending the build-context), docker validates if all build-contexts are provided. If a context is missing, an error is produced, and the build is aborted. Trying to build the Dockerfile from the example above without specifying any build-context:

docker build .
Error: missing build-context "common"
Error: missing build-context "common-2"

In a multi-stage build, only contexts that are required for the stages that are built should be taken into account. For example:

FROM baseimage AS stage-one
COPY --context=one /subdir/foo /target/dir

FROM busybox AS stage-two
ADD --context=two /foo.tar.gz /target

FROM scratch AS final
COPY --from=stage-one /foo /bar
COPY --from=stage-two /bar /baz
COPY --context=config /config.ini /config.ini

Given the Dockerfile above:

Building just stage-one:

docker build --target=stage-one .
Error: missing build-context "one"

Building up until stage-two:

docker build --target=stage-two .
Error: missing build-context "one"
Error: missing build-context "two"

Building the whole Dockerfile:

docker build .
Error: missing build-context "one"
Error: missing build-context "two"
Error: missing build-context "config"
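The rule "only contexts required for the stages that are built" can be sketched in Python as follows. This is an illustration under simplifying assumptions: stages are built in file order up to and including the target (as in the "building up until stage-two" example), FROM lines carry no extra flags, and the --context option is the syntax proposed here, not an existing Dockerfile feature. The function names are hypothetical. Note that under this rule the final stage's "config" context also counts when the whole Dockerfile is built.

```python
import re

# Proposed (not existing) option on COPY/ADD: --context=<name>
CONTEXT_FLAG = re.compile(r"--context=(\S+)")
# Simplified FROM matcher: "FROM <image> [AS <name>]"
FROM_LINE = re.compile(r"^FROM\s+\S+(?:\s+AS\s+(\S+))?", re.IGNORECASE)

DOCKERFILE = """\
FROM baseimage AS stage-one
COPY --context=one /subdir/foo /target/dir

FROM busybox AS stage-two
ADD --context=two /foo.tar.gz /target

FROM scratch AS final
COPY --from=stage-one /foo /bar
COPY --from=stage-two /bar /baz
COPY --context=config /config.ini /config.ini
"""


def required_contexts(dockerfile, target=None):
    """List context names used by the stages up to and including `target`."""
    required = []
    current_stage = None
    for raw_line in dockerfile.splitlines():
        line = raw_line.strip()
        from_match = FROM_LINE.match(line)
        if from_match:
            # A new stage starts; stop if the previous one was the target.
            if target is not None and current_stage == target:
                break
            current_stage = from_match.group(1)
            continue
        if line.upper().startswith(("COPY ", "ADD ")):
            for name in CONTEXT_FLAG.findall(line):
                if name not in required:
                    required.append(name)
    return required


def missing_contexts(dockerfile, provided, target=None):
    """Return required context names that were not passed on the CLI."""
    return [n for n in required_contexts(dockerfile, target) if n not in provided]


if __name__ == "__main__":
    for name in missing_contexts(DOCKERFILE, provided={}, target="stage-two"):
        print('Error: missing build-context "%s"' % name)
```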

The default build context (positional argument) is never optional:

docker build --context one=./one --context two=./two
"docker build" requires exactly 1 argument.
See 'docker build --help'.

Usage:  docker build [OPTIONS] PATH | URL | - [flags]

Build an image from a Dockerfile

Validation: unused build-contexts

Similar to an unused --build-arg, specifying a build-context that is not used produces a warning. When determining which contexts are expected/used in a multi-stage build, only the stages that are built are taken into account.

Given the Dockerfile from the previous example:

docker build --target=stage-one --context one=./one --context two=./two .
Sending build context to Docker daemon  2.048kB
Step 1/4 : FROM baseimage AS stage-one
.....
Removing intermediate container edc9bb0f15dc
 ---> a77418a9ddad
[Warning] One or more contexts [two] were not consumed
Successfully built a77418a9ddad
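A minimal Python sketch of the unused-context warning, assuming the builder already knows which contexts were consumed. The function name is hypothetical, and the space-separated list inside the brackets (for more than one name) is an assumption; the proposal only shows the single-name case.

```python
def unused_context_warning(provided, consumed):
    """Return the warning line for unused contexts, or None if all were used.

    `provided` is the ordered list of names passed with --context,
    `consumed` is the collection of names the built stages actually used.
    """
    unused = [name for name in provided if name not in consumed]
    if not unused:
        return None
    return "[Warning] One or more contexts [%s] were not consumed" % " ".join(unused)


if __name__ == "__main__":
    # Mirrors the example: --target=stage-one only consumes "one".
    print(unused_context_warning(["one", "two"], {"one"}))
```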

Validation: conflicting options

The --context and --from options cannot be combined. Using both will produce an error:

FROM baseimage AS stage-one
RUN echo foo

FROM busybox
COPY --from=stage-one --context=foo /foo /bar

docker build --context foo=./foo .
Sending build context to Docker daemon  2.048kB
Error response from daemon: Dockerfile error line 5: conflicting options '--from' and '--context'

The --context option can be combined with the --chown option.
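The conflict check could look roughly like the Python sketch below. It is an illustration only: the function name is hypothetical, the line-based parsing ignores line continuations, and the error text is copied from the example output above.

```python
def check_conflicting_options(dockerfile):
    """Return one error string per COPY/ADD line that sets both options."""
    errors = []
    for number, raw_line in enumerate(dockerfile.splitlines(), start=1):
        tokens = raw_line.split()
        if not tokens or tokens[0].upper() not in ("COPY", "ADD"):
            continue
        # Collect option names such as "--from" from "--from=stage-one".
        flags = {t.split("=", 1)[0] for t in tokens[1:] if t.startswith("--")}
        if "--from" in flags and "--context" in flags:
            errors.append(
                "Dockerfile error line %d: conflicting options "
                "'--from' and '--context'" % number
            )
    return errors


if __name__ == "__main__":
    example = (
        "FROM baseimage AS stage-one\n"
        "RUN echo foo\n"
        "\n"
        "FROM busybox\n"
        "COPY --from=stage-one --context=foo /foo /bar\n"
    )
    print(check_conflicting_options(example))
```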

Validation: context names

Build context names follow these rules:

  • lowercase alphanumeric characters (a-z, 0-9)
  • punctuation symbols: dashes and underscores
  • must start, and end, with an alphanumeric character
  • consecutive punctuation symbols are not allowed

Specifying an invalid context-name (either on the command-line, or inside a Dockerfile) produces an error.
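The naming rules above fit in a single regular expression. The Python sketch below is one possible reading of them; the function name is hypothetical and the proposal does not prescribe a length limit, so none is enforced.

```python
import re

# One or more lowercase alphanumerics, optionally followed by groups of
# exactly one "-" or "_" and more alphanumerics. This enforces the
# start/end rule and forbids consecutive punctuation.
CONTEXT_NAME = re.compile(r"[a-z0-9]+(?:[-_][a-z0-9]+)*")


def is_valid_context_name(name):
    """Return True if `name` satisfies the proposed context-name rules."""
    return CONTEXT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("common", "common-2", "Common", "-foo", "foo--bar"):
        print(candidate, is_valid_context_name(candidate))
```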

@thaJeztah thaJeztah added area/builder kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny labels May 23, 2018
@thaJeztah
Member Author

ping @tonistiigi @AkihiroSuda @duglin PTAL

@tonistiigi
Member

tonistiigi commented May 23, 2018

COPY --context should be plain COPY --from (with no support for ADD). If you want extra validation then we can prefix the name, eg COPY --from=local://foo or smth like that (expect similar access to http/git in the future). They have same semantics for --from, FROM, RUN --mount as any other stage and ARG can be used to switch between them:

FROM local://foo${myarg} AS mysource

FROM busybox
COPY --from=mysource foo bar

BuildKit already has unlimited context support and exposes 2 separate contexts through Dockerfile: "context", and "dockerfile". This should be implemented in BuildKit's Dockerfile frontend and accessed with the syntax directive.

@neclimdul

.

@AkihiroSuda
Member

Can we also have an interface for submitting multiple build jobs that shares build contexts, without sending the contexts multiple times? (Probably as a separate proposal)

$ cat << EOF | docker build-batch
contexts:
  - name: common
    type: local
    directory: ./common/...
  - name: common2
    ...
  - name: service
    ...
  - name: service2
    ...
jobs:
  - target: service1
    contextRefs:
      - common
      - common2
      - service
    dockerfileContextRef: service
  - target: service2
    contextRefs:
      - common
      - common2
      - service2
    dockerfileContextRef: service2
EOF

@thaJeztah
Member Author

with no support for ADD

Will we have a solution for the "decompress/extract local files" case (e.g. will there be a COPY --extract, or COPY --opt=decompress option)? That was the reason I added it, but I know we basically wanted ADD to be feature-frozen.

expect similar access to http/git in the future

Good call; didn't think of the remote source part of ADD, which would introduce troublesome things like ADD --context=foo http://example.com/foo.tar /

If you want extra validation then we can prefix the name, eg COPY --from=local://foo

I'm a bit on the fence on this.

From a technical perspective, this looks flexible (being able to add new scheme://'s without having to add a new flag, and the option to use FROM scheme:// has a lot of potential). It also removes the need for a "conflicts" check (--from and --context being mutually exclusive).

The suggested local:// originates from "local build context" vs "a remote build context, such as git://", correct? (It looked odd initially, but I guess that name makes sense from that perspective)

Where I'm not sure is;

  • From a UX perspective, (imo) it looks less clean, more complicated (thinking if we need both; a "porcelain" --context, and an advanced --from scheme://, but perhaps that's taking it too far)
  • The --from flag is already ambiguous (it currently accepts both a "stage" and an "image reference"). While this allows for some nifty tricks (e.g. copy from an image by default, but copy from a stage if it exists), it can also lead to "odd" behavior if you don't know about this feature, and can complicate things such as Docker Content Trust (see Make content trust play nicely with multi-stage builds and "FROM $ARG" docker/cli#933, build args in FROM do not work with content trust buildkit#4255)
  • We need a well-defined, well-documented order in which references (--from=<reference>) are resolved.

@tonistiigi any feedback on:

  • the proposed CLI UX?
  • printing warnings for non-consumed build-contexts?

@thaJeztah
Member Author

Probably as a separate proposal

@AkihiroSuda yes, I think we should look at that separately. If we don't want a separate batch file format, I was thinking a while back to allow tagging multiple stages in one go. Could be by allowing multiple --target flags on docker build (e.g. --target stage=stage-1,tag=baseimage:latest --target stage=stage-2,tag=webserver:latest)

@tonistiigi
Member

Will we have a solution for the "decompress/extract local files"?

I think that was covered with making sure metadata (eg. uid/gid) is not changed on COPY.

From a UX perspective, (imo) it looks less clean, more complicated

This is an advanced feature in any case. Remembering that contexts require different flags or can't be used in some places like other stages is even more complicated.

The --from flag is already ambiguous.

That is why I suggested prefix scheme so that it is not ambiguous. Conceptually they are (should be) the same thing. A context behaves the same as a build stage or image.

can complicate things such as Docker Content Trust

That doesn't have anything to do with build stages. It doesn't work with plain ARG as well without any multi-stage. The problem is that we are not doing the trust validation on the correct side/time.

We need a well-defined, well-documented order in which references (--from=) are resolved.

We only need that if we don't use the scheme prefix. If we don't use prefix then the obvious order is build-stage > context > image. The advantage of using a prefix is that we can show more precise error messages, eg. "build requires context foobar that was not set", instead of "can't find build stage foobar".
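To make the two options concrete, here is a Python sketch of the resolution logic described above: build-stage > context > image without a prefix, and a precise error when a prefixed context is missing. It is not BuildKit code; the function name is hypothetical, and the "context://" prefix is only one of the spellings floated in this thread (local://foo, context:foo, context://foo).

```python
CONTEXT_PREFIX = "context://"  # assumption: one of the prefixes suggested here


def resolve_reference(reference, stages, contexts):
    """Classify a --from reference as ("stage" | "context" | "image", name)."""
    if reference.startswith(CONTEXT_PREFIX):
        name = reference[len(CONTEXT_PREFIX):]
        if name not in contexts:
            # The prefix makes the intent explicit, so the error can be too.
            raise LookupError("build requires context %s that was not set" % name)
        return ("context", name)
    # Without a prefix: build-stage > context > image.
    if reference in stages:
        return ("stage", reference)
    if reference in contexts:
        return ("context", reference)
    return ("image", reference)


if __name__ == "__main__":
    stages = {"stage-one"}
    contexts = {"common": "./common/src"}
    for ref in ("stage-one", "common", "busybox", "context://common"):
        print(ref, "->", resolve_reference(ref, stages, contexts))
```

Without the prefix, a mistyped context name silently falls through to an image lookup, which is the ambiguity the prefix avoids.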

the proposed CLI UX?

If we use scheme, the CLI flag should be same as the scheme. So if we use --context foo=bar, then Dockerfile should be COPY --from=context:foo or COPY --from=context://foo.

Another optional thing might be supporting tarballs that currently can be set with -. I don't think it is critical and don't have a good syntax in mind though.

printing warnings for non-consumed build-contexts?

That's actually the only thing not supported by BuildKit atm. It doesn't seem the most important part of this but can be done. It is more important to get readable errors when a context is missing.

@tonistiigi
Member

One other thing to note in association with #12886 (custom .dockerignore support) is that the way contexts are loaded in BuildKit is that the daemon parses .dockerignore and after that decides what files to load (in combination with what ADD/COPY use). So if we could come up with a syntax for associating a dockerignore location with the context in the Dockerfile, it would be possible to implement it. The reason we have been resistant to #12886 is that it encourages broken Dockerfiles, but if these paths were inside the Dockerfile then there wouldn't be a problem.

This is just to give an idea of what is possible. This proposal by itself already solves a lot of the cases in #12886 by allowing subcontexts and a dockerignore per context.

@IMBurbank

Hi, I was wondering if there will be any movement on this proposal.

It would be incredibly useful, but it doesn't currently seem to be conceptually accepted or rejected.

Is this waiting for additional discussion, a working PR, a future release or is it just dead?

@tonistiigi
Member

it doesn't currently seem to be conceptually accepted or rejected

The main update is (as I already mentioned here and other proposals) that 18.06 experimental and 18.09 stable have support for BuildKit backends that allows external implementations for the high-level build features that are loaded as regular images. So anyone can implement this (or include, dockerignore proposals, pretty much any Dockerfile proposal I've seen in this tracker) and propose the implementation externally. Then everybody can use their implementation or the collection of all experimental features provided through official channels. If it proves to be a good solution it will graduate to the stable official channel (moby/buildkit#528), if not then everyone can still continue to use it by pointing to the external version.

So nobody needs any acceptance anymore to get started. Of course, still announce if you are working on it to avoid duplication and code still needs to pass review standards if you want to merge into the official branch. For the maintainers to implement it, they are all very busy, this is somewhere in the backlog and no specific timeline/priority is assigned. But as I explained, there is now a very clear way how the community can step up and influence the priorities for new features.

This one also needs a CLI flag so initially, the testing would probably be with BuildKit's own CLI buildctl that is much more generic and doesn't require changes.

@IMBurbank

Great, thanks for the update!

@thaJeztah
Member Author

thaJeztah commented Oct 23, 2018

To add ignore-file support (per named context), perhaps we should use a csv notation to group all options together;

# named build-context ("mycontext"), including a custom `.dockerignore` file;
# the main/default build-context (.) uses `./.dockerignore` (if present)
docker build \
  --context src=/some/path,name=mycontext,ignore=/some/ignore-file \
  .
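Parsing such a csv-style value could look like the Python sketch below. The keys src, name and ignore are taken from the example above; the function name, and treating src as the only mandatory key, are assumptions made for illustration.

```python
import csv

ALLOWED_KEYS = ("src", "name", "ignore")


def parse_context_options(value):
    """Parse "src=...,name=...,ignore=..." into a dict of options."""
    # csv.reader handles quoted fields, e.g. paths containing commas.
    fields = next(csv.reader([value]))
    options = {}
    for field in fields:
        key, sep, val = field.partition("=")
        if not sep or key not in ALLOWED_KEYS:
            raise ValueError("invalid context option: %r" % field)
        options[key] = val
    if "src" not in options:
        raise ValueError("context option 'src' is required")  # assumption
    return options


if __name__ == "__main__":
    print(parse_context_options(
        "src=/some/path,name=mycontext,ignore=/some/ignore-file"
    ))
```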

To be discussed; if a custom context (e.g. --context src=/some/path) and /some/path has a .dockerignore file; should that dockerignore file be used, or ignored?

If we use it, how can one override this ignore-file (i.e. use that context, but don't use the .dockerignore file)?

@tonistiigi
Member

@thaJeztah Why do you want to complicate this proposal with this extra dockerignore flags logic? It seems irrelevant to the main proposal. If .dockerignore is defined per context it should already solve almost all the cases. To keep the Dockerfile portability guarantees and avoid the mess where a Dockerfile can only be built on a specific host environment, the dockerignore either needs to be in a fixed relation to a context (clear solution for this proposal) or Dockerfile (that I suggested in some other proposal) or the dockerignore needs to be defined in the Dockerfile itself.

@thaJeztah
Member Author

@tonistiigi hm. actually had my wires crossed there; I thought that .dockerignore and "context" are decoupled from each other, but it's only the Dockerfile that's separate.

Yes, so in that case, there should be no need to manually specify the .dockerignore for each context; if a .dockerignore is present at the root of a context, that .dockerignore should be used.
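A small Python sketch of that rule: each context looks for a .dockerignore at its own root and nothing else. The function names are hypothetical, and the parsing is deliberately minimal (blank lines and "#" comments are skipped; pattern matching itself is out of scope).

```python
import os


def parse_ignore(text):
    """Return the non-empty, non-comment lines of a .dockerignore file."""
    patterns = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def load_context_ignore(context_dir):
    """Load ignore patterns from <context_dir>/.dockerignore, if present."""
    path = os.path.join(context_dir, ".dockerignore")
    if not os.path.isfile(path):
        return []  # no ignore-file at the context root: nothing is excluded
    with open(path) as handle:
        return parse_ignore(handle.read())


if __name__ == "__main__":
    # Each named context resolves its own ignore-file independently.
    for context in ("./service1", "./common/src", "./common-2/src"):
        print(context, load_context_ignore(context))
```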

@joeyhub

joeyhub commented Jun 14, 2019

COPY . /build/service1/src/

Rather than named contexts would it not be simpler to have src:dest?

dest could still work like name but would be read in the file system...

IE... --context a --context b would be a shorthand for --context a:a --context b:b and would allow you to ADD a/file and ADD b/file? Might be nice to also have --ignore a/other --ignore b/other with some thought for allow/deny or deny/allow. Many people will have arbitrary existing file structures they want to cherry pick from for performance.

If the dockerfile does a two pass though, can't it determine what's actually needed from the context itself? This seems to be a limitation with how the build command communicates with the daemon. It should be possible otherwise to first interpolate all the variables for add/copy to static paths then only send those?

This is the real problem a lot of people face. For a lot of people just a detached/additional ignore files would do a lot of good (having a single ignore file in one place makes it problematic for parallel builds).

@thaJeztah
Member Author

If the dockerfile does a two pass though, can't it determine what's actually needed from the context itself? This seems to be a limitation with how the build command communicates with the daemon. It should be possible otherwise to first interpolate all the variables for add/copy to static paths then only send those?

Try building with DOCKER_BUILDKIT=1; when building with buildkit enabled (the next generation builder), docker uses an interactive session, and only sends what's needed

@joeyhub

joeyhub commented Jun 14, 2019

Ah nice, I guess that's sending things on demand. It's a bit of a problem with linux that --help commands and man pages often neglect ENV options. Would be nice to set a trend against that with docker. I assume with interactive you mean basically the daemon requests resources from the client as needed. Though it might also be nice if the system had the capacity to know if it's local and can access things directly.

@thaJeztah
Member Author

Yes, the man pages could use some TLC (docker/cli#923)

I assume with interactive you mean basically the daemon requests resources from the client as needed.

Correct; see #32677 for the first implementation

Though it might also be nice if the system had the capacity to know if it's local and can access things directly.

Builds are always "remote" when using docker build (due to docker using a client/daemon design). This is also by design; the build will always use a copy of the files used, so that it builds in isolation, with no direct access to files on your host.

@julichan

Any news on this issue ?
I'd like to add a +1 for this.

@asolopovas

+1

@AlonMiz

AlonMiz commented Jun 25, 2020

+1

@JohannSig

+1

@DenisKudelin

+1

@demonti

demonti commented Jul 16, 2021

This sounds like a nice solution for a long term problem. It would be nice to have a sign from the developers whether this solution (or any other equivalent) is still considered. Thanks.

@neersighted
Member

This is implemented now in the BuildKit builder, and can be used in docker build (23+)/docker buildx build, Compose, and docker buildx bake.
