Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COPY vs RUN --mount type=bind #2893

Closed
DYefimov opened this issue Jun 3, 2022 · 6 comments
Closed

COPY vs RUN --mount type=bind #2893

DYefimov opened this issue Jun 3, 2022 · 6 comments

Comments

@DYefimov
Copy link
Contributor

DYefimov commented Jun 3, 2022

This is an explorative issue for now.
With the main goal to shed some light on the origins of the docker COPY working differently from the linux cp utility.
And possibly a fix for that in the future.

[UPDATE]: RFC for this issue: #2917

The problem

Given a microservice architecture, I'm interested in reusing dockerfile target for every service I have in my application.

Consider the following setup:

  1. ./services folder on the host with many .py files - one per service + several common (in other words, per target image)
  2. dockerfile with two stages - build_env, and service.template
  3. compose file that reuses the same service.template target for every service while feeding it with $SOURCES build-arg (e.g. "./services/common ./services/service1.*"

If you use COPY $SOURCES /app in the dockerfile, everything works just fine for individual files, but fails when you try to feed a directory along that space separated list of sources.
Digging though buildkit sources it appears that there is a flag CopyDirContentsOnly: true that expands the contents of any directory in my sources list, or just stripping the top-most directory. And that is not configurable.

Linux cp utility does not do that. And frankly that's a mystery why would one implement a generic copy some other way rather than using something that proved to be stable for tenths of years. While exploring the issue, I found several mentions on that, like "backward compability"... the usual staff. Can you, please, elaborate on the reasoning behind this implementation.

The workaround

While looking for a solution for the issue, I was exploring standard methods, and there are a couple.

RUN --mount=type=bind,source=services,target=/src mkdir -p /app && tar cf - $SOURCES | tar xf - --strip-components=1 -C /app
was looking really promising at first, but it cache-misses for every service on any file update in the ./services directory.

There are some other ways to achieve the desired functionality.
For example, outsourcing the bundling process for a service to some other entity outside of the dockerfile in question. Like pull from git during the build process (side note, personally I'm strongly against tightly-coupling layers of complexity like this).
Or, maybe docker in docker with some tricks inside...
Or, maybe docker in docker building buildkit from sources and applying a custom patch?

You see where it goes - wouldn't the simple COPY working the way it is "supposed to be" be much nicer?

The experiment

I've run some experiments and it seems trivial to add a config option for COPY without introducing a breaking change:

"Not really a patch" here

Just the barebone working example:

diff --git a/frontend/dockerfile/dockerfile2llb/convert.go b/frontend/dockerfile/dockerfile2llb/convert.go
index 2dddd4ea..865fc31a 100644
--- a/frontend/dockerfile/dockerfile2llb/convert.go
+++ b/frontend/dockerfile/dockerfile2llb/convert.go
@@ -647,6 +647,7 @@ func dispatch(d *dispatchState, cmd command, opt dispatchOpt) error {
                       chown:        c.Chown,
                       chmod:        c.Chmod,
                       link:         c.Link,
+                       linux_like:   false,
                       location:     c.Location(),
                       opt:          opt,
               })
@@ -692,6 +693,7 @@ func dispatch(d *dispatchState, cmd command, opt dispatchOpt) error {
                       chown:        c.Chown,
                       chmod:        c.Chmod,
                       link:         c.Link,
+                       linux_like:   c.LinuxLike,
                       location:     c.Location(),
                       opt:          opt,
               })
@@ -1035,7 +1037,7 @@ func dispatchCopy(d *dispatchState, cfg copyConfig) error {
                       opts := append([]llb.CopyOption{&llb.CopyInfo{
                               Mode:                mode,
                               FollowSymlinks:      true,
-                               CopyDirContentsOnly: true,
+                               CopyDirContentsOnly: cfg.linux_like == false,
                               AttemptUnpack:       cfg.isAddCommand,
                               CreateDestPath:      true,
                               AllowWildcard:       true,
@@ -1124,6 +1126,7 @@ type copyConfig struct {
       chown        string
       chmod        string
       link         bool
+       linux_like   bool
       location     []parser.Range
       opt          dispatchOpt
}
diff --git a/frontend/dockerfile/instructions/commands.go b/frontend/dockerfile/instructions/commands.go
index 48ebf183..5fb28b1d 100644
--- a/frontend/dockerfile/instructions/commands.go
+++ b/frontend/dockerfile/instructions/commands.go
@@ -251,6 +251,7 @@ type CopyCommand struct {
       Chown string
       Chmod string
       Link  bool
+       LinuxLike bool
}

// Expand variables
diff --git a/frontend/dockerfile/instructions/parse.go b/frontend/dockerfile/instructions/parse.go
index a04e9b9d..16e378bf 100644
--- a/frontend/dockerfile/instructions/parse.go
+++ b/frontend/dockerfile/instructions/parse.go
@@ -307,6 +307,7 @@ func parseCopy(req parseRequest) (*CopyCommand, error) {
       flFrom := req.flags.AddString("from", "")
       flChmod := req.flags.AddString("chmod", "")
       flLink := req.flags.AddBool("link", false)
+       flLinuxLike := req.flags.AddBool("linux-like-copy", false)
       if err := req.flags.Parse(); err != nil {
               return nil, err
       }
@@ -323,6 +324,7 @@ func parseCopy(req parseRequest) (*CopyCommand, error) {
               Chown:           flChown.Value,
               Chmod:           flChmod.Value,
               Link:            flLink.Value == "true",
+               LinuxLike:       flLinuxLike.Value == "true",
       }, nil
}

[Update] Small followup

For the reference, the issue #15858 is dating back to 2015 and still out there.
Didn't find any followup or any good proposal, rather then introducing a new CP command (well first ADD then COPY and now CP - bet on the next one)
Maybe COPY --linux-like-copy ... or COPY --dont_expand_dirs or something along those lines would be a good start?

And btw, at first I thought RUN mount=type=bind should work the way proposed in #2821, but apparently it is not that smart.

@tonistiigi
Copy link
Member

While exploring the issue, I found several mentions on that, like "backward compability"... the usual staff. Can you, please, elaborate on the reasoning behind this implementation.

No idea. This was the initial implementation somewhere in 2013-2014. Best guess would be that someone found that the UNIX rules were too confusing to follow. Eg. the result of the cp depends on if the source path ends with / or . and if the destination directory exists or not. We are stuck with it as changing it would break existing Dockerfiles.

@DYefimov
Copy link
Contributor Author

DYefimov commented Jun 3, 2022

But then why the CopyDirContentsOnly: true flag is hardcoded out there? Setting it to false changes the behavior the desired way at first glance as proved by the experiment.

Why do you think @tonistiigi it would break existing Dockerfiles?
We just add a flag to the upcoming dockerfile frontend: COPY --behave_the_right_way, defaulted to false - meaning we do not introduce any breaking change, until you upgrade your #syntax= and intentionally set that flag to true.

In the meantime, let me dive deeper into the code, explore how CopyDirContentsOnly is implemented inside, compare it to the cp and look for corner cases... Can't spot any obvious contradiction "why not" at this point.

@tonistiigi
Copy link
Member

Yes, a new flag with opt-in is an option. But we can't change the default behavior to match cp.

@DYefimov
Copy link
Contributor Author

Stumbled into another more annoying (to fix) issue.

# syntax=dyefimov/dockerfile:2893COPY-preserve-top-dir-labs

ARG SOURCES=""

# this works just fine:
COPY --preserve-top-dir=true ./services/common ./services/api.* ./

# this fails:
COPY --preserve-top-dir=true $SOURCES ./

Distilled, and w/o any homebrewed changes:

# syntax=docker/dockerfile:1

ARG TEST_ARG="./services/common ./services/api.*"

# will fail:
COPY $TEST_ARG ./

...is bailing out with:

> [template.service 2/3] COPY ./services/common ./services/api.* ./:
------
lstat /var/lib/docker/tmp/buildkit-mount247673522/services/common ./services: no such file or directory

It happens right here. Lexer responsible for variable expansion is used after the instruction expansion, where SourcesAndDest is populated. As a result, we end up with only one source ("./services/common ./services/api.*") instead of two.

Guess that's a separate issue. But fixing just one of them makes no sense, and ruins the whole idea of being able to pass a list though a variable in this particular case. As the main idea here is to achieve more flexibility for templating dockerfiles in an Infrastructure-as-a-Code world.

Can't come up with a better idea on how to "not to break all existing dockerfiles" for this one, but add another opt-in flag.

Wait, what exactly are we trying to achieve here with that new flag? lets' try to name it:
--double-expand-vars - but we are not expanding vars the second time here
--sources-is-a-var - possible but very confusing

Can't come up with a clear name and implementation here. All of them seem dangerous or awkward.

A possible place for a fix is here and here

I'm confused. Should we make a new issue and PR? or that's already too much? or else?

And how to fix it? it looks like Sources split process rn is at the wrong place.

@DYefimov
Copy link
Contributor Author

DYefimov commented Jun 17, 2022

Digging through the SD Bible aka POSIX, it appears that parameter expansion happens before field splitting in POSIX compliant systems. Buildkit and buildah for that matter have an opposite order of operations. Thus limiting scripting capabilities of ADD/COPY commands.

@DYefimov
Copy link
Contributor Author

RFC for this issue: #2917

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants