New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
COPY vs RUN --mount type=bind #2893
Comments
No idea. This was the initial implementation somewhere in 2013-2014. Best guess would be that someone found that the UNIX rules were too confusing to follow. Eg. the result of the cp depends on if the source path ends with |
But then why the CopyDirContentsOnly: true flag is hardcoded out there? Setting it to false changes the behavior the desired way at first glance as proved by the experiment. Why do you think @tonistiigi it would break existing Dockerfiles? In the meantime, let me dive deeper into the code, explore how CopyDirContentsOnly is implemented inside, compare it to the cp and look for corner cases... Can't spot any obvious contradiction "why not" at this point. |
Yes, a new flag with opt-in is an option. But we can't change the default behavior to match |
Stumbled into another more annoying (to fix) issue. # syntax=dyefimov/dockerfile:2893COPY-preserve-top-dir-labs
ARG SOURCES=""
# this works just fine:
COPY --preserve-top-dir=true ./services/common ./services/api.* ./
# this fails:
COPY --preserve-top-dir=true $SOURCES ./ Distilled, and w/o any homebrewed changes: # syntax=docker/dockerfile:1
ARG TEST_ARG="./services/common ./services/api.*"
# will fail:
COPY $TEST_ARG ./ ...is bailing out with: > [template.service 2/3] COPY ./services/common ./services/api.* ./:
------
lstat /var/lib/docker/tmp/buildkit-mount247673522/services/common ./services: no such file or directory It happens right here. Lexer responsible for variable expansion is used after the instruction expansion, where SourcesAndDest is populated. As a result, we end up with only one source ("./services/common ./services/api.*") instead of two. Guess that's a separate issue. But fixing just one of them makes no sense, and ruins the whole idea of being able to pass a list though a variable in this particular case. As the main idea here is to achieve more flexibility for templating dockerfiles in an Infrastructure-as-a-Code world. Can't come up with a better idea on how to "not to break all existing dockerfiles" for this one, but add another opt-in flag. Wait, what exactly are we trying to achieve here with that new flag? lets' try to name it: Can't come up with a clear name and implementation here. All of them seem dangerous or awkward. A possible place for a fix is here and here I'm confused. Should we make a new issue and PR? or that's already too much? or else? And how to fix it? it looks like Sources split process rn is at the wrong place. |
Digging through the SD Bible aka POSIX, it appears that parameter expansion happens before field splitting in POSIX compliant systems. Buildkit and buildah for that matter have an opposite order of operations. Thus limiting scripting capabilities of ADD/COPY commands. |
RFC for this issue: #2917 |
This is an explorative issue for now.With the main goal to shed some light on the origins of the docker COPY working differently from the linux cp utility.
And possibly a fix for that in the future.
[UPDATE]: RFC for this issue: #2917
The problem
Given a microservice architecture, I'm interested in reusing dockerfile target for every service I have in my application.
Consider the following setup:
If you use
COPY $SOURCES /app
in the dockerfile, everything works just fine for individual files, but fails when you try to feed a directory along that space separated list of sources.Digging though buildkit sources it appears that there is a flag
CopyDirContentsOnly: true
that expands the contents of any directory in my sources list, or just stripping the top-most directory. And that is not configurable.Linux cp utility does not do that. And frankly that's a mystery why would one implement a generic copy some other way rather than using something that proved to be stable for tenths of years. While exploring the issue, I found several mentions on that, like "backward compability"... the usual staff. Can you, please, elaborate on the reasoning behind this implementation.
The workaround
While looking for a solution for the issue, I was exploring standard methods, and there are a couple.
RUN --mount=type=bind,source=services,target=/src mkdir -p /app && tar cf - $SOURCES | tar xf - --strip-components=1 -C /app
was looking really promising at first, but it cache-misses for every service on any file update in the ./services directory.
There are some other ways to achieve the desired functionality.
For example, outsourcing the bundling process for a service to some other entity outside of the dockerfile in question. Like pull from git during the build process (side note, personally I'm strongly against tightly-coupling layers of complexity like this).
Or, maybe docker in docker with some tricks inside...
Or, maybe docker in docker building buildkit from sources and applying a custom patch?
You see where it goes - wouldn't the simple COPY working the way it is "supposed to be" be much nicer?
The experiment
I've run some experiments and it seems trivial to add a config option for COPY without introducing a breaking change:
"Not really a patch" here
Just the barebone working example:
[Update] Small followup
For the reference, the issue #15858 is dating back to 2015 and still out there.
Didn't find any followup or any good proposal, rather then introducing a new
CP
command (well firstADD
thenCOPY
and nowCP
- bet on the next one)Maybe
COPY --linux-like-copy ...
orCOPY --dont_expand_dirs
or something along those lines would be a good start?And btw, at first I thought RUN mount=type=bind should work the way proposed in #2821, but apparently it is not that smart.
The text was updated successfully, but these errors were encountered: