COPY with excluded files is not possible #15771
See https://docs.docker.com/reference/builder/#dockerignore-file |
.dockerignore does not solve this issue. As I wrote, "the other part is subject to another COPY". |
So you want to conditionally copy based on some other copy? |
The context contains a lot of directories A1...A10 and a directory B. A1...A10 have one destination, B has another:
And this is awkward. |
What part of it is awkward? Listing them all individually?
COPY A* /some/where/
COPY B /some/where/else/
Does this work? |
The names A1..A10, B were fake. Besides, there are a couple of options, I admit, but I think that all of them are awkward. I mentioned three in my original posting. A fourth option is to rearrange my source code permanently so that A1..A10 are moved into a new directory A. I was hoping that this was not necessary because an additional nesting level is not something to wish for, and my current tools would then need to special-case my dockerised projects. (BTW, #6094 (following symlinks) would help in this case. But apparently, this is not an option either.) |
With cp behaviour, I could ameliorate the situation by saying
It's still a mild maintenance problem because I would have to think of that line if I added an "A11" directory. But that would be acceptable. Besides, cp does not need excludes, because copying everything and removing the unwanted parts has almost no performance impact beyond the copying itself. With docker's COPY, it means wrongly invalidated cache every time B is changed, and bigger images. |
@bronger you can do:
just like you were suggesting. As for doing a |
But
copies the contents of the directories a b c d together, instead of creating the directories /some/where/{a,b,c,d}. It works like rsync with a slash appended to the src directory. Therefore, the four instructions
are needed. As for the cache ... if I say
then the cache is not used if e changes, although e is not effectively included into the operation. |
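For comparison, here is a minimal sketch of the cp behaviour referred to earlier in the thread. The directory names a, b, c, d, e and the scratch directory are hypothetical and only serve to show that cp keeps each source directory as its own subdirectory of the destination.

```shell
#!/bin/sh
# Hypothetical layout: a..d should go to one place, e somewhere else.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir -p a b c d e dest
for d in a b c d e; do echo "$d" > "$d/file"; done

# cp keeps every source directory as a subdirectory of dest ...
cp -r a b c d dest/
ls dest   # lists a, b, c and d

# ... whereas, as described above, COPY with several source directories
# merges their contents, so the four "file" entries would collide.
```

This is why one COPY instruction per directory is needed when the directory names have to survive.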
@bronger yep, sadly you're correct. I guess we could add a |
Fair enough. Then I will use a COPY+rm for the time being and add a FixMe comment. Thank you for your time! |
Just to 👍 this issue. I regularly regret that COPY doesn't mirror rsync's trailing slash semantics. It means you can't COPY multiple directories in a single statement, leading to layer proliferation. I regularly encounter a case where I want to copy many directories except for one (which will be copied later, because I want it to have different layer-invalidation effects), so Also, from
I guess it can't be changed now without breaking a lot of wild |
As a concrete example, let's say I have a directory looking like this:
I want something that looks like:
So that part1-N doesn't invalidate building of I have previously worked around this by putting part1-N in their own directory, so:
But I have also encountered this problem in projects that I am not at liberty to rearrange quite so easily. |
@Praller good example, we're facing the exact same issue. The main problem is that Go's filepath.Match doesn't allow much creativity compared to regular expressions (i.e. no anti pattern) |
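Since the glob syntax has no negation, one stopgap is to compute the include list outside the Dockerfile. The following is only a sketch: the directory names part1..part3 and excluded are hypothetical, and -mindepth/-maxdepth are assumed to be available (they are in GNU and BSD find).

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir -p part1 part2 part3 excluded

# List every top-level entry except "excluded"; find's "!" operator
# supplies the negation that the COPY glob syntax lacks.
includes=$(find . -mindepth 1 -maxdepth 1 ! -name excluded | sort)
printf '%s\n' "$includes"
```

Such a list could then feed a generated Dockerfile or a tar invocation, at the cost of an extra step before the build.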
I just came up with a somewhat crack-brained workaround for this. COPY can't exclude directories, but ADD can expand tgz. It's one extra build step: Then in your Dockerfile: That gives the full syntax of tar for including/excluding/whatever without gobs of wasted layers trying to include/exclude. |
@jason-kane This is a nice trick, thanks for sharing. One small point: it looks like you can't add the |
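The commands from the tar workaround did not survive in this copy of the thread, so the following is not the original, only a sketch of the general idea. It assumes a hypothetical context directory containing include_me and exclude_me, and packs everything except exclude_me into pkg.tgz.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/ctx/include_me" "$work/ctx/exclude_me"
echo keep > "$work/ctx/include_me/foo"
echo drop > "$work/ctx/exclude_me/four"

# Pack the context without exclude_me. The archive is written outside the
# directory being archived so that tar never reads its own output.
tar -czf "$work/pkg.tgz" --exclude='./exclude_me' -C "$work/ctx" .

# Inspect what a later "ADD pkg.tgz /app/" would unpack.
listing=$(tar -tzf "$work/pkg.tgz")
printf '%s\n' "$listing"
```

For a real build the archive would have to sit inside the build context, and the Dockerfile would pick it up with an ADD instruction, which unpacks local tar archives.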
+1 for this issue, I think it could be supported in the same way a lot of glob libraries support it: Here's a proposal to copy everything except COPY . /app -node_modules/ |
I came across the same problem as well, and it's kind of painful for me when my Java webapp is about 900MB but almost 80% of that rarely changes. |
👍 |
I have the same problem although with |
Exact same issue here. I want to copy a git repo and exclude the .git directory. |
@oaxlin you could use the .dockerignore file for that. |
@antoineco are you sure that will work? It's been a while since I tried but I'm pretty sure |
@kkozmic-seek absolutely sure :) But the
|
Would really like this as well - to speed up build I could copy some folder in earlier parts of the build and then cache would help me out ... |
I'm not sure I understand what the use case is but wouldn't just touching the files to exclude before COPY solve the problem?
RUN touch /app/node_modules
COPY . /app
RUN rm /app/node_modules
AFAIK |
I don't like the suggestions to have to repeat everything inside the Looking at #33923, I don't think it's coincidental that what you want to exclude from the build context is exactly the same stuff you want to be excluded from COPY --use-dockerignore <source> <target> Or perhaps even something like this: COPY --use-ignorefile=".gitignore" <source> <target> Seeing how |
@asbjornu .gitignore and .dockerignore are not the same things at all. Especially for multistage builds where artifacts are generated on a build stage and not present in git at all, nevertheless should be included in the resulting image. |
I often want to copy outside of "docker build". In these cases, .dockerignore does nothing. We need an amendment to "docker cp"; it's the only sensible solution. |
It's been 5 years since this issue was opened. In September 2020, I still want this. A lot of people have suggested hacks to work around it, but almost all of them and others have requested |
If you want something, you need to work on it or find someone to work on it. |
First we need to know whether upstream wants this. |
After reviewing the source code, I think we should first extend the copy function here https://github.com/tonistiigi/fsutil/blob/master/copy/copy.go. After that, we can extend backend.go in libsolver, and only then will it be possible to extend the AST and frontend of buildkit. UPDATE: yes, after extending copy.go everything will be close to moby/buildkit#1492 plus parsing the list of excludes. |
Here #33923 (comment) I describe my workaround that uses any .dockerignore in the project. |
I just wanted to leave a comment here to say that any suggestion which involves first doing The problem is that the moment that ignored file is modified, the next build becomes invalidated and has to discard its cache, which makes the build take longer. See the note at COPY:
So really any solution that will work has to strive to preserve the build cache, otherwise it's not worth it |
Looks like progress has been made in moby/buildkit#2082 but that selective COPY still isn't available in Docker. Looking forward to this feature. I have a situation now where I want to copy a large directory of data assets into an image in a step before copying in the rest of the project's assets. The data directory rarely changes, so I want to avoid copying it in and creating another large image layer every time a change happens in a small text file outside of that large directory. Currently this doesn't seem possible unless I exhaustively specify every asset outside of the data directory in the COPY directive or move all non-data assets into a subdirectory in the project. |
TLDR for people looking for a solution: I've read this whole thread and the most viable solution I've seen is this one where he tars the entire directory, then in the Dockerfile use the ADD instruction to get those files. Not ideal, but the best we probably have. #15771 (comment).
I would suggest this as a better solution: DON'T use the Instead, pipe your list of files to
ls -A -I <file-or-dir-to-ignore> -I node_modules -I '.git*' | xargs tar --mtime='1970-01-01' -zcf pkg.tgz
docker build ......
Here I used the In case it's unclear to someone, this is helpful for multi stage builds, if you don't have a multi stage build just ignore files in
On topic now: Hoping 2022 is finally the year this gets implemented. It's funny to think that this problem originates because docker uses Golang's Match which was written with time guarantees as a central aspect. Apparently, it's very hard to implement a regex with both time guarantees and negative lookaheads, otherwise something like |
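The explanation of the --mtime option is truncated in the comment above, so the following is only a sketch of what pinning the timestamp does on its own. It assumes GNU tar (--mtime and --format=gnu), uses an uncompressed archive to keep compression headers out of the comparison, and works on a hypothetical scratch directory.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/ctx"
echo data > "$work/ctx/file"

# First archive, with every member's timestamp pinned (GNU tar option).
tar --format=gnu --mtime='1970-01-01' -cf "$work/one.tar" -C "$work/ctx" .

# Change only the file's modification time, not its contents.
touch -t 202001010000 "$work/ctx/file"
tar --format=gnu --mtime='1970-01-01' -cf "$work/two.tar" -C "$work/ctx" .

# Compare the two archives byte for byte.
if cmp -s "$work/one.tar" "$work/two.tar"; then same=yes; else same=no; fi
echo "identical: $same"
```

With pinned timestamps, a file that was merely touched no longer changes the archive bytes. Whether that is the motivation in the comment above is an assumption.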
There is another workaround using multistage builds that also plays nice with the cache. You would need to add a new
# The scratchpad where we curate the files before the actual build:
FROM alpine AS scratchpad
WORKDIR /files
COPY maindir maindir
RUN rm -rf maindir/ex1 maindir/ex2 maindir/ex3...
# Your original Dockerfile goes here:
FROM ...
...
COPY --from=scratchpad /files/maindir maindir
RUN <stuff that depends on maindir...>
If you make a change to any of the excluded files it will only invalidate the cache for the |
I can't believe @matthewmueller proposal hasn't been implemented after all these years, wow. |
Can we just get one simple |
This is very annoying since now we can use |
This is something that needs to be implemented in BuildKit, and there's tracking issues that are still open;
That said, using a Here's a quick example; Create a "project" with some directories and files, some of which to be excluded
mkdir cpexclude && cd cpexclude
mkdir -p exclude_me include_me/dir
touch one two three exclude_me/four exclude_me/five exclude_me/six include_me/foo include_me/dir/bar
build Dockerfile, using a mount for the build-context, and use rsync instead of COPY (I'm using )
# syntax=docker/dockerfile:1
FROM alpine
WORKDIR /app
RUN apk add --no-cache rsync tree
RUN --mount=type=bind,target=/temp/src \
    rsync -ar --progress /temp/src/ /app/ --exclude exclude_me
Verify that the expected files are included, and the
docker run --rm foo tree /app
/app
├── include_me
│   ├── dir
│   │   └── bar
│   └── foo
├── one
├── three
└── two
But can be bind-mounted at runtime (if needed);
docker run --rm --mount type=bind,src=$(pwd)/exclude_me,dst=/app/exclude_me foo tree /app
/app
├── exclude_me
│   ├── five
│   ├── four
│   └── six
├── include_me
│   ├── dir
│   │   └── bar
│   └── foo
├── one
├── three
└── two |
Thanks @thaJeztah |
It's not an acceptable solution: changing a command causes a cache miss and fresh layers are built, which is a waste. |
It was an example for an alternative / workaround. As mentioned at the start of that comment, ultimately, this is something that requires changes in BuildKit, and there's tracking issues for that;
I'm locking the conversation on this ticket, as there's nothing actionable in this repository until this is supported in BuildKit. Once supported by BuildKit, and the BuildKit build-time dependency is updated in this repository, this ticket can be resolved. |
I need to COPY a part of a context directory to the container (the other part is subject to another COPY). Unfortunately, the current possibilities for this are suboptimal: