Dockerfile COPY with file globs will copy files from subdirectories to the destination directory #15858

Closed
jfchevrette opened this issue Aug 26, 2015 · 77 comments

@jfchevrette

Description of problem:
When using COPY in a Dockerfile and using globs to copy files & folders, docker will (sometimes?) also copy files from subfolders to the destination folder.

$ docker version
Client:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 19:47:52 UTC 2015
 OS/Arch:      darwin/amd64

Server:
 Version:      1.8.0
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   0d03096
 Built:        Tue Aug 11 17:17:40 UTC 2015
 OS/Arch:      linux/amd64

$ docker info
Containers: 26
Images: 152
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 204
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.0.9-boot2docker
Operating System: Boot2Docker 1.8.0 (TCL 6.3); master : 7f12e95 - Tue Aug 11 17:55:16 UTC 2015
CPUs: 4
Total Memory: 3.858 GiB
Name: dev
ID: 7EON:IEHP:Z5QW:KG4Z:PG5J:DV4W:77S4:MJPX:2C5P:Z5UY:O22A:SYNK
Debug mode (server): true
File Descriptors: 42
Goroutines: 95
System Time: 2015-08-26T17:17:34.772268259Z
EventsListeners: 1
Init SHA1:
Init Path: /usr/local/bin/docker
Docker Root Dir: /mnt/sda1/var/lib/docker
Username: jfchevrette
Registry: https://index.docker.io/v1/
Labels:
 provider=vmwarefusion

$ uname -a
Darwin cerberus.local 14.5.0 Darwin Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64

Environment details:
Local setup on OS X w/ boot2docker, built with docker-machine

How to Reproduce:

Context

$ tree
.
├── Dockerfile
└── files
    ├── dir
    │   ├── dirfile1
    │   ├── dirfile2
    │   └── dirfile3
    ├── file1
    ├── file2
    └── file3

Dockerfile

FROM busybox

RUN mkdir /test
COPY files/* /test/

Actual Results

$ docker run -it copybug ls -1 /test/
dirfile1
dirfile2
dirfile3
file1
file2
file3

Expected Results
The resulting image should have the same directory structure as the context.

@jfchevrette
Author

Updated original message with output from docker info and uname -a and reformatted it to be according to the issue reporting template.

@jrabbit
Contributor

jrabbit commented Sep 1, 2015

I've had this on 1.6.2 and 1.8
https://gist.github.com/jrabbit/e4f864ca1664ec0dd288 second level directories are treated as first level ones should be for some reason?

for those googling: if you're having issues with COPY * /src try COPY / /src

@duglin
Contributor

duglin commented Sep 1, 2015

@jfchevrette I think I know why this is happening.
You have COPY files/* /test/ which expands to COPY files/dir files/file1 files/file2 files/file3 /test/. If you split this up into individual COPY commands (e.g. COPY files/dir /test/) you'll see that (for better or worse) COPY will copy the contents of each arg dir into the destination dir. Not the arg dir itself, but the contents. If you added a third level of dirs I bet those will stick around.

I'm not thrilled with the fact that COPY doesn't preserve the top-level dir, but it's been that way for a while now.

You can try to make this less painful by copying one level higher in the src tree, if possible.
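
The behaviour described in this comment can be approximated with a few lines of shell. This is only a rough emulation of the semantics as explained here, not Docker's actual implementation; the function name copy_like_docker and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Rough emulation of the COPY semantics described above (not Docker's real code).
# copy_like_docker is a hypothetical helper: first argument is the destination,
# the remaining arguments are the already-expanded sources.
copy_like_docker() {
    dest=$1
    shift
    mkdir -p "$dest"
    for src in "$@"; do
        if [ -d "$src" ]; then
            # A directory argument contributes its contents, not the directory itself
            cp -R "$src"/. "$dest"/
        else
            cp "$src" "$dest"/
        fi
    done
}

# Rebuild the context from the original report in a scratch directory
demo=$(mktemp -d)
mkdir -p "$demo/files/dir"
touch "$demo/files/file1" "$demo/files/file2" "$demo/files/file3"
touch "$demo/files/dir/dirfile1" "$demo/files/dir/dirfile2" "$demo/files/dir/dirfile3"

# The shell expands files/* to: files/dir files/file1 files/file2 files/file3
copy_like_docker "$demo/test" "$demo"/files/*
ls -1 "$demo/test"
```

The listing shows the six files side by side with no dir entry, which is the same shape as the "Actual Results" in the report.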

@jfchevrette
Author

I'm pretty confident that @duglin is right, and it could be very risky to change that behavior. Many Dockerfiles may break or simply copy unintended stuff.

However I'd argue that for the long run it would be better if COPY was following the way tools such as cp or rsync handle globs & trailing slashes on folders. It's definitely not expected for COPY to copy files from a subfolder matching dir/* into the destination IMO

@duglin
Contributor

duglin commented Sep 1, 2015

@jfchevrette yep - first chance we get we should "fix" this.
Closing it for now...

@tugberkugurlu

@duglin so, closing means it will not get fixed?

@duglin
Contributor

duglin commented Feb 27, 2016

@tugberkugurlu yup, at least for now. There's work underway to redo the entire build infrastructure and when we do that is when we can make COPY (or its new equivalent) act the way it should.

@tugberkugurlu

@duglin thanks. Is it possible to keep this issue open and update the status here? Or is there any other issue for this that I can subscribe to?

@duglin
Contributor

duglin commented Feb 27, 2016

@tugberkugurlu I thought we had an issue for "client-side builder support" but I can't seem to find it. So all we may have is what the ROADMAP ( https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax ) says.

As for keeping the issue open, I don't think we can do that. The general rule that Docker has been following is to close any issue that isn't actionable right away. Issues for future work are typically closed and then reopened once the state of things changes such that some action (PR) can be taken for the issue.

@deric

deric commented Nov 5, 2016

@duglin This is a very serious issue; you shouldn't just close it because the problem was introduced in the 0.1 release. It would be more appropriate to target this for the 2.0 release (milestones are on GitHub too).

I guess most people use:

COPY . /app

and blacklist all other folders in .dockerignore, or have a single-level directory structure and use COPY, which actually has mv semantics:

COPY src /myapp

It's quite hard for me to imagine that someone would actually use COPY for flattening directory structure. The other workaround for this is using tar -cf .. & ADD tarfile.tar.gz. Changing at least this would be really helpful. The other thing is respecting slashes in directory names COPY src /src vs COPY src/ /src (which are currently completely ignored).

@tjwebb

tjwebb commented Jan 14, 2017

duglin closed this on Sep 1, 2015

@duglin This is a ridiculous and infuriating issue and should not be closed. The COPY command behaves specifically in disagreement with the documented usage and examples.

@thaJeztah
Member

thaJeztah commented Jan 14, 2017

@tjwebb there's still an open issue #29211. This can only be looked into if there's a way to fix this that's fully backward compatible. We're open to suggestions if you have a proposal how this could be implemented (if you do, feel free to write this up, and open a proposal, linking to this issue). Note that there's already a difference between (for example), OS X, and Linux in the way cp is handled;

mkdir -p repro-15858 \
  && cd repro-15858 \
  && mkdir -p source/dir1 source/dir2 \
  && touch source/file1 source/dir1/dir1-file1 \
  && mkdir -p target1 target2 target3 target4 target5 target6

cp -r source target1 \
&& cp -r source/ target2 \
&& cp -r source/ target3/ \
&& cp -r source/* target4/ \
&& cp -r source/dir* target5/ \
&& cp -r source/dir*/ target6/ \
&& tree

OS X:

.
├── source
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target1
│   └── source
│       ├── dir1
│       │   └── dir1-file1
│       ├── dir2
│       └── file1
├── target2
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target3
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target4
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target5
│   ├── dir1
│   │   └── dir1-file1
│   └── dir2
└── target6
    └── dir1-file1

20 directories, 12 files

On Ubuntu (/bin/sh)

.
|-- source
|   |-- dir1
|   |   `-- dir1-file1
|   |-- dir2
|   `-- file1
|-- target1
|   `-- source
|       |-- dir1
|       |   `-- dir1-file1
|       |-- dir2
|       `-- file1
|-- target2
|   `-- source
|       |-- dir1
|       |   `-- dir1-file1
|       |-- dir2
|       `-- file1
|-- target3
|   `-- source
|       |-- dir1
|       |   `-- dir1-file1
|       |-- dir2
|       `-- file1
|-- target4
|   |-- dir1
|   |   `-- dir1-file1
|   |-- dir2
|   `-- file1
|-- target5
|   |-- dir1
|   |   `-- dir1-file1
|   `-- dir2
`-- target6
    |-- dir1
    |   `-- dir1-file1
    `-- dir2

24 directories, 12 files
diff --git a/macos.txt b/ubuntu.txt
index 188d2c3..d776f19 100644
--- a/macos.txt
+++ b/ubuntu.txt
@@ -11,15 +11,17 @@
 │       ├── dir2
 │       └── file1
 ├── target2
-│   ├── dir1
-│   │   └── dir1-file1
-│   ├── dir2
-│   └── file1
+│   └── source
+│       ├── dir1
+│       │   └── dir1-file1
+│       ├── dir2
+│       └── file1
 ├── target3
-│   ├── dir1
-│   │   └── dir1-file1
-│   ├── dir2
-│   └── file1
+│   └── source
+│       ├── dir1
+│       │   └── dir1-file1
+│       ├── dir2
+│       └── file1
 ├── target4
 │   ├── dir1
 │   │   └── dir1-file1
@@ -30,6 +32,8 @@
 │   │   └── dir1-file1
 │   └── dir2
 └── target6
-    └── dir1-file1
+    ├── dir1
+    │   └── dir1-file1
+    └── dir2
 
-20 directories, 12 files
+24 directories, 12 files

@AshleyAitken

Make a new command CP and get it right this time please.

@nickjbyrne

I would echo the above. This must have wasted countless development hours; it's extremely unintuitive.

@anorth2

anorth2 commented Jun 8, 2017

+1 from me. This is really stupid behavior and could easily be remedied by just adding a CP command that performs how COPY should have.

"Backwards compatibility" is a cop out

@snobu

snobu commented Sep 19, 2017

The TL;DR version:

Don't use COPY * /app, it doesn't do what you'd expect it to do.
Use COPY . /app instead to preserve the directory tree.

@adsl99801

COPY is only able to copy its subfolder.

@divmgl

divmgl commented Mar 3, 2018

Just spent countless hours on this... Why does this even work this way?

@gaui

gaui commented Mar 25, 2018

I'm using Paket and want to copy the following in the right structure:

.
├── .paket/
│   ├── paket.exe
│   ├── paket.bootstrapper.exe
├── paket.dependencies
├── paket.lock
├── projectN/

And by doing COPY *paket* ./ it results in this inside the container:

.
├── paket.dependencies
├── paket.lock

How about adding a --glob or --recursive flag for COPY and ADD ?

@laurencefass

COPY . /destination preserves sub-folders.

@fnagel

fnagel commented Apr 13, 2018

Three years and this is still an issue :-/

@BaibhavVishal123

Can we get an ETA for when this will be fixed?

@laurencefass

not an issue...
from above...
COPY . /destination preserves sub-folders.

@snobu

snobu commented Apr 18, 2018

True, no longer an issue after you fume for half a day and end up here. Sure :)
Let's be constructive,

We really need a new CP command or a --recursive flag to COPY so backwards compatibility is preserved.

Top points if we also show a warning on image build when we detect possible misuse, like:
Directory structure not preserved with COPY *, use CP or COPY . More here <link>.

@intellix

intellix commented Apr 23, 2018

I'm looking for this for copying across nested lerna package.json files in subdirectories to better utilise npm install cache to only trigger when dependencies change. Currently all files changed cause dependencies to install again.

Something like this would be great:

COPY ["package.json", "packages/*/package.json", "/app/"]
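
Until something like that exists, one possible stopgap is to stage the manifests outside the Dockerfile before building. The sketch below is an assumption, not something proposed in this thread: it mirrors every package.json into a staging directory while keeping relative paths. The names stage_manifests and .manifests are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-build step: mirror every package.json under ROOT into STAGE,
# keeping each file's relative path. Names are made up for illustration.
stage_manifests() {
    root=$1
    stage=$2
    rm -rf "$stage"
    # List first and copy afterwards, so files created in $stage are never re-scanned
    files=$(cd "$root" && find . -name package.json ! -path '*/node_modules/*')
    mkdir -p "$stage"
    printf '%s\n' "$files" | while IFS= read -r f; do
        [ -n "$f" ] || continue
        mkdir -p "$stage/$(dirname "$f")"
        cp "$root/$f" "$stage/$f"
    done
}

# Scratch monorepo with two packages and one installed dependency
demo=$(mktemp -d)
mkdir -p "$demo/packages/a" "$demo/packages/b" "$demo/node_modules/x"
for d in "$demo" "$demo/packages/a" "$demo/packages/b" "$demo/node_modules/x"; do
    echo '{}' > "$d/package.json"
done

stage_manifests "$demo" "$demo/.manifests"
(cd "$demo/.manifests" && find . -type f | sort)
```

Since copying a whole directory keeps the tree beneath it (as noted elsewhere in this thread), a Dockerfile could then pick up the staged tree with a single instruction such as COPY .manifests/ /app/ before running the install step. Whether this helps with layer caching in a given CI setup is untested here.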

@thaJeztah
Member

@automaton82 @rjgotten looks like this proposal may address that use-case: #35639 (also see moby/buildkit#396, and this comment, which describes things more in depth: #38710 (comment))

@automaton82

@rjgotten

The band-aid workaround you proposed worked very well, so I thank you for that. It solved the problem for me. I only had to modify the find slightly to accommodate the fact that we have a complicated build structure with other project types and files.

It is now saving 100+ seconds of NuGet build refresh.

@thaJeztah

Adding a new flag to the COPY command is certainly one way to solve this problem. As long as it supports wildcards in the filename then yes it would have solved my case as I am looking to only copy files of certain extensions while keeping their relative directory tree intact.

@rjgotten

rjgotten commented Dec 7, 2020

@thaJeztah
Yes. A --parents flag to the COPY command would fix this perfectly, without danger of changing the semantics of existing dockerfiles, which has been touted as the reason to not touch on any of this for years on end. "Just use a flag" was in fact mentioned before, but nobody did anything with it. 'Buildkit will solve it anyway' being a very common tendency there.

Buildkit of course is -- with no offense meant to the parties involved; it's a complete rework after all -- dragging its feet. And meanwhile many users continue to suffer the inadequacies of the COPY command. That meter's been running for years. Literally.

I'm very happy to see something that resembles a formal proposal somehow coming together and hopefully something coming of it.

@benmccallum

It was also fun to learn the other day (according to official docs) that the .dockerignore file similarly uses Go's file finding/matching/whatever-you'd-call-it. I mean I get it, all this is written in Go... but when decent globbing has been a staple in so many other places (node-land, gulp, even the normal .gitignore!), it just seems crazy they went this way :P

@soroshsabz

@duglin Please reopen this issue

thanks :)

@velcrine

If this is not fixed, why is it closed? It will save a lot of juice if it gets fixed.

@Nielio

Nielio commented Jul 23, 2021

How about this syntax to avoid breaking changes?

COPY modules/*/package*.json ./modules/*

@HexMox

HexMox commented Sep 17, 2021

Any Updates?

PeterlitsZo added a commit to PeterlitsZo/ACMHomepage that referenced this issue Mar 10, 2022
We did:

* Add Dockerfile in `back/` folder for docker container `backend`. Now it will
  build the node_modules in image. It will be helpful for those who do not have
  tool yarn and reduce the difficulty of running.
* Move source files from `back/` to `back/src`. Because Dockerfile's `COPY`
  cannot just copy directory. [More info][github-moby-dockerfile-stupid-copy].
  And that's why we also change some files with new path to those source files.
* Add `depends_on` keys in `docker-compose.yml`

[github-moby-dockerfile-stupid-copy]: moby/moby#15858
RileyYe pushed a commit to ACMHomepage/ACMHomepage-prev that referenced this issue Mar 11, 2022
@aradalvand

aradalvand commented Apr 3, 2022

It's just shameful at this point, it's 2022 and people still have to deal with this.
At least reopen the issue or just tell us why it's not feasible to add a new syntax for this?!

@erikeckhardt

This would make my Dockerfile SO much simpler. Seven COPY commands in my current file could be just one.

Just create a new COPY_WITH_PATH or something?

@tinkerborg

tinkerborg commented Apr 21, 2022

Another plea to please, please just add --parents support to COPY. I was able to find a workaround by combining the --mount based solutions offered above with a COPY to /dev/null in order to invalidate cache when one of these files changes. This works, but is counterintuitive, hacky, and will require a lot of explanation whenever a teammate encounters it.

# this will recursively copy all package.json, preserving structure.
# initial COPY to /dev/null is for layer cache invalidation when a package.json changes
COPY package.json */**/package.json /dev/null
RUN --mount=target=/context cd /context && find . -name package.json | xargs cp --parents -t /app/

This seems like it would be a trivially easy feature to implement.

EDIT: Sadly the --mount arg causes a cache bust when using inline manifests w/ --cache-from. So it looks like I'm back to an explicit COPY for every package.json in the repo in order to benefit from caching in our CI environment. I'll use a hook in pre-commit for this, but it's really unfortunate to have to do things this way when a --parents flag would solve this simply and elegantly.
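
The pre-commit hook mentioned in the edit could look roughly like the sketch below. It is an assumption about how such a hook might work, not the author's actual script: it prints one explicit COPY line per package.json, using /app/ as the destination as in the snippet above. The function name gen_copy_lines is hypothetical, and paths containing spaces would need the JSON form of COPY instead.

```shell
#!/bin/sh
# Hypothetical generator for explicit COPY lines, one per package.json.
# The output would be pasted (or templated) into the Dockerfile by the hook.
gen_copy_lines() {
    root=$1
    ( cd "$root" && find . -name package.json ! -path '*/node_modules/*' ) |
    sed 's|^\./||' | sort |
    while IFS= read -r f; do
        d=$(dirname "$f")
        if [ "$d" = "." ]; then
            # Root-level manifest goes straight into /app/
            printf 'COPY %s /app/\n' "$f"
        else
            # Nested manifest keeps its parent directories under /app/
            printf 'COPY %s /app/%s/\n' "$f" "$d"
        fi
    done
}

# Scratch repository to try it on
demo=$(mktemp -d)
mkdir -p "$demo/packages/a" "$demo/node_modules/x"
echo '{}' > "$demo/package.json"
echo '{}' > "$demo/packages/a/package.json"
echo '{}' > "$demo/node_modules/x/package.json"

gen_copy_lines "$demo"
```

For the scratch repository this yields one line for the root manifest and one for packages/a, while the copy under node_modules is skipped.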

@serhii-kovalchuk-ideals

Omg, what is the problem with just adding a new flag to the COPY command? Is the architecture of buildkit so weird that it is so hard to just extend the functionality of the COPY command, or even create a new command as was proposed in this thread? Lots of developers have written tons of examples showing why this is such a wanted feature, but you ...just close the ticket? This is ridiculous!

@vorishirne

This happens most of the time when a product becomes the only one in the market.

@vorishirne

The same happened with Postman not supporting gRPC for years. There was just no other product developed to the same degree.

@adj123

adj123 commented Nov 9, 2022

For those .NET folks still suffering due to the lack of this feature, another workaround:

If you have CPVM enabled and are not enabling project-specific version overrides, and you know the target frameworks your app uses, then you can just copy over your Directory.Packages.props and do something like

RUN pwsh create-restore-project.ps1 -Frameworks 'net6.0;net452' && dotnet restore

where create-restore-project.ps1 is a Powershell script copied over too and looks like

param($frameworks)
@"
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>$frameworks</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
$(((gc Directory.Packages.props) |% { if ($_ -match '<PackageVersion Include="(.+?)"') { "    <PackageReference Include=`"$($matches[1])`" />" } }) -Join "`n")
  </ItemGroup>
</Project>
"@ > restore-project.csproj

This avoids the need to depend on repository structure and avoids slowly copying over the whole repository, and avoids this buildkit nightmare, but is not a silver bullet as it still downloads more packages than the app needs in a monorepo and only works with non-overridable CPVM. I imagine Paket users also have an analogous workaround, not needing the dummy project created.

@acunap

acunap commented Feb 3, 2023

I have a netcore multi-stage build and have this same problem. I can't preserve directory structure when restoring our .sln and all nested .csproj NuGet packages. We have >50 csproj's as part of the solution and it takes >100 seconds to restore them all again despite nothing but source files changes. Ideally if I did:

COPY ./**/*.csproj .

I end up with a flat copy of all the .csproj and not the hierarchy structure which is what I require. Since there's no way to do this I'm left to scan these comments and other places to try and find way to do this and it is really unfortunate. Copying each of the .csproj is not a realistic solution when there are so many and they can change (requiring someone to remember to go and update the Dockerfile).

@automaton82

Hi, I have a solution using bash in one line for you and also I hope you all can adapt the code to fit your needs:

FROM mcr.microsoft.com/dotnet/aspnet:6.0 AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443

FROM mcr.microsoft.com/dotnet/sdk:6.0 AS build
WORKDIR /src
COPY src/**/*.csproj .
# This is the key line: recreate one directory per project file and move each .csproj into it
RUN for f in *.csproj; do mkdir -p "${f%.csproj}" && mv "$f" "${f%.csproj}/"; done
RUN dotnet restore "API/API.csproj"
COPY src/ .
WORKDIR "/src/API"
RUN dotnet build "API.csproj" -c Release

FROM build AS publish
RUN dotnet publish "API.csproj" -c Release -o /app/publish

FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "API.dll"]

API is my entrypoint project, change it if you need.
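
The loop in that RUN line can be tried outside Docker. The sketch below runs it in a scratch directory; API.csproj and Core.csproj are made-up file names. It relies on the same assumption as the Dockerfile above, namely that every project lives in a directory named after its project file.

```shell
#!/bin/sh
# Scratch demo of the directory-rebuilding loop; file names are made up.
demo=$(mktemp -d)
touch "$demo/API.csproj" "$demo/Core.csproj"

(
    cd "$demo" || exit 1
    for f in *.csproj; do
        # "${f%.csproj}" strips the extension, giving the assumed directory name
        mkdir -p "${f%.csproj}" && mv "$f" "${f%.csproj}/"
    done
)

# List where each project file ended up
(cd "$demo" && find . -name '*.csproj' | sort)
```

Each file ends up at Name/Name.csproj. A project whose directory name differs from its file name, or one nested more than one level deep, would land in the wrong place with this approach.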

@warning-explosive

warning-explosive commented Feb 8, 2023

Thanks guys! It helped me a lot and I finally wrote a multi-layer build with cached restore results. I developed the idea of restoring the project structure inside a temporary docker container. The solution contains a *.sln file, which already knows about the project structure, so with a little help from bash we can restore the folder hierarchy. Here are some samples:

  1. Copy your project files at an arbitrary depth
...
COPY ["./*.csproj", "./*/*.csproj", "./*/*/*.csproj", "./*.sln", "./*.sh", "./NuGet.config", "./"]
RUN chmod +x restore_project_structure.sh
RUN ./restore_project_structure.sh
RUN dotnet restore --packages ./packages --no-cache -v:minimal
...
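  2. restore_project_structure.sh - don't forget to replace Windows backslashes!
#!/bin/sh
# Move each flattened .csproj back to the path recorded in the solution file.
for file in $(find . -name '*.csproj' -exec basename {} \;)
do
    # Look up the project's relative path in the .sln and convert backslashes to slashes
    dest=$(grep -E -o ", \"(.*${file})" put_here_name_of_your_sln_file | cut -d \" -f 2 | tr '\\' '/')
    mkdir -p "$(dirname "${dest}")"
    mv "${file}" "${dest}"
done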

@benmccallum

benmccallum commented Mar 3, 2023

@warning-explosive , nice solution. Essentially what my global tool (dotnet-references) does (docs), but without having to install a global tool :)

Can you parameterize that script to accept the sln file so that this could be used across multiple solutions?

And what do you mean by "don't forget replace windows backslashes!"? Are you replacing all \ with / in your solution file? Can we update the script to handle both? I feel like adding a new csproj to a sln file will add it with backslashes by default on Windows.

@warning-explosive

About Windows backslashes: you got it right, and the script above actually replaces them. I just wanted to emphasize this possible compatibility issue between platforms.

Parameterizing the script with the solution file name is also possible, but I'm not going to do this soon.
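
A parameterized variant might look like the sketch below. It is not the script author's code, only one way the request could be met: the solution file is passed as an argument, and backslashes are converted so the same script works whether the .sln was written on Windows or not. The function name restore_structure and the demo solution contents (including the GUIDs) are made up.

```shell
#!/bin/sh
# Hypothetical parameterized variant: restore_structure SLN_FILE
# Moves flattened .csproj files in the current directory back to the
# relative paths listed in the given solution file.
restore_structure() {
    sln=$1
    # Project lines look like: Project("{...}") = "Name", "dir\Name.csproj", "{...}"
    grep -E -o ', "[^"]*\.csproj"' "$sln" | cut -d '"' -f 2 | tr '\\' '/' |
    while IFS= read -r dest; do
        file=$(basename "$dest")
        # Skip projects that were not flattened into the current directory
        [ -f "$file" ] || continue
        # Skip projects that already live at the top level
        [ "$dest" = "$file" ] && continue
        mkdir -p "$(dirname "$dest")"
        mv "$file" "$dest"
    done
}

# Scratch demo with a made-up solution file using Windows separators and CRLF
demo=$(mktemp -d)
printf '%s\r\n' \
    'Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "API", "src\API\API.csproj", "{11111111-1111-1111-1111-111111111111}"' \
    'EndProject' > "$demo/MySolution.sln"
touch "$demo/API.csproj"

(cd "$demo" && restore_structure MySolution.sln)
(cd "$demo" && find . -name '*.csproj')
```

Reading the paths from the solution file rather than scanning for project files means two projects with the same file name in different directories would still collide after flattening, so that limitation remains.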

@BuriedStPatrick

While it's possible to solve the .csproj problem for .NET projects, it's actually impossible to solve the problem with packages.lock.json files when using <RestorePackagesWithLockFile>true</RestorePackagesWithLockFile>, unless some parent structure is supported in the Dockerfile. Given a basic .NET solution structure:

.
├── Dockerfile
├── MySolution.sln
├── Project1
│  ├── Project1.csproj
│  ├── packages.lock.json
│  └── Program.cs
├── Project2
│  ├── Project2.csproj
│  ├── packages.lock.json
│  └── Program.cs

Since the file names are identical, you have to copy the packages.lock.json files over manually with one COPY command for each file:

COPY ./Project1/packages.lock.json ./Project1/
COPY ./Project2/packages.lock.json ./Project2/

I don't think the design of the .NET project structure is inherently flawed. It seems quite reasonable to expect build tooling to be able to handle this scenario.

@arthur-leclerc

See moby/buildkit#3001, should resolve this use case.
