Add ability to `ADD` files and then remove them in the same layer in a Dockerfile #12169

Closed
mingfang opened this issue Apr 8, 2015 · 21 comments
Labels
area/builder kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@mingfang

mingfang commented Apr 8, 2015

A common use case when writing Dockerfiles is to ADD source code, compile it, and then package the application. Currently the ADD results in its own layer, with no way to remove it afterwards.
Ideally I would be able to do this in a Dockerfile:

ADD src /src && \
        cd /src && \
        make install && \
        rm -rf /src

This way the src will never end up in a layer.

@phemmer
Contributor

phemmer commented Apr 8, 2015

There are numerous proposals which can accomplish the end goal:

#10310 - private volumes injected during build

#332 - squash multiple layers

#6906 - strip layers

@mingfang
Author

mingfang commented Apr 8, 2015

Thanks for the links.
I agree that all of these will allow me to achieve the end goal of not having to include source in the image.
However I think my proposal is a bit more direct towards this goal.
All the other proposals require post processing to achieve this.

@xiaods
Contributor

xiaods commented Apr 8, 2015

Yes, thanks for sharing.

thaJeztah added the area/builder and kind/feature labels Apr 8, 2015
@arturluizbr

Maybe some commands like these.

Simple mounting:

USE file_path / 

Multiple files in one container dir:

USE file1 file2 file3 /

Each file in their own container path:

USE file1:/start file2:/setup file3:/clean

USE would not create a new layer; instead, it would mount the files as a volume for the next build steps.

@mingfang
Author

That's a potential solution.
Maybe a MOUNT command is even more to the point.

@dreamcat4

We are getting --flags now, so a --no-cache modifier flag on ADD or COPY might work as a way to avoid committing just that one layer (until the end of the subsequent RUN command).

@mingfang
Author

mingfang commented May 5, 2015

That works too.
I would, however, recommend --no-commit to avoid confusion.

@jessfraz
Contributor

Hello!
We are no longer accepting patches to the Dockerfile syntax as you can read about here: https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax

Mainly:

Allowing the Builder to be implemented as a separate utility consuming the Engine's API will open the door for many possibilities, such as offering alternate syntaxes or DSL for existing languages without cluttering the Engine's codebase

Then from there, patches/features like this can be re-thought. Hope you can understand.

@redbaron
Contributor

redbaron commented Jun 7, 2017

@jessfraz Interesting that support for multiple FROM instructions in a single Dockerfile was just added, yet this much-needed feature was shut down.

@redbaron
Contributor

redbaron commented Jun 7, 2017

For those who are interested, it seems that https://github.com/grammarly/rocker provides the needed features.

@edhemphill

edhemphill commented Dec 16, 2017

The fact that you don't allow adding a file and then removing it without leaving residue in an in-between layer is a HUGE security problem. There are tons of Docker containers out there in the commercial world which probably have RSA keys for GitHub in them. Seriously, guys.

@edhemphill

edhemphill commented Dec 16, 2017

OK, here is how to build your container with private GitHub repos, without leaving SSH keys in an intermediate container:

Install Rocker - which will replace your docker build process. (see above link from @redbaron )

If your old Dockerfile was something like this:

FROM node:carbon
WORKDIR /usr/src/app
COPY ./npm-install.sh .
COPY ./package.json .
RUN ./npm-install.sh
COPY . .
ENTRYPOINT [ "node", "index.js" ]
EXPOSE 8080

Your Rockerfile will be:

FROM node:carbon
WORKDIR /usr/src/app-src
COPY . .
MOUNT /usr/src/app-src/node_modules
ATTACH ["./npm-install.sh"] 
RUN cp -R /usr/src/app-src /usr/src/app
WORKDIR /usr/src/app
ENTRYPOINT [ "node", "index.js" ]
EXPOSE 8080

(The reason for the ./ syntax is a Rocker issue: grammarly/rocker#171.)
MOUNT lets you mount a volume container between build steps. ATTACH lets you run the ./npm-install.sh script in an interactive shell, with the results still getting committed to the layer.

My npm-install.sh script is below, and I have private GitHub repo references in my package.json. npm itself also has issues with private repos, so I first do a git ls-remote to log in. The git credential cache is an in-memory cache only. ATTACH is interactive, so you can just log in the old-fashioned way.

#!/bin/bash

# helper script to install npm modules including private repos
git config --global credential.helper cache
git ls-remote https://github.com/XYZ/abcd.git
npm install

Run rocker:

rocker build --attach -f "Rockerfile"

No residue that I can tell, but please let me know if someone sees otherwise. I recommend a similar capability for docker build.

@lpapp-polatis

I completely agree that this can be considered a security flaw, even disregarding the fact that the source code can also (significantly) increase the image size. Shame, really.

@flx42
Contributor

flx42 commented May 2, 2018

For users that are still looking, my colleague created a project for passing secrets to docker build:
https://github.com/3xx0/donkey
There is a section at the end that explains how it works.

Note that if security is not a concern, but you just have a large file, you can achieve the same with socat(1):

UUID=$(uuidgen -r) ; socat -U ABSTRACT-LISTEN:$UUID FILE:tensorflow_gpu-1.8.0-cp27-none-linux_x86_64.whl &

docker build -t tensorflow:gpu -f Dockerfile.gpu --network=host --build-arg UUID=$UUID --shm-size 256M .

In the Dockerfile:

ARG UUID

RUN socat -u ABSTRACT-CONNECT:$UUID FILE:/dev/shm/tensorflow_gpu-1.8.0-cp27-none-linux_x86_64.whl,creat=1 && \
    pip --no-cache-dir install /dev/shm/tensorflow_gpu-1.8.0-cp27-none-linux_x86_64.whl

@sbrl

sbrl commented Jun 18, 2020

@jessfraz Not helpful. Providing a broken link to some documentation that doesn't exist and claiming that the Dockerfile syntax isn't going to change without a definitive source doesn't help anyone.

Using intermediate files during the build process of a Docker container is absolutely essential, and one way or another it _really_ needs to be supported. Without a clean way to clean up intermediate build files, Dockerfiles are practically useless.

Another proposal:

FROM someimage

GROUPSTART
ADD ./foo
RUN bar --baz
RUN rm ./foo /another/example
GROUPEND

(I'll edit / delete this comment once this issue is resolved)

@arturluizbr

@sbrl you can reach the same goal using build stages.

Feel free to check my latest Dockerfiles at:

I use stages there to handle files before copying them to the final image.
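
A minimal sketch of the build-stage approach, applied to the use case from the top of this issue. The base images (gcc:9, debian:buster-slim), the /out staging directory, and a Makefile that honours DESTDIR are assumptions for illustration; they are not taken from the Dockerfiles mentioned above.

```dockerfile
# Build stage: the source tree exists only in this stage's layers.
FROM gcc:9 AS build
COPY src /src
# Assumption: the Makefile honours DESTDIR, so installed files land under /out.
RUN cd /src && make install DESTDIR=/out

# Final stage: only the installed files are copied across.
# /src never appears in any layer of the resulting image.
FROM debian:buster-slim
COPY --from=build /out/ /
```

Only the layers of the last stage end up in the tagged image, so the source is left behind in the intermediate build stage rather than removed from a layer after the fact.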

@thaJeztah
Member

With BuildKit enabled (DOCKER_BUILDKIT=1) and the "experimental" Dockerfile syntax it's possible to mount (parts of) the build context.

Doing so allows accessing files without copying them to the image, for example:

# syntax=docker/dockerfile:experimental

FROM busybox

# local build-context is mounted at /tmp/src (read-only by default), but not copied to the image
RUN --mount=type=bind,src=.,target=/tmp/src cd /tmp/src && make install
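
One caveat worth sketching: because the bind mount is read-only by default, a build that writes object files into the source tree may fail. Assuming that is the case, the mount can be made writable with the rw option. The gcc:9 base image (picked here only because it ships make) is an assumption for illustration.

```dockerfile
# syntax=docker/dockerfile:experimental

# Assumption: an image that ships make and a compiler.
FROM gcc:9

# `rw` makes the mounted build context writable for this step only.
# Anything written under /tmp/src is discarded when the step ends,
# so neither the source nor the build by-products reach a layer.
RUN --mount=type=bind,source=.,target=/tmp/src,rw cd /tmp/src && make install
```

As with the example above, this needs BuildKit, e.g. DOCKER_BUILDKIT=1 docker build . from the directory that contains the source. Files that make install writes outside /tmp/src are still committed to the layer as usual.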

@dreamcat4

@thaJeztah Nice!

BTW, I can see now that we can enable the DOCKER_BUILDKIT=1 flag in Docker Hub's automated build system. However, does the current version of Docker Hub also support the new Dockerfile syntax for RUN --mount=type=... yet?

That is also required for this solution to work (and to be truly universal, including on Docker Hub for open sharing, etc.). Thank you.

@thaJeztah
Member

does the current version of dockerhub actually also support yet the new Dockerfile syntax for doing the RUN --mount=type=... ?

The experimental syntax is a "front-end". BuildKit front-ends are cool 🤓; they're distributed as images (here's the "experimental" front-end), and allow defining your own file format if you don't like Dockerfiles or need additional features (see Mockerfile), or fun stuff such as making docker build build using buildpacks: https://github.com/tonistiigi/buildkit-pack (that one was just a two-hour hack that Tonis did and is not optimised, so it produces large images, but it was fun).

Any version of Docker that has BuildKit support (Docker 18.09 and, with some limits, Docker 18.06) can use those front-ends.

@dreamcat4

does the current version of dockerhub actually also support yet the new Dockerfile syntax for doing the RUN --mount=type=... ?

The experimental syntax is a "front-end". BuildKit front-ends are cool 🤓; they're distributed as images (here's the "experimental" front-end), and allow defining your own file format if you don't like Dockerfiles or need additional features (see Mockerfile,

Thanks for mentioning those. I can see Mockerfile being useful for my own images, since it transforms apt-get and git clone into a repeatable and clean syntax that very much resembles an Ansible playbook. I have not seen anybody take it a step further and integrate Mockerfile into Pulumi or Terraform, which I guess is what is replacing Ansible these days.

or fun stuff such as making docker build build using buildpacks https://github.com/tonistiigi/buildkit-pack (that one was just a two-hour hack that Tonis did and not optimised, so produces large images, but it was fun).

This is interesting too. However, it would seem to be early days still, due to the large image sizes, as noted here too:

https://gitlab.com/groups/gitlab-org/-/epics/2880

For that reason I will wait longer for it to mature, since the extra overhead of larger images is not acceptable for me. Hopefully that will be solved in time, and maybe GitHub will also adopt them for CI at some point, not just GitLab.

Any version of docker that has buildkit support (docker 18.09 and with some limits, docker 18.06) can use those front-ends

... Cool man. Digging further it seems (as of today) the hub is on 19.03... so it should work over there now! I have not tried it myself, though; I just noted the Docker version after triggering an automated build of an existing image on my Docker Hub account. It's reported at the top of the build log here:

KernelVersion: 4.4.0-1060-aws
Components: [{u'Version': u'19.03.8', u'Name': u'Engine', u'Details': {u'KernelVersion': u'4.4.0-1060-aws', u'Os': u'linux', u'BuildTime': u'2020-03-11T01:24:30.000000000+00:00', u'ApiVersion': u'1.40', u'MinAPIVersion': u'1.12', u'GitCommit': u'afacb8b7f0', u'Arch': u'amd64', u'Experimental': u'false', u'GoVersion': u'go1.12.17'}}, {u'Version': u'1.2.13', u'Name': u'containerd', u'Details': {u'GitCommit': u'7ad184331fa3e55e52b890ea95e65ba581ae3429'}}, {u'Version': u'1.0.0-rc10', u'Name': u'runc', u'Details': {u'GitCommit': u'dc9208a3303feef5b3839f4323d9beb36df0a9dd'}}, {u'Version': u'0.18.0', u'Name': u'docker-init', u'Details': {u'GitCommit': u'fec3683'}}]
Arch: amd64
BuildTime: 2020-03-11T01:24:30.000000000+00:00
ApiVersion: 1.40
Platform: {u'Name': u'Docker Engine - Community'}
Version: 19.03.8
MinAPIVersion: 1.12
GitCommit: afacb8b7f0

Looking forward to trying this feature out in the future! Thank you all.

@thaJeztah
Member

... Cool man. Digging further it seems (as of today) the hub is on 19.03... so it should work over there now!

Yes, it's possible to enable BuildKit for your automated builds. It's not (yet) enabled by default, but you can enable it by setting the environment variable: https://docs.docker.com/docker-hub/builds/#build-images-with-buildkit
