
Build secrets #33343

Closed
tonistiigi opened this issue May 22, 2017 · 38 comments
Labels
area/builder kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@tonistiigi
Member

In a maintainers meeting last Thursday, there was a discussion about how to move forward with build secrets.

The previous PR #30637 is closed atm, but we need to make sure that the issue is still tracked. The issue was closed because of design issues (listed below) and possible changes/features coming with #32507. This issue is mainly for keeping the secrets discussion from blocking #32507.

Open questions:

Secret sources:

In #30637 secrets are sent from the client with the context tar. There were concerns about whether these should be loaded from swarm secrets instead. The use cases seem quite different, but it does feel weird to have two secrets implementations. Also, the current "build-secrets" use cases are not as secure as the swarm ones.

Sending secrets:

In #30637 secrets are injected into the context tar on the client side and extracted on the daemon before being used. With #32677 this could be done independently of the context.

Dockerfile UI:

In #30637 the user specifies the target path for the secrets in the CLI command with the --build-secret flag. All data exposed like this becomes available in /run/secrets for every RUN operation. Normally it is the image/Dockerfile author who knows where specific secrets are expected. A SECRET Dockerfile command was one of the options considered. #32507 lets the image author specify the mount path. With moby/swarmkit#2118, regular swarm secrets do not need to be in /run/secrets either. #32507 allows exposing mounts to the specific command that needs a secret, not to everything at once.

Build cache:

#30637 ignores the build cache, #32507 uses it. It is probably more correct to ignore the cache to avoid chosen-plaintext attacks, so #32507 would need to allow that.

Other solutions:

Most examples that show build secrets use them for SSH keys. There are other ways of exposing this specific feature. #32677 allows SSH forwarding (PoC: tonistiigi@a175773). By exposing git sources as build stages, we could use any auth (ssh, oauth) for cloning git repos.

@ehazlett @thaJeztah @dnephin @cpuguy83 @diogomonica

@tonistiigi
Member Author

My suggestion:

Build secrets can be exposed from the docker build CLI with --build-secret foo=path. path could be a filename on the client host. If the node is a swarm manager, it could be swarm-secret:id. In the swarm case, the builder would use the control API to first get access to the secret data. This is a bit complicated atm, as secret values are redacted from the control API and reading the data requires creating a service. Otoh, this removes the conflict between local and swarm secrets and doesn't require the builder component to rely on swarm services (a conflict with Moby's components split). Alternatively, path could just be inline data, and curl's @ syntax could be used to refer to local file data.

Instead of injecting the data with the context, it would be exposed by the client session (#32677) when the builder actually asks for it. This is very similar to how exposing source files outside of the working dir should work for the builder in the future. The only difference is that tmpfs would be used for the secret data, and it would not be cached.

In the Dockerfile, a RUN command can get access to this data by using the --mount flag (#32507). The source for the mount should be the same name used on the client side (maybe with a secret prefix). If this prefix is used (or the type is set to "secret"), the builder would also opt out of the cache for this mount and not store the values in intermediate stages. The destination path is specified by --mount in the Dockerfile.
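A hypothetical sketch of how these pieces could fit together (the flag names, mount syntax, and paths here are illustrative; nothing was finalized at this point):

```dockerfile
# Illustrative only - the exact flags and syntax were still under discussion.
# Client side would be something like: docker build --build-secret npmrc=~/.npmrc .
FROM node:alpine
COPY package.json package-lock.json ./
# The secret is requested from the client session on demand, mounted on a
# tmpfs for this single RUN step, and excluded from the build cache.
RUN --mount=type=secret,source=npmrc,target=/root/.npmrc \
    npm install
```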

@cpuguy83
Member

If the node is a swarm manager it could be swarm-secret:id

Not a huge fan of overloading such a flag.
Likely users will also want to name these secrets vs using the path to the secret.

I think we should focus on what the APIs are for this before trying to work out the exact CLI details.

So basically, how does an API client inject secrets?
How does the system work out what secrets to inject when?
How does the user specify how to use a specific secret from the Dockerfile in a build step?

@mshappe

mshappe commented Aug 23, 2017

I can't easily speak to your first two questions, but:

How does the user specify how to use a specific secret from the Dockerfile in a build step?

Simplicity is best, IMO. I would like to be able to say, for example:

SECRETS ~/.ssh:/root/.ssh

Doing so would imply the semantics that no trace of anything out of that directory should be left behind in the image (what a lot of people do manually with an rm and a --squash among other hacks) after build--that it is purely a build-time association.

@tonistiigi
Member Author

So basically, how does an API client inject secrets?

I'd recommend the session added for --stream, which allows the client to set up custom handlers.

How does the system work out what secrets to inject when?

The daemon asks for them from the client; the client validates against the CLI parameters. (This is for local secrets; to access swarm secrets, the daemon would need to query swarm.)

How does the user specify how to use a specific secret from the Dockerfile in a build step?

In #33343 (comment) this is done with RUN --mount.

@OJezu

OJezu commented Dec 14, 2017

On request from @cpuguy83 I will add my two cents:

Problem

  1. Lack of this feature is crippling; there is no way to e.g. pull dependencies from private repos in a way that wouldn't be panned by any self-respecting auditor. The best time to do something about it was therefore 2 years ago. For lack of a better option, now is the second best time.

  2. I think the reason these matters are seeing no improvement, and all attempts to resolve the pain points are going nowhere, is (what I think are) two of the basic foundations of docker:

    • don't let user do something stupid/dangerous
    • everything needed for the build must be contained in the context

    However, build secrets must be kept separate from the files describing and creating the build environment. They must be injected from outside. The same goes for a caching service/storage, though that gives more options for where in the "outside" it can be placed.

    Build secrets require breaking containment, and rely on letting the user do dangerous things. On the other hand, Docker is supposed to be used by DevOps - we need tools that allow us to do things. Warnings about what will happen if we abuse those tools are welcome.

  3. If the best way to do it hasn't been figured out in the last 3 years, chances are it won't be figured out in the foreseeable future if the passive wait for enlightenment continues. We need a solution that works in an acceptable way now, on which further work can be based.

Solutions

Important things are:

  • support for getting the secret from different places/paths without modifying build files.
  • compatibility with different usage scenarios - even if that requires scripting from the user; the fewer assumptions Docker makes about the secret format, the better

Mount

From a user's point of view, I like the "MOUNT" option from rocker, or "RUN --mount", as it solves two of the issues at once: build secrets AND caching for package managers. It requires only support for files from both provider and consumer, which I think is the most versatile. I would expect it to be a total circumvention of Docker build containment, having no effect on the image hash and not being uploaded to the daemon as part of the context. It is a secret needed for the build - either the build passes with it, or it fails because the secret is wrong.

The problem is some people will try to use it, for example, to cache gcc object files for builds inside docker, which I think will lead to problems. The solution for that might be good documentation on why they shouldn't do that. Build mounts are somewhat similar to run-time volumes as ways to circumvent the containment and ephemeral nature of a Docker container, so if users already have to grasp that concept, maybe they will get the intended use of build mounts.

If you want an "only data that must be persisted goes into a volume" approach, make the mounts read-only, and promote Dockerfile shenanigans such as copying dependency specifications earlier into the build and installing them onto a lower fs layer to solve (some of) the caching issues. I'm currently experimenting with that, and it ain't pretty, but it works (or would if it had an SSH key) with no external cache to break down.

Secrets in /run

Loading secrets from files and putting them in /run/ or environment variables is also workable, but puts more limitations on supported file formats, the structure of the secrets, etc. I think it will either be too restrictive (e.g. no support for a variable number of secrets, or hierarchy/structure) or it will boil down to a more complicated read-only mount option that users will struggle with for no practical gain.

Advanced secret management

Another approach would be something akin to k8s config maps and secrets, with yaml files declaring secrets and instructions on whether they are to be available as env values or files. This leaves the least room for the user to do stupid things, and allows future "out of the box" support for configuration providers like etcd or different vaults. But builds also happen on local dev machines, which probably won't use those. Also, build secrets are different from running secrets: they are to be discarded as soon as the build stage is finished. Frankly, I think running secrets should not be handled by Docker at all, but on a different layer by orchestrators.

@hairyhenderson
Contributor

@OJezu

Lack of this feature is crippling, there is no way to e.g. pull dependencies from private repos in a way that wouldn't be panned by any self-respecting auditor.

I disagree. There are a number of ways to accomplish this right now without any specific build-secrets feature. Note that these aren't all necessarily good or easy ideas, but it is possible to do this securely when done carefully:

  • squashed images (docker build --squash ...) - coupled with putting secrets in the build context and rm-ing them before the end of the build
    • caveats: probably a good idea to disable layer caching, or make sure you purge the cache after builds. This can be effective on CI systems with regularly-recycled VMs
    • I don't think this method is officially sanctioned by Docker, but in theory it would result in an image that does not contain any secret files.
  • multi-stage builds (COPY --from=...) - early stages would contain secrets; the artifacts extracted with those secrets would just be copied into the final stage, making sure not to copy the secret material
    • caveats: again, caching gets in the way, but don't push your intermediate layers!
    • probably also not an officially-sanctioned method of dealing with secrets
  • use of a networked secret injection service, in conjunction with docker build --network
    • caveats: more complex!
    • for example, Vault or some other similar type of service - authenticated with one-time-use tokens passed as build args

(these examples are neither exhaustive nor complete, and I leave implementation details as an exercise to the reader)
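A minimal sketch of the multi-stage variant described above (repository URL and key filename are placeholders; note the key still transits the build context and the early-stage layers, so intermediate images must not be pushed):

```dockerfile
FROM alpine AS fetcher
RUN apk add --no-cache git openssh-client
# The secret only ever exists in this early stage.
COPY id_rsa /root/.ssh/id_rsa
RUN chmod 0600 /root/.ssh/id_rsa \
 && ssh-keyscan github.com >> /root/.ssh/known_hosts \
 && git clone git@github.com:example/private-repo.git /src

FROM alpine
# Only the fetched sources move forward; /root/.ssh is never copied.
COPY --from=fetcher /src /src
```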

All this to say, it would be useful to have an easier-to-use secure secret introduction mechanism for build-time. But there are secure ways to do this, given a few different factors.

Also, at the end of the day, consider what the build-time secret is protecting. Generally a secret isn't worth protecting in and of itself, it's whatever that secret gives access to. If you're stuffing all manner of valuable IP into a container image, then you need to be careful to protect that image anyway.

@OJezu

OJezu commented Dec 14, 2017

In order to copy keys into an image (without networking), they must first be copied into, or already reside in, the context, which should never happen.

I need to use developers' ssh keys to access code repositories. That key is way more sensitive than the code. I've created an additional set of keys that have read-only access to the repositories, and I will be distributing that key with the code that depends on those repositories, because I refuse to copy someone's id_rsa into the Docker context, even though I'm using a multi-stage build. But this is absolutely not a solution, with some authorization keys going to all the places they should not be going, and being used by many different users and processes.

Now, I can use networking, mount volumes with NFS or other network storage (so why is it not supported out of the box with a -v option, just as in docker run?), or install etcd or another configuration/secret manager - but that is missing the point. I want to make things easier for the developers, not more complicated. I don't want to spend time troubleshooting some guy's network mount because they need to copy one 1 kB file!

Also

but it is possible to do securely, when done carefully:

simply means: no, it's not possible. That's not how secure systems work.

@djmaze
Contributor

djmaze commented Dec 14, 2017

It's quite easy to do this with build args. Just read the secret into a build arg when running the build command. In the Dockerfile, write the arg to a file, use it, and remove it, all inside the same RUN command. Thus the secret never ends up in any image layer.

The following example uses an SSH key when bundling a Ruby app.

Dockerfile:

FROM ruby

WORKDIR /usr/src/app
COPY Gemfile Gemfile.lock ./

ARG SSH_KEY
RUN mkdir /root/.ssh \
 && chmod 0700 /root/.ssh \
 && echo "$SSH_KEY" >/root/.ssh/id_rsa \
 && chmod 0600 /root/.ssh/id_rsa \
 && bundle install \
 && rm /root/.ssh -fR

Build with docker

$ docker build --build-arg SSH_KEY="$(cat /path/to/ssh-key)" -t my-app .

Build with Docker Compose

docker-compose.yml:

version: "3.3"
services:
  app:
    build:
      context: .
      args:
        SSH_KEY: "$SSH_KEY"

Build command:

SSH_KEY="$(cat /path/to/ssh-key)" docker-compose build

I admit that this is not an ideal solution, but it works quite well.

@patrickf55places

@djmaze I just tried verifying this, and it looks like the build arg value is present when running docker history.

@djmaze
Contributor

djmaze commented Dec 14, 2017

@patrickf55places Ouch! Thanks, I should have read the documentation more thoroughly as well.

So this should at least be combined with multi-stage builds (as stated by @hairyhenderson), but that is not ideal either.

@OJezu

OJezu commented Dec 15, 2017

Putting secrets in the argument list is insecure, as they can be seen in process listings or by reading /proc.
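This is easy to demonstrate without docker at all; on Linux, any process's argument vector is world-readable under /proc:

```shell
# Start a long-running process standing in for one that received a secret via argv.
sleep 300 &
pid=$!

# Any process on the same host can read its full command line:
tr '\0' ' ' < "/proc/$pid/cmdline"   # prints: sleep 300

kill "$pid"
```

The same applies to docker build --build-arg SECRET=...: the secret appears in the argv of the docker client while it runs, in addition to being stored in the image metadata.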

@pvanderlinden

I think that is the main problem. A lot of people don't realise that almost everything leaves their secrets somewhere in the image or elsewhere (metadata, image layers, shell history). There are some possibilities, but they are all quite a bit of work, and you have to be very careful. But to quote @OJezu:

"but it is possible to do securely, when done carefully" simply means - no, it's not possible. That's not how secure systems work. Someone will make a mistake, not understand, not care, and it won't be realised until it is too late.

I think the mount solution would be best, as this won't embed the secret in the shell history; the mount also makes sure anything that needs the file can read it, or it can be symlinked (in the worst case the symlink will still exist in the resulting image, but that won't contain the actual secret).

I think the main reason people don't comment much on these topics anymore is that the past 3 years have shown that every single good proposal, or ready-to-merge solution, gets shut down. The main reason I have seen: fear of it being misused in ways that break reproducible builds. In my opinion this is not a good reason, for two reasons:

  • The whole point of secrets is that you can't reproduce them unless you have access to them. This inherently breaks the reproducible build, but secrets are necessary for all kinds of reasons.
  • Fully reproducible builds can't possibly be enforced by docker anyway: docker can't guarantee that the context is still there with the Dockerfile (simply commit the Dockerfile to the repository, not anything else). There is also network (internet) access, for example to reach package repositories, which breaks the reproducible build as well.

I think the whole reason a lot of people get really frustrated, and are deterred from even attempting to contribute to this issue, is the long history without a solution. I would be willing to contribute if it would actually fix this problem. But I don't see that happening unless the main maintainers decide this is a real and (after 3 years) very urgent issue.

@oppianmatt

Isn't this issue relevant? #13490

But also, don't multi-stage builds help solve this? You do the first stage with secrets, and the second stage just uses the output from the first, so the secrets are all scrubbed away.

@pvanderlinden

#13490: more than 2 years old; it also refers back to this topic as the preferred comment thread.
Multi-stage builds could be used, but they are also error-prone, so not really a good option for secrets (mistakes are made easily); it is better to never save the secret to an actual image (which multi-stage still does).

@patrickf55places

patrickf55places commented Dec 15, 2017

@oppianmatt Using multi-stage builds for this requires you to either:

  1. Know every file that changed from the base image that is needed in the final image
  2. Copy the entire root directory

Option 1 is difficult because it requires in-depth knowledge of what is being built. For example, if you need to install third-party packages (e.g. with apk add, yum install or apt-get install), you need to know where all the files from the packages are being placed, which can change from package version to package version. This means that successive docker builds are much more prone to breaking without notice.

Option 2 breaks the image layer hierarchy, as every file in the image is changed in the new layer (it is really only useful for the final image to be FROM scratch in that case). This means that each new image can potentially use hundreds of MB of storage (docker save ubuntu:16.04 generates a 109 MiB tarball; docker save php:7.2 generates a 334 MiB tarball).

@matthewbarr

And neither of those allows for a persistent build-time disk cache of packages, forcing people to download the same packages every time and do complicated gymnastics to avoid storing them in the image.

Using a volume mount (or a bind mount, in certain cases) would be fantastic for dealing with apt caches, and ruby, python, etc. Maven too.

People already plan for the optional presence of volumes for all kinds of things in finished images/containers today. E.g. mysql defaults to using an internal directory, but is perfectly happy to deal with a volume instead.

@OJezu

OJezu commented Dec 15, 2017

@patrickf55places @matthewbarr Yes, those are also concerns for Docker, but both can be solved within Dockerfile.

Multi-stage builds can be done in multiple steps, easily, by building from earlier stages (undocumented? I don't know, I found it by experimenting).

FROM debian AS image-common
# runtime deps
FROM image-common AS image-build
# build deps and build itself
FROM image-common AS final-build
COPY --from=image-build ./build-result ./

Lack of caching can be solved, or at least mitigated, by downloading dependencies in an earlier step and then copying the application source. The benefit over mounting a cache volume is not risking breaking docker's internal cache in any way. For PHP, an example goes:

COPY composer.json composer.lock ./ # copy dependency spec first
RUN composer install  --no-scripts --no-plugins --no-autoloader --no-suggest --no-interaction
COPY . ./ # copy all of the sources

I cannot find an acceptable solution for copying secrets though, and that is what this issue is about. Please, let us not dilute it with other problems. The maintainers must decide which solution should be pursued, and either implement it or accept a PR from the community if somebody else does.

@dserodio

@OJezu, multi-stage builds are documented, and were "advertised" on the blog too.

@srstsavage

srstsavage commented Dec 15, 2017

Multi-stage builds are wonderful and well documented, but if build args + multi-stage builds are the recommended solution for build time secrets then this should be spelled out somewhere in the docs.

#32507 seems to be in the spirit of Docker/Moby...clean, flexible, and powerful. It solves this secrets problem and a host of other problems as well (e.g. avoiding copying files into the image which only need to be present for a single step). That gets my vote.

@OJezu

OJezu commented Dec 18, 2017

@dserodio I didn't see this:

FROM debian AS image-common
FROM image-common AS image-build

in documentation.
COPY --from=<earlier-stage-alias> is documented, while FROM <earlier-stage-alias> is not.

@Vanuan

Vanuan commented Dec 18, 2017

Side note:

I need to use developers' ssh keys to access code repositories.

It looks like you're an ops guy trying to set up a development environment on a developer's machine. Is that really what you're trying to do? Isn't that the developer's job in a DevOps culture?

Docker is supposed to be used by DevOps - we need

It looks like you call yourself "a DevOps". But DevOps is not a synonym for Ops (i.e. "system administrator" or just "admin"). DevOps is a culture where Operations and Developers work together. "I'm DevOps" sounds similar to "I'm Agile" or "I'm Scrum". It doesn't make any sense.

And no, Docker is not meant to be used by Ops exclusively.


ssh access

Consider your request satisfied: now you're able to mount any file at build time. The next thing your security compliance guys will tell you is that ssh keys shouldn't just sit on disk in plain text. They should be encrypted with the user's local password. So the next thing you'll ask from Docker is the ability to provide the user's password or otherwise unlock the ssh key. So you will be frustrated again.

For ssh keys, we need the ability to forward unix sockets at build time so that we can provide SSH_AUTH_SOCK.

But the problem here is that Docker for Mac doesn't support mounting unix sockets: docker/for-mac#483

So we need to go deeper.
The only solution that works securely:

  • run an ssh server in docker
  • connect from the local machine to that ssh server, mounting SSH_AUTH_SOCK
  • mount SSH_AUTH_SOCK into the container where you need ssh access.

ssh in docker

I don't see how the discussed solution solves the developer's SSH access problem, unless you propose storing ssh keys on disk unencrypted.

@mshappe

mshappe commented Dec 18, 2017

It looks like you're an ops guy trying to set up a development environment on a developer's machine. Is that really what you're trying to do? Isn't that the developer's job in a DevOps culture?

I can't speak to his use case, but I can speak to the one I was trying to solve upthread, and it had nothing to do with building for development machines. It had to do with the fact that our closed-source project depends on other closed-source repositories of our own, as gems and/or npm packages. Secrets are necessary to pull these packages at build time no matter what the environment is.

I wound up finding a different way to solve the problem, but the fact that this issue remains not only unaddressed but virtually blown off is an absurdity.

@Vanuan

Vanuan commented Dec 18, 2017

It looks like we either have to allocate a separate ssh key for the docker daemon or embed an ssh agent/daemon into the docker client/daemon.

Alternatively, we have to abandon the idea of building images in an isolated environment and use client-side builds.

Transferring or otherwise mounting an ssh key from client to server, first decrypting it on the client, doesn't sound like a robust and secure solution.

@cpuguy83
Member

I've looked at ssh-agent forwarding to a container over a docker API connection before...
There were some issues, but theoretically this should be possible.

@tonistiigi
Member Author

@cpuguy83 @Vanuan tonistiigi@a175773 (from first post).

@Vanuan

Vanuan commented Dec 18, 2017

@tonistiigi cool! Does it work on Docker for Mac? On Windows/Cygwin?
Is there something similar for run-time ssh?

@tonistiigi
Member Author

@Vanuan Yes, it only uses the remote API; there's no hidden magic for sharing the local socket etc.

@Vanuan

Vanuan commented Dec 18, 2017

So this could potentially solve this issue too: #6396
The only missing thing is UI for build, run, compose (but probably not useful for swarm, as its secrets are stored securely).

Limitations: some CI systems don't provide an ssh agent.

@pvanderlinden

That still leaves the issue of private package repositories open, and probably other situations as well. My problem is similar to @mshappe's - not npm and gem, but pypi and conda.

@tonistiigi
Member Author

Limitations: some CI systems don't provide ssh agent.

You don't necessarily need an ssh agent on the client (though the current commit uses one). The client could do the same by just implementing Sign() from https://godoc.org/golang.org/x/crypto/ssh/agent#Agent based on some passed configuration.

@3XX0

3XX0 commented Mar 10, 2018

Not sure what the status of this is, but I faced a similar issue recently while trying to sign stuff inside a docker build.
If someone is interested, here is how I solved it: https://github.com/3XX0/donkey

@mumoshu

mumoshu commented Mar 10, 2018

@3XX0 Looks awesome!

But in case you missed it - JFYI, habitus seems to solve the problem in a similar way, while supporting advanced builds.

Perhaps we'd better merge our efforts whenever possible, just for the sake of easier maintenance 😄

@pvanderlinden

@3XX0 This is exactly why there needs to be a good solution for this. Your solution doesn't work unfortunately: it will embed the value of the secret in your image metadata, making it public to anyone who can get the image.

@3XX0

3XX0 commented Mar 12, 2018

@pvanderlinden How so? Feel free to open an issue if you think that's the case

@Tyrael

Tyrael commented Mar 12, 2018

My guess is that he thinks that, in your example, the build argument literally evaluates to your secret, instead of to the random id which can be used in your Dockerfile to donkey get the secret - which, if done properly, won't end up in the image.

@pvanderlinden

Ah sorry, this works indeed. It's a good workaround, as long as docker doesn't acknowledge that this is an important feature.

@gwarnes-mdsol

My colleague Benton Roberts (broberts@mdsol.com) created a simple tool for Secure SSH key injection into Docker builds: https://github.com/mdsol/docker-ssh-exec

@thaJeztah thaJeztah added kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny labels Nov 1, 2018
@AkihiroSuda
Member

docker build --secret is finally available in Docker 18.09 https://medium.com/@tonistiigi/build-secrets-and-ssh-forwarding-in-docker-18-09-ae8161d066
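For reference, the syntax that shipped (per the linked post) pairs a RUN --mount in the Dockerfile with a --secret flag on the CLI; BuildKit mounts the secret on a tmpfs for that single step and keeps it out of the build cache and the final image:

```dockerfile
# syntax = docker/dockerfile:experimental
FROM alpine
# The file is available only at /run/secrets/mysecret, and only during this step.
RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret
```

Build with DOCKER_BUILDKIT=1 docker build --secret id=mysecret,src=mysecret.txt . — SSH agent forwarding works analogously via RUN --mount=type=ssh in the Dockerfile plus docker build --ssh default on the CLI.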
