Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dockerfile: avoid ignoring errors global arg expansion #4856

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tonistiigi
Copy link
Member

closes #4847

@tonistiigi tonistiigi marked this pull request as ready for review April 18, 2024 00:48
@tonistiigi
Copy link
Member Author

The second commit is to make evaluation of default value only happen if the default is actually used. Without it you can't do ARG FOO=${FOO:?} directly but would need to do:

ARG FOO
ARG FOO=${FOO:?}

Because even if build-arg is passed by the user it would not be part of env when the ARG call first happens.

@thaJeztah
Copy link
Member

thaJeztah commented Apr 18, 2024

Thanks! What's the easiest way to build/try these changes? (if I'd want to try the new behavior)?

Also, perhaps it's good to copy one of my comments from the ticket that I just posted; some of that only (somewhat) occurred to me when replying there (#4847 (comment));

Some of these I added to see if evaluation was "early-binding" or "late-binding" (for better words), basically, like = vs := in Make. Maybe that would even be the "full" solution for this ambiguity, although I'm not sure if we still can change that. Currently (at least as far as global ARG goes), it's late-binding, which would match = in Make; if we change that, it means we would for sure paint shut the possibility of introducing := if we would make that behavior controllable by the user.

Basically;

  • Do we consider build-args to be late-binding (again, for better words) or not? So only to be evaluated when used, and if used, only to be evaluated when executing?
  • ☝️ currently (at least before this PR), I think we "roughly" do so, or at least for the error, but with this PR, this may change?
  • ☝️ would we ever see ourselves wanting to introduce := vs = (using "Make" conventions as prior art)?

I have no strong opinions either way on this, but also wanted to at least put my train of thought here, just in case this is something we would / could consider.

👉 whatever direction we'd go in, we should probably look at our documentation, and see if we already outline the behavior, or if not, have some docs around this.

@tonistiigi
Copy link
Member Author

What's the easiest way to build/try these changes?

#syntax=tonistiigi/dockerfile:pr4856

ARG FOO=${FOO:?"you should set foo"}

FROM alpine

Do we consider build-args to be late-binding (again, for better words) or not?

ARG/ENV behave like variable definitions in bash (or any other language for that matter). They do not behave like macro-style recursively expanded variables in Makefile.

So only to be evaluated when used, and if used, only to be evaluated when executing?

The value is evaluated when defined. You still need to reach the line that defines ARG/ENV. Eg. if the stage is unused then ARG/ENV inside it will not be reached by buildkit (but will by the legacy builder that doesn't have --target based optimization). But the evaluation does not depend on where the ARG/ENV is used later.

would we ever see ourselves wanting to introduce := vs = (using "Make" conventions as prior art)?

No, I don't see any case atm. where the extra syntax would provide something that outweighs the added complexity. This is Make-specific feature that gets misused in most Makefiles.

@thaJeztah
Copy link
Member

#syntax=tonistiigi/dockerfile:pr4856

Thanks! Trying my Dockerfile from #4847 :

# syntax=tonistiigi/dockerfile:pr4856

# REQUIRED_ARG1 is required global arg. It is not used in any stage below,
# but I expect it to produce an error if not set.
ARG REQUIRED_ARG1=${REQUIRED_ARG1:?}

# REQUIRED_ARG2 is required global arg. It is overridden in stage two, and
# I expect it to produce an error if not set.
ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}

# REQUIRED_ARG2 is required global arg. It is used in stage three, and I expect
# it to produce an error if not set.
ARG REQUIRED_ARG3=${REQUIRED_ARG3:?}

# REQUIRED_ARG4 is required global arg. It is used in stage four, and I expect
# it to produce an error if not set.
ARG REQUIRED_ARG4=${REQUIRED_ARG4:?}


FROM ubuntu:16.04 AS one
RUN printenv | grep REQUIRED || true

FROM ubuntu:18.04 AS two
RUN printenv | grep REQUIRED || true
ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}
RUN printenv | grep REQUIRED || true

FROM ubuntu:20.04 AS three
RUN printenv | grep REQUIRED || true
ARG REQUIRED_ARG3
RUN printenv | grep REQUIRED || true

FROM ubuntu$REQUIRED_ARG4 AS four
RUN printenv | grep REQUIRED || true

FROM ubuntu${REQUIRED_ARG5:?} AS five
RUN printenv | grep REQUIRED || true

And trying to build stage one;

docker build --no-cache --progress=plain --target=one .
#0 building with "desktop-linux" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 1.20kB done
#1 DONE 0.0s

#2 resolve image config for docker-image://docker.io/tonistiigi/dockerfile:pr4856
#2 ...

#3 [auth] tonistiigi/dockerfile:pull token for registry-1.docker.io
#3 DONE 0.0s

#2 resolve image config for docker-image://docker.io/tonistiigi/dockerfile:pr4856
#2 DONE 1.1s

#4 docker-image://docker.io/tonistiigi/dockerfile:pr4856@sha256:4aac71cb41aa46b049abf9ae1d0ec3bbb461220c6467d65adb112a082f6e980a
#4 resolve docker.io/tonistiigi/dockerfile:pr4856@sha256:4aac71cb41aa46b049abf9ae1d0ec3bbb461220c6467d65adb112a082f6e980a done
#4 CACHED
Dockerfile:5
--------------------
   3 |     # REQUIRED_ARG1 is required global arg. It is not used in any stage below,
   4 |     # but I expect it to produce an error if not set.
   5 | >>> ARG REQUIRED_ARG1=${REQUIRED_ARG1:?}
   6 |
   7 |     # REQUIRED_ARG2 is required global arg. It is overridden in stage two, and
--------------------
ERROR: failed to solve: failed to process "${REQUIRED_ARG1:?}": REQUIRED_ARG1: is not allowed to be unset


docker build --no-cache --progress=plain --build-arg REQUIRED_ARG1="" --target=one .
...
Dockerfile:9
--------------------
   7 |     # REQUIRED_ARG2 is required global arg. It is overridden in stage two, and
   8 |     # I expect it to produce an error if not set.
   9 | >>> ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}
  10 |
  11 |     # REQUIRED_ARG2 is required global arg. It is used in stage three, and I expect
--------------------
ERROR: failed to solve: failed to process "${REQUIRED_ARG2:?}": REQUIRED_ARG2: is not allowed to be unset


docker build --no-cache --progress=plain --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --target=one .
...
Dockerfile:13
--------------------
  11 |     # REQUIRED_ARG2 is required global arg. It is used in stage three, and I expect
  12 |     # it to produce an error if not set.
  13 | >>> ARG REQUIRED_ARG3=${REQUIRED_ARG3:?}
  14 |
  15 |     # REQUIRED_ARG4 is required global arg. It is used in stage four, and I expect
--------------------
ERROR: failed to solve: failed to process "${REQUIRED_ARG3:?}": REQUIRED_ARG3: is not allowed to be unset

docker build --no-cache --progress=plain --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3="" --target=one .
...
Dockerfile:17
--------------------
  15 |     # REQUIRED_ARG4 is required global arg. It is used in stage four, and I expect
  16 |     # it to produce an error if not set.
  17 | >>> ARG REQUIRED_ARG4=${REQUIRED_ARG4:?}
  18 |
  19 |
--------------------
ERROR: failed to solve: failed to process "${REQUIRED_ARG4:?}": REQUIRED_ARG4: is not allowed to be unset

docker build --no-cache --progress=plain --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3="" --build-arg REQUIRED_ARG4="" --target=one .
...
Dockerfile:36
--------------------
  34 |     RUN printenv | grep REQUIRED || true
  35 |
  36 | >>> FROM ubuntu${REQUIRED_ARG5:?} AS five
  37 |     RUN printenv | grep REQUIRED || true
  38 |
--------------------
ERROR: failed to solve: failed to process "ubuntu${REQUIRED_ARG5:?}": REQUIRED_ARG5: is not allowed to be unset

So far so good? (narrator: no, it may actually not be good yet (see further down))

And then I got surprised by this one;

docker build --no-cache --progress=plain --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3=y --build-arg REQUIRED_ARG4="" --build-arg REQUIRED_ARG5="" --target=one .
...
Dockerfile:35
--------------------
  33 |     RUN printenv | grep REQUIRED || true
  34 |
  35 | >>> FROM ubuntu${REQUIRED_ARG5:?} AS five
  36 |     RUN printenv | grep REQUIRED || true
  37 |
--------------------
ERROR: failed to solve: failed to process "ubuntu${REQUIRED_ARG5:?}": REQUIRED_ARG5: is not allowed to be unset

... and then I found that my Dockerfile is likely faulty! Because I'm not declaring ARG REQUIRED_ARG5, so I added it;

# syntax=tonistiigi/dockerfile:pr4856

# REQUIRED_ARG1 is required global arg. It is not used in any stage below,
# but I expect it to produce an error if not set.
ARG REQUIRED_ARG1=${REQUIRED_ARG1:?}

# REQUIRED_ARG2 is required global arg. It is overridden in stage two, and
# I expect it to produce an error if not set.
ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}

# REQUIRED_ARG2 is required global arg. It is used in stage three, and I expect
# it to produce an error if not set.
ARG REQUIRED_ARG3=${REQUIRED_ARG3:?}

# REQUIRED_ARG4 is required global arg. It is used in stage four, and I expect
# it to produce an error if not set.
ARG REQUIRED_ARG4=${REQUIRED_ARG4:?}

ARG REQUIRED_ARG5

FROM ubuntu:16.04 AS one
RUN printenv | grep REQUIRED || true

FROM ubuntu:18.04 AS two
RUN printenv | grep REQUIRED || true
ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}
RUN printenv | grep REQUIRED || true

FROM ubuntu:20.04 AS three
RUN printenv | grep REQUIRED || true
ARG REQUIRED_ARG3
RUN printenv | grep REQUIRED || true

FROM ubuntu$REQUIRED_ARG4 AS four
RUN printenv | grep REQUIRED || true

FROM ubuntu${REQUIRED_ARG5:?} AS five
RUN printenv | grep REQUIRED || true
docker build --no-cache --progress=plain  --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3="" --build-arg REQUIRED_ARG4="" --build-arg REQUIRED_ARG5="" --target=one .
...
Dockerfile:37
--------------------
  35 |     RUN printenv | grep REQUIRED || true
  36 |
  37 | >>> FROM ubuntu${REQUIRED_ARG5:?} AS five
  38 |     RUN printenv | grep REQUIRED || true
  39 |
--------------------
ERROR: failed to solve: failed to process "ubuntu${REQUIRED_ARG5:?}": REQUIRED_ARG5: is not allowed to be empty

Note the different error ("is not allowed to be empty"); which is when I realized I used :? in my examples, not ?, so the value has to be both set, and a non-empty value;

echo ${REQUIRED_ARG5?}
bash: REQUIRED_ARG5: parameter null or not set

export REQUIRED_ARG5="" && echo ${REQUIRED_ARG5?}

export REQUIRED_ARG5="" && echo ${REQUIRED_ARG5:?}
bash: REQUIRED_ARG5: parameter null or not set

Re-running the build with a non-empty value made it work;

docker build --no-cache --progress=plain  --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3="" --build-arg REQUIRED_ARG4="" --build-arg REQUIRED_ARG5=":latest" --target=one .
...
#9 exporting to image
#9 exporting layers 0.0s done
...
#9 DONE 0.1s

So that error is correct; however; that also means that the earlier checks seem to be incorrect; all of the global "required" ARG are defined to be "required" (must be set) AND set to a non-empty value;

# REQUIRED_ARG1 is required global arg. It is not used in any stage below,
# but I expect it to produce an error if not set.
ARG REQUIRED_ARG1=${REQUIRED_ARG1:?}

# REQUIRED_ARG2 is required global arg. It is overridden in stage two, and
# I expect it to produce an error if not set.
ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}

# REQUIRED_ARG2 is required global arg. It is used in stage three, and I expect
# it to produce an error if not set.
ARG REQUIRED_ARG3=${REQUIRED_ARG3:?}

# REQUIRED_ARG4 is required global arg. It is used in stage four, and I expect
# it to produce an error if not set.
ARG REQUIRED_ARG4=${REQUIRED_ARG4:?}

Maybe that's because they're not used for stage one, so trying with stage two (which uses REQUIRED_ARG2);

FROM ubuntu:18.04 AS two
RUN printenv | grep REQUIRED || true
ARG REQUIRED_ARG2=${REQUIRED_ARG2:?}
RUN printenv | grep REQUIRED || true

But that also doesn't detect it, and interestingly it also doesn't detect non-empty in the stage itself (which also uses the same);

docker build --no-cache --progress=plain  --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3="" --build-arg REQUIRED_ARG4="" --build-arg REQUIRED_ARG5=":latest" --target=two .
...

#6 [two 1/3] FROM docker.io/library/ubuntu:18.04@sha256:152dc042452c496007f07ca9127571cb9c29697f42acbfad72324b2bb2e43c98
#6 resolve docker.io/library/ubuntu:18.04@sha256:152dc042452c496007f07ca9127571cb9c29697f42acbfad72324b2bb2e43c98 done
#6 sha256:064a9bb4736de1b2446f528e4eb37335378392cf9b95043d3e9970e253861702 0B / 22.71MB 0.2s
#6 sha256:064a9bb4736de1b2446f528e4eb37335378392cf9b95043d3e9970e253861702 7.34MB / 22.71MB 0.3s
#6 sha256:064a9bb4736de1b2446f528e4eb37335378392cf9b95043d3e9970e253861702 22.71MB / 22.71MB 0.4s done
#6 extracting sha256:064a9bb4736de1b2446f528e4eb37335378392cf9b95043d3e9970e253861702
#6 extracting sha256:064a9bb4736de1b2446f528e4eb37335378392cf9b95043d3e9970e253861702 0.3s done
#6 DONE 0.8s

#7 [two 2/3] RUN printenv | grep REQUIRED || true
#7 DONE 0.2s

#8 [two 3/3] RUN printenv | grep REQUIRED || true
#8 0.137 REQUIRED_ARG2=
#8 DONE 0.1s

#9 exporting to image
#9 exporting layers 0.0s done
#9 DONE 0.1s

Trying with stage three which only uses a "plain" ARG REQUIRED_ARG3 (so no additional check in the stage itself); but also no error in that case;

docker build --no-cache --progress=plain  --build-arg REQUIRED_ARG1="" --build-arg REQUIRED_ARG2="" --build-arg REQUIRED_ARG3="" --build-arg REQUIRED_ARG4="" --build-arg REQUIRED_ARG5=":latest" --target=three .
...

#9 [three 2/3] RUN printenv | grep REQUIRED || true
#9 DONE 0.1s

#10 [three 3/3] RUN printenv | grep REQUIRED || true
#10 0.128 REQUIRED_ARG3=
#10 DONE 0.1s

#11 exporting to image
#11 exporting layers 0.0s done
...
#11 DONE 0.1s

So it looks like there's 2 issues (maybe 3);

  • ❌ Args in the global scope don't validate for non-empty values
  • ❌ When used as part of a stage, non-empty values are also not checked for
  • ☝️ however, this check DOES happen when used in FROM
  • ❌ if FROM uses a build-arg that is not defined, no error is produced; trying to set the build-arg doesn't produce an error, and is treated the same as "not set"

I can open a separate ticket for that last one if you think it's unrelated.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi
Copy link
Member Author

@daghack You can check again as I rebased against your recent changes.

@tonistiigi tonistiigi requested a review from crazy-max May 18, 2024 00:17
@tonistiigi tonistiigi added this to the v0.14.0 milestone May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dockerfile: required ARG not validated in global space
3 participants