Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PROPOSAL - Dockerfile Linting #4823

Open
colinhemmings opened this issue Apr 3, 2024 · 5 comments
Open

PROPOSAL - Dockerfile Linting #4823

colinhemmings opened this issue Apr 3, 2024 · 5 comments
Assignees
Milestone

Comments

@colinhemmings
Copy link
Collaborator

Enhance the Dockerfile frontend to support evaluating a Dockerfile for common issues and violations of best practices. We believe making people aware of these issues as soon as possible will save a lot of time investigating and help them learn how to write better Dockerfiles.

We see many people struggling to create a well-structured Dockerfile that is maintainable and they commonly run into issues because they are not following the best practices for writing Dockerfiles, such as:

  • Consistent casing for instructions, stage names, and flags
  • The correct use for relative WORKDIR
  • Using secrets in ARGs
  • Use of undeclared ARG in FROM
  • and many more

The Dockerfile frontend is a great place to support evaluating these rules because it understands the structure of the Dockerfile and inherited base images. This update would support evaluating rules during and catching any issues during CI, as well as support dry-run evaluation in the local environment.

This initial proposal would have two components:

  • Defining an agreed list of rules and their severity
  • Implementing the evaluation logic during a build and as a dry run without fully executing a build.

Here is a draft PR for the proposal

Please add your thoughts on the proposal, including suggestions or additional functionality you would like to see.

Initial Proposed Rules List

This is a list of initial Dockerfile rules.

1. stage_ordering

Stages should be correctly ordered.

2. format_empty_continuation_line_disallowed

Empty continuation line disallowed. Reduces readability and may lead to unexpected behavior.

3. instruction_casing

Instructions should be written in consistent casing. Improves readability

4. flag_casing

Flags should be written in lowercase. Improves readability

5. stage_casing

Stage names should written in lowercase. Improves readability

6. workdir_relative_path

Relative WORKDIR must be preceded by an absolute WORKDIR. A relative WORKDIR can have unexpected results if not first defining an absolute WORKDIR and base image changes.

7. arg_secrets_disallowed

Passing secrets via ARGs is not allowed It’s not recommended to use credentials in arguments because they are visible in the docker history.

8. arg_undeclared

Usage of undeclared ARG in FROM

9. arg_invalid_reference

Invalid ARG reference. All FROM with default ARG values should make valid references.

10. arg_invalid_reference_typo

Invalid ARG reference. Did you mean ?

Unused arg value passed that looks like a typo of defined arg

11. entrypoint_json_args_required

JSON arguments are required for CMD| ENTRYPOINT unless using the custom shell.

Shell form prevents OS signals from being correctly passed to the executables.

12. cmd_json_args_required

JSON arguments required for CMD| ENTRYPOINT, unless using custom shell.

Shell form prevents OS signals from being correctly passed to the executables.

13. multiple_heathcheck_disallowed

Multiple HEALTHCHECK instructions are disallowed. Avoid multiple definitions

14. multiple_cmd_disallowed

Multiple CMD instructions are disallowed. Avoid multiple definitions

15. multiple_entrypoint_disallowed

Multiple ENTRYPOINT instructions are disallowed. Avoid multiple definitions

16. stage_self_link_disallowed

Linking to current stage name is disallowed

17. stage_name_duplicate_disallowed

Duplicate stage name . Stage names should be unique.

18. stage_reserve_name_disallowed

Reserved stage name "scratch," "context.”

19. from_platform_flag_const_disallowed

--platform flag should not be used with a constant value unless the stage name also contains os or arch

20. from_platform_default_disallowed

Unnecessary --platform=$TARGETPLATFORM flag, this is the default behavior.

@tonistiigi
Copy link
Member

As pointed out by @tianon docker/cli#2743 (comment) , we have 2 "deprecated-not-really-deprecated" Dockerfile features in https://github.com/docker/cli/blob/master/docs/deprecated.md that should also get a linter warning.

https://github.com/docker/cli/blob/v26.0.0/docs/deprecated.md#dockerfile-legacy-env-name-value-syntax
https://github.com/docker/cli/blob/v26.0.0/docs/deprecated.md#maintainer-in-dockerfile

@Speeddymon
Copy link

Speeddymon commented Apr 11, 2024

Encourage using build stages, and using scratch images for build target stages? Ref: #4834 regarding a requirement for designating build target stages within Dockerfile

@Speeddymon
Copy link

  1. arg_invalid_reference_typo

Idea:

Detect if any declared vars (ARG or ENV) are too similarly named that it makes the code more difficult to follow

  1. cmd_json_args_required

Idea:

JSON arguments should be required for RUN when no variables are used in the RUN command.

@tonistiigi
Copy link
Member

Encourage using build stages, and using scratch images for build target stages?

This way too opinionated and app-specific. Nothing wrong in using FROM scratch but also nothing wrong in using a base image, and in many cases using base image is the right or only possible/correct solution.

JSON arguments should be required for RUN when no variables are used in the RUN command.

Why?


Based on some content in #4834 , I'd propose a rule that detects if there is a unreachable stage that is not named and not used in any of the --from flags by index. Not sure if there are many such Dockerfiles out there, but if there are we could detect them.

@Speeddymon
Copy link

Speeddymon commented Apr 13, 2024

This way too opinionated and app-specific.

It's an encouragement, so IMHO it would be like an INFO level of importance. Not sure if you've considered having different priorities/severities/levels of findings, but many general purpose linting tools tend to have some sort of system for grouping findings by importance. If that were to be the case here, I'd consider it valid to have it at a low importance that people could easily ignore.

Why?

If a run command is short and takes no variables and doesn't depend on a shell in any way, why require the shell to be called up by Buildkit to run the command in the layer? Simple commands like useradd someuser where the user is not a variable; those commands don't need to be called by a shell, so why call it through the shell? It's simple enough to change it from RUN useradd someuser to RUN ["/usr/bin/useradd", "someuser"] and save a shell invocation. This may also be important if there isn't a shell in the given layer (nor mounted in for use) where the RUN command is being used.

@thompson-shaun thompson-shaun added this to the v0.14.0 milestone May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants