Build Docker image and push to GHCR #230

Open
br3ndonland wants to merge 10 commits into unstable/v1

@br3ndonland br3ndonland commented Apr 19, 2024

Description

Closes #58

Up to this point, the project has been set up as a Docker action referencing the Dockerfile.

runs:
  using: docker
  image: Dockerfile

The downside to using the Dockerfile for the action is that the Docker image must be built every time the action is used (#58).

This PR will set up the project to build the Docker image and push it to GitHub Container Registry (GHCR). This change will speed up user workflows every time the action is used because the workflows will simply pull the Docker image from GHCR instead of building again.

Changes

Build container image with GitHub Actions

This PR will build Docker images with the Docker CLI (docker build). Builds will include inline cache metadata so layers can be reused by future builds.

This PR only proposes to build container images for x86_64 (linux/amd64) because GitHub Actions Linux runners currently only support x86_64 CPU architectures (actions/runner-images#5631), and this project only supports GitHub Actions Linux runners. The README explains:

Since this GitHub Action is docker-based, it can only be used from within GNU/Linux based jobs in GitHub Actions CI/CD workflows. This is by design and is unlikely to change due to a number of considerations we rely on.

Push container image to GHCR

The workflow will log in to GHCR using the built-in GitHub token and push the Docker image. Workflow runs triggered by pull requests will build the Docker image and run the smoke tests but will not push the Docker image.

Update action to pull container image from GHCR

Docker actions support pulling in pre-built Docker images by supplying a registry address to the image: key. The downside to this syntax is that there's no way to specify the correct Docker tag because the GitHub Actions image: and uses: keys don't accept any context. For example, if a user's workflow has uses: pypa/gh-action-pypi-publish@release/v1.8, then the action should pull in a Docker image built from the release/v1.8 ref, something like ghcr.io/pypa/gh-action-pypi-publish:release-v1.8 (Docker tags can't have /).

# this works but the image tag can't be customized
runs:
  using: docker
  image: docker://ghcr.io/pypa/gh-action-pypi-publish:release-v1.8

# this doesn't work because `image:` doesn't support context
runs:
  using: docker
  image: docker://ghcr.io/pypa/gh-action-pypi-publish:${{ github.action_ref }}

The workaround is to switch the top-level action.yml to a composite action that then calls the Docker action, substituting the correct image name and tag.
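
A minimal Python sketch of the tag substitution described above, assuming the only transformation needed is turning `/` into `-` in the Git ref. The function name `ghcr_image` is hypothetical and not part of the action; the repository and ref are the ones from the example above.

```python
def ghcr_image(repo: str, ref: str) -> str:
    """Return a docker:// reference for the image built from the given Git ref.

    Docker tags can't contain "/", so every "/" in the ref becomes "-".
    """
    tag = ref.replace("/", "-")
    return f"docker://ghcr.io/{repo}:{tag}"


if __name__ == "__main__":
    print(ghcr_image("pypa/gh-action-pypi-publish", "release/v1.8"))
    # docker://ghcr.io/pypa/gh-action-pypi-publish:release-v1.8
```

Note that `str.replace` substitutes every `/`, so a ref containing several slashes would still produce a tag without any.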

Related

@webknjaz
Member

This looks... intriguing! I don't remember if I ever considered combining composite+docker actions (I did play with having two composites in the same repo in the past, though).

I'll need to take some time to think about it and look through the patch more closely. Please don't expect an immediate review; however, it does look very promising at a glance!

Originally, I thought I'd have a workflow where I trigger a release; that release would add a commit that hardcodes the "future" version tag in action.yml, tag that commit, and push it (after the Docker publish). The commit wouldn't be on the main branch; the tags would point to orphaned leaves.

This looks like a better idea so far. Thanks again!

br3ndonland added a commit to br3ndonland/gh-action-pypi-publish that referenced this pull request Apr 26, 2024
@br3ndonland
Author

That sounds great. Take your time. Thanks for your consideration.

If you do decide to accept this change, I'm happy to help maintain the workflows in the future. Feel free to mention me @br3ndonland and I will help address any issues that come up.

Up to this point, the project has been set up as a Docker action
referencing the Dockerfile. The downside to using the Dockerfile for the
action is that the Docker image must be built every time the action is
used.

This commit will set up the project to build the Docker image and push
it to GitHub Container Registry (GHCR). This change will speed up user
workflows every time the action is used because the workflows will
simply pull the Docker image from GHCR instead of building again.

Changes:

- Add required metadata to Dockerfile
- Build container image with GitHub Actions
- Push container image to GHCR

Docker actions support pulling in pre-built Docker images. The downside
is that there's no way to specify the correct Docker tag because the
GitHub Actions `image:` and `uses:` keys don't accept any context.
For example, if a user's workflow has
`uses: pypa/gh-action-pypi-publish@release/v1.8`, then the action should
pull in a Docker image built from the `release/v1.8` branch, something
like `ghcr.io/pypa/gh-action-pypi-publish:release-v1.8` (Docker tags
can't have `/`). The workaround is to switch the top-level `action.yml`
to a composite action that then calls the Docker action, substituting
the correct image name and tag.
@webknjaz
Member

Thanks! I've hit "rebase" on the UI to get this on top of the recent changes/linting/lockfile bumps but haven't yet looked into it deeper.

          --tag $IMAGE
      - name: Log in to GHCR
        if: github.event_name != 'pull_request'
        run: |
Member

I usually prefer this syntax so that the whole thing is parsed as a single line:

Suggested change
-        run: |
+        run: >-

Author

Generally I don't use run: >- because it will fold the contents of the block into a single line. This can alter the behavior of the script within the block. On this line though, it makes no difference, so sure.

Comment on lines 36 to 44
args:
- ${{ inputs.user }}
- ${{ inputs.password }}
- ${{ inputs.repository-url }}
- ${{ inputs.packages-dir }}
- ${{ inputs.verify-metadata }}
- ${{ inputs.skip-existing }}
- ${{ inputs.verbose }}
- ${{ inputs.print-hash }}
Member

These aren't actually used. Somebody contributed them, but I don't see them being needed. So perhaps we shouldn't keep them around anymore.

Author

Aren't the inputs used by the entrypoint script?

INPUT_REPOSITORY_URL="$(get-normalized-input 'repository-url')"
INPUT_PACKAGES_DIR="$(get-normalized-input 'packages-dir')"
INPUT_VERIFY_METADATA="$(get-normalized-input 'verify-metadata')"
INPUT_SKIP_EXISTING="$(get-normalized-input 'skip-existing')"
INPUT_PRINT_HASH="$(get-normalized-input 'print-hash')"

Comment on lines 7 to 9
push:
  branches: ["release/*", "unstable/*"]
  tags: ["*"]
Member

When we push the tag first, there's going to be some time when the new version is out already, but the container doesn't actually exist yet. Therefore, I prefer building my release automation processes around the workflow_dispatch trigger, so that I'm able to type in the target version in the UI, get the image published and only then push a tag, only when it's known that the image exists already. Let's try reworking this with similar philosophy in mind.

Author

Thanks for clarifying. I've replaced the tags trigger with workflow_dispatch instead (049447a).

pull_request:
workflow_run:
Member

Two separate workflow runs are often hard to track. Instead, I've adopted a practice of modularizing the workflow pieces as reusable workflows with the reusable- prefix in their names. This allows embedding everything in all the right places. Let's try this, WDYT?

Author

I will return to this suggestion at a later time.

-        uses: actions/checkout@v3
-        with:
-          path: test
+        uses: actions/checkout@v4
Member

Can we not make version upgrades in the same PR? And I'd rather not change the testing logic/dirs unless it's required somehow.

Author

Ok, I'll stick with actions/checkout@v3 and keep the previous test path.

@@ -31,7 +31,7 @@ repos:
         args:
         - --builtin-schema
         - github-workflows-require-timeout
-        files: ^\.github/workflows/[^/]+$
+        files: ^\.github\/workflows/[^/]+$
Member

Could you submit this separately?

Author

I've removed this change. The existing regex (^\.github/workflows/[^/]+$) seems like it should work fine, but pre-commit was matching files in .github/actions/ and raising errors. Dynamically generating the entire action.yml as you suggested will avoid the problem.
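
A quick check of the pattern in isolation, sketched in Python and assuming pre-commit applies the `files` pattern with `re.search`. The workflow file names are hypothetical; the composite action path is the one that appears elsewhere in this review. It only shows what the regex itself selects and does not identify why pre-commit reported matches under .github/actions/.

```python
import re

# Both spellings of the pattern from the diff above; "\/" is just an escaped "/".
ORIGINAL = re.compile(r"^\.github/workflows/[^/]+$")
ESCAPED = re.compile(r"^\.github\/workflows/[^/]+$")

# Hypothetical paths for illustration, apart from the composite action path.
PATHS = [
    ".github/workflows/build.yml",
    ".github/workflows/nested/file.yml",
    ".github/actions/run-docker-container/action.yml",
]


def matching(pattern: re.Pattern, paths: list) -> list:
    """Return the paths the pattern selects, using search() on each path."""
    return [path for path in paths if pattern.search(path)]


if __name__ == "__main__":
    print(matching(ORIGINAL, PATHS))
    print(matching(ESCAPED, PATHS))
```

Both patterns select the same paths here, which is consistent with the escaped slash making no difference to the match.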

Dockerfile Outdated
@@ -3,6 +3,7 @@ FROM python:3.12-slim
 LABEL "maintainer" "Sviatoslav Sydorenko <wk+pypa@sydorenko.org.ua>"
 LABEL "repository" "https://github.com/pypa/gh-action-pypi-publish"
 LABEL "homepage" "https://github.com/pypa/gh-action-pypi-publish"
+LABEL "org.opencontainers.image.source" "https://github.com/pypa/gh-action-pypi-publish"
Member

I'd accept this in a separate PR even before the rest is figured out/refactored.

Author

Moved to separate PR (#241).

steps:
  - name: Reset path if needed
    run: |
      # Reset path if needed
Member

Why would this be needed outside the container?

Author

Because you've set up a test that modifies the $PATH (#112, 1350b8b).

- name: ✅ Smoke-test the locally checked out action
  uses: ./test
  env:
    DEBUG: >-
      true
    PATH: utter-nonsense

Comment on lines +107 to +114
    REF=${{ env.ACTION_REF || github.ref_name }}
    REPO=${{ env.ACTION_REPO || github.repository }}
    echo "ref=$REF" >>"$GITHUB_OUTPUT"
    echo "repo=$REPO" >>"$GITHUB_OUTPUT"
  shell: bash
  env:
    ACTION_REF: ${{ github.action_ref }}
    ACTION_REPO: ${{ github.action_repository }}
Member

Isn't it easier to do it like this?

Suggested change
-    REF=${{ env.ACTION_REF || github.ref_name }}
-    REPO=${{ env.ACTION_REPO || github.repository }}
-    echo "ref=$REF" >>"$GITHUB_OUTPUT"
-    echo "repo=$REPO" >>"$GITHUB_OUTPUT"
-  shell: bash
-  env:
-    ACTION_REF: ${{ github.action_ref }}
-    ACTION_REPO: ${{ github.action_repository }}
+    REF=${{ github.action_ref || github.ref_name }}
+    REPO=${{ github.action_repository || github.repository }}
+    echo "ref=${REF}" >>"${GITHUB_OUTPUT}"
+    echo "repo=${REPO}" >>"${GITHUB_OUTPUT}"
+  shell: bash

Author

You would think so, but the syntax is needed to work around a limitation of composite actions. See:

actions/runner#2473
github/docs#25336 (comment)
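
For illustration only, the fallback in the snippet under discussion can be sketched in Python. `ACTION_REF` and `ACTION_REPO` are the variables from the step's `env:` block; `GITHUB_REF_NAME` and `GITHUB_REPOSITORY` stand in here for `github.ref_name` and `github.repository`. The function name `resolve_ref_and_repo` is hypothetical.

```python
import os
from typing import Mapping, Tuple


def resolve_ref_and_repo(env: Mapping[str, str]) -> Tuple[str, str]:
    """Prefer the action's own ref and repo, falling back to the workflow's.

    `or` mirrors `||` in the expression: an unset or empty value falls through.
    """
    ref = env.get("ACTION_REF") or env.get("GITHUB_REF_NAME", "")
    repo = env.get("ACTION_REPO") or env.get("GITHUB_REPOSITORY", "")
    return ref, repo


if __name__ == "__main__":
    print(resolve_ref_and_repo(os.environ))
```

The point of routing the values through `env:` is that the fallback is evaluated on whatever the step actually receives, which is what the linked issues are about.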

action.yml Outdated
Comment on lines 115 to 131
- name: Set Docker image name and tag
  run: |
    # Set Docker image name and tag
    # if action run was triggered by a pull request to this repo,
    # build image from Dockerfile because it has not been pushed to GHCR,
    # else pull image from GHCR
    if [[ $GITHUB_EVENT_NAME == "pull_request" ]] &&
      [[ $GITHUB_REPOSITORY == "pypa/gh-action-pypi-publish" ]]; then
      IMAGE="../../../Dockerfile"
    else
      REF=${{ steps.set-repo-and-ref.outputs.ref }}
      REPO=${{ steps.set-repo-and-ref.outputs.repo }}
      IMAGE="docker://ghcr.io/$REPO:${REF/'/'/'-'}"
    fi
    FILE=".github/actions/run-docker-container/action.yml"
    sed -i -e "s|{{image}}|$IMAGE|g" "$FILE"
  shell: bash
Member

So I was thinking… Why do we need to check in action.yml to Git even if this modifies it anyway? Why not generate it all, then?

Also, I'd use Python instead of Bash here. Then, it'd be possible to have a dict with data and write it as YAML using json.dump() (because JSON is valid YAML, almost always).
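
A rough sketch of that idea, not the project's actual script: build the action definition as a dict and serialize it with the standard-library `json` module, relying on the JSON output being parseable as YAML. The function name `render_docker_action` and the `name` and `description` values are placeholders, and a real action definition would also need its inputs and args, which are omitted here.

```python
import json


def render_docker_action(image: str) -> str:
    """Serialize a minimal Docker action definition as JSON text."""
    action = {
        # Placeholder metadata, not the project's real values.
        "name": "Run Docker container",
        "description": "Generated Docker action (illustrative placeholder)",
        "runs": {"using": "docker", "image": image},
    }
    return json.dumps(action, indent=2) + "\n"


if __name__ == "__main__":
    rendered = render_docker_action(
        "docker://ghcr.io/pypa/gh-action-pypi-publish:release-v1.8"
    )
    print(rendered, end="")
```

Writing the returned string to an action.yml path would complete the generation step; no YAML library is needed for the write side.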

Author

That could work. I've added a Python script that generates the Docker action (9ae1850).

Regarding your related comment:

I think that generating the file is a good idea. It should be possible to write the file without bringing in the PyYAML dependency. But it's not that easy for reading it. Can we make use of yq somehow, and convert YAML to JSON this way, maybe?

I've started by using PyYAML but will think about a way to do this with yq.

user: ${{ inputs.user }}
password: ${{ inputs.password }}
repository-url: ${{ inputs.repository-url || inputs.repository_url }}
packages-dir: ${{ inputs.packages-dir || inputs.packages_dir }}
Member

This would break the deprecation messages, it seems. Have you checked?

Author

I don't think so. The new inputs don't have defaults, so for example, ${{ inputs.repository-url || inputs.repository_url }} will default to inputs.repository_url and deprecationMessage will be logged.

@webknjaz
Member

I'm done with the initial review. More is needed, but I'd rather accept what I can through separate PRs to make this one smaller. And the suggested refactoring could be done in parallel. I think that generating the file is a good idea. It should be possible to write the file without bringing in the PyYAML dependency. But it's not that easy for reading it. Can we make use of yq somehow, and convert YAML to JSON this way, maybe?

@br3ndonland
Author

@webknjaz thank you for your detailed review. I've addressed most of your comments so far.

Successfully merging this pull request may close these issues.

Build&publish the base container to GHCR + point to it from action