-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add initial container file and github build action #54575
base: main
Are you sure you want to change the base?
Conversation
919c509
to
13f476f
Compare
Note: the
|
container/Containerfile
Outdated
# Copr repos | ||
# TODO: afaict, scikit is for mgr-diskpredition-local, asyncssh for cephadm, but why do we | ||
# encode these as deps only in container instead of including them as a vendored | ||
# submodule in the ceph repo or something to be included with ceph packages? | ||
# @tchaikov @torchiaf @adk3798 @ktdreyer? | ||
# ref: https://github.com/ceph/ceph-container/pull/1821 | ||
# note: scikit-learn packages are not included in ubi8 in ceph-container | ||
RUN \ | ||
dnf install -y --setopt=install_weak_deps=False dnf-plugins-core && \ | ||
dnf copr enable -y tchaikov/python-scikit-learn && \ | ||
dnf copr enable -y tchaikov/python3-asyncssh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For asyncssh at least, the cephadm mgr module needs it. I think we couldn't find any packaging of it in official repos so Kefu had this copr repo set up. We should maybe host this under the copr of somebody more involved in the project at this point, but we still need it somewhere.
Edit - didn't read the full comment first.
I assume keeping it in a copr repo is just easier than having it as a submodule, but I didn't really look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for weighing in. It's not a big issue, but it seemed worth asking since I'm not sure how hard it is to set up and maintain copr repos. Regardless, it'll be something to changed later down the line if changing it makes sense.
container/Containerfile
Outdated
# General | ||
PACKAGES="ca-certificates" && \ | ||
# Ceph | ||
# TODO: I removed these packages that were included in ganesha packages: | ||
# gcc, lua-devel, luarocks | ||
# They are listed as BuildRequires in ceph.spec.in, so I assume they are unnecessary | ||
# @mattbenjamin please advise about whether these are necessary for runtime | ||
PACKAGES="$PACKAGES \ | ||
ceph-common${CEPH_PACKAGE_VERSION} \ | ||
ceph-exporter${CEPH_PACKAGE_VERSION} \ | ||
ceph-grafana-dashboards${CEPH_PACKAGE_VERSION} \ | ||
ceph-immutable-object-cache${CEPH_PACKAGE_VERSION} \ | ||
ceph-mds${CEPH_PACKAGE_VERSION} \ | ||
ceph-mgr-cephadm${CEPH_PACKAGE_VERSION} \ | ||
ceph-mgr-dashboard${CEPH_PACKAGE_VERSION} \ | ||
ceph-mgr-diskprediction-local${CEPH_PACKAGE_VERSION} \ | ||
ceph-mgr-k8sevents${CEPH_PACKAGE_VERSION} \ | ||
ceph-mgr-rook${CEPH_PACKAGE_VERSION} \ | ||
ceph-mgr${CEPH_PACKAGE_VERSION} \ | ||
ceph-mon${CEPH_PACKAGE_VERSION} \ | ||
ceph-osd${CEPH_PACKAGE_VERSION} \ | ||
ceph-radosgw${CEPH_PACKAGE_VERSION} \ | ||
ceph-volume${CEPH_PACKAGE_VERSION} \ | ||
cephfs-mirror${CEPH_PACKAGE_VERSION} \ | ||
cephfs-top${CEPH_PACKAGE_VERSION} \ | ||
kmod \ | ||
libradosstriper1${CEPH_PACKAGE_VERSION} \ | ||
rbd-mirror${CEPH_PACKAGE_VERSION} \ | ||
$(cat crimson-packages.txt) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattbenjamin could you advise if gcc, lua-devel, or luarocks packages are needed in the ceph container image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added those back in here. Left gcc
removed for now.
# Ceph-CSI | ||
# TODO: coordinate with @Madhu-1 to have Ceph-CSI install these itself if unused by ceph | ||
# @adk3798 does cephadm use these? | ||
PACKAGES="$PACKAGES \ | ||
attr \ | ||
ceph-fuse${CEPH_PACKAGE_VERSION} \ | ||
rbd-nbd${CEPH_PACKAGE_VERSION} \ | ||
" && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Madhu-1 @adk3798 let me know if you have thoughts around this. It seems to me like keeping the ceph-fuse and rbd-nbd drivers in the ceph image makes sense, even for non-CSI images. But if you have differing thoughts, let's discuss.
Also, @Madhu-1 is attr
still needed, and if so, do you think it's broadly needed or only for Ceph-CSI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cephadm doesn't use these directly at least. Can't say with certainty for any of the daemons cephadm deploys that use the ceph container image though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving these for now.
container/Containerfile
Outdated
# TODO: Per discussion, sg3_utils should be moved to ceph.spec.in with libstoragemgmt; @ktdreyer | ||
# ref: https://github.com/ceph/ceph-container/pull/2013#issuecomment-1248606472 | ||
PACKAGES="$PACKAGES \ | ||
gdisk \ | ||
hostname \ | ||
procps-ng \ | ||
sg3_utils \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktdreyer FYI, it seems like there will be a follow-up after this to mark sg3_utils as an RPM "Requires".
container/Containerfile
Outdated
|
||
# TODO: Is this still used? Can we get rid of this? @adk3798 do you know? | ||
# ref: https://github.com/ceph/ceph-container/pull/1604 | ||
# Is a ceph container ? | ||
LABEL ceph="True" | ||
|
||
# TODO: This doesn't seem to be used by rook or ceph. Can we get rid of this? @guits do you know? | ||
# ref: https://github.com/ceph/ceph-container/pull/1969 | ||
ENV I_AM_IN_A_CONTAINER 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cephadm doesn't use these directly. It might link somehow to the CEPH_USE_RANDOM_NONCE stuff though. Cephadm's use of that was just recently removed in #50344
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a search of the ceph repo, I didn't find any references to I_AM_IN_A_CONTAINER, but I'm not sure if there could be usages I'd miss by searching like that.
Would @jdurgin know? Or maybe know who I should be asking to find out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might have been just for humans and/or custom scripting in the container. It's good to be able to query, although maybe there's some more-standard way?...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are new org.opencontainers
standardized labels added now that should be good for querying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a process running inside a container query the container labels?
13f476f
to
c2dbb47
Compare
tags: ${{ steps.container-build.outputs.tags }} | ||
registry: quay.io/ceph/ceph-staging | ||
username: ceph+ceph_staging_ci | ||
password: ${{ secrets.CEPH_STAGING_QUAY_TOKEN }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guits this variable seems to be empty, causing this build-push to fail. Is this token misspelled maybe, or is it not created as an "Actions" secret in the repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's well an action secret CEPH_STAGING_QUAY_TOKEN
, i've just reset its value to make sure.
c2dbb47
to
f216c47
Compare
# CLEAN UP! | ||
RUN set -x && \ | ||
dnf clean all && \ | ||
rm -rf /var/cache/dnf/* && \ | ||
rm -rf /var/lib/dnf/* && \ | ||
rm -f /var/lib/rpm/__db* && \ | ||
# remove unnecessary files with big impact | ||
rm -rf /etc/selinux /usr/share/{doc,man,selinux} && \ | ||
# don't keep compiled python binaries | ||
find / -xdev \( -name "*.pyc" -o -name "*.pyo" \) -delete | ||
|
||
# Verify that the packages installed haven't been accidentally cleaned, then | ||
# clean the package list and re-clean unnecessary RPM database files | ||
RUN rpm -q $(cat packages.txt) && rm -f /var/lib/rpm/__db* && rm -f *packages.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revising the clean up from ceph-container, I am seeing an ARM image that is 1.07GB on my machine, versus 1.25GB to 1.28GB for other quay.io/ceph
images I've pulled.
Of note, the parens in the find ... -delete
command fix a bug where *.pyc
files weren't being deleted before. Plus removing unnecessary DNF and RPM database files.
I decided to leave bash completion files (which have changed locations apparently) in the image because they are less than 1MB in total.
b0861b2
to
b45435e
Compare
the squid release will be dropping support for centos 8, and there's a need a build containers based on centos 9. there's been some discussions around this in #53901 and elsewhere on slack. if we're starting from scratch here, it would be wonderful if we could start with centos 9 instead |
I agree that it would be nice, but I (or someone) would need to spend time making sure all the el9 locations are ready. Ideally quincy, reef, squid, and main all have el9 builds, and I'm not sure that's true today or not. Can we at least get this merged and then convert to el9 later? This PR is clean and good to go from my end, but I'm waiting to hear back from @guits about checking the secret so that the quay login works. [update] Just realized that this is probably not a configuration issue, but a security thing. GitHub doesn't allow secrets to be used from pull requests so that users can't "exfiltrate" secrets by making arbitrary PRs. This means that the GH action is good for builds from committed PRs, but not viable for building/publishing images related to pull requests. I guess the shaman system would be good for that, and it seems like something is already set up to build the quay.io/ceph-ci images like that. @dmick does that sound right? |
a98c31c
to
3361853
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not viable for building/publishing images related to pull requests
Yes, that's dangerous in general to run privileged code automatically on anyone's forked PR.
I think what you have here is fine and we should merge it.
this has an approval. does it need any further review/testing to merge? afaik nothing depends on this yet so there's little risk to merging what are the next steps? |
"This means that the GH action is good for builds from committed PRs, but not viable for building/publishing images related to pull requests. I guess the shaman system would be good for that, and it seems like something is already set up to build the quay.io/ceph-ci images like that. @dmick does that sound right?" I can't speak to the GH actions; I don't understand anything about the context they run in, or how one might possibly handle secrets safely. I know that things like those secrets are stored in Jenkins, which is what does the CI builds (shaman and chacra are just storage/db for the build results). I do want to point out however that we generally don't push CI images to quay.io, but rather our own quay.ceph.io (quay.io has usage limits and we do a lot of builds). |
This is only an initial implementation. It is not expected to work for any branch except for Ceph 'main'. The build action is also disabled for any pull request except ones that modify the container build files themselves to make sure that any issues do not cause confusion with ongoing or future feature PRs. This PR focuses on creating a good starting point for a "Containerfile" definition, tooling for building multi-architecture images using it, and pushing the image to a temporary repository: quay.io/ceph/ceph-staging. There are many TODO items noted in the files that should be followed up on in future PRs. Signed-off-by: Blaine Gardner <blaine.gardner@ibm.com>
2d46335
to
9b5837b
Compare
9b5837b
to
6c8afc2
Compare
RUN \ | ||
dnf install -y --setopt=install_weak_deps=False dnf-plugins-core && \ | ||
dnf copr enable -y tchaikov/python-scikit-learn && \ | ||
dnf copr enable -y tchaikov/python3-asyncssh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the tchaikov/python3-asyncssh copr here (and the corresponding mention in the comment above). The python-asyncssh package is in epel8 and epel9 now: https://src.fedoraproject.org/rpms/python-asyncssh
In fact the EPEL 9 version will have an important bug fix (rhbz#2255043 / CVE-2023-48795) soon.
Use of cephadm requires a suitable container image to be available; upstream's are based on centos. This is an attempt at a minimal set for our intended purposes (i.e. no ganesha / mds etc.) Based on upstream work in ceph/ceph#54575 which in turn is a move to simplify how upstream builds their containers. This is for the PoC ceph/reef multisite cluster, codenamed "apus". Bug: T279621 Change-Id: I256acec29b5c50eff266690e15db066470038aa7
I've been working on github.com/dmick/ceph-buildah which uses the buildah utility to build the containers. The huge advantage is that it allows a series of shell commands to make incremental changes to a working container, and then commits the final result, so all the conditional logic can be coded in the toplevel shell command, which is a huge help at bug diagnosis. I think I very much want to go this direction rather than a Dockerfile for the revamp. I've been building containers of centos8 and centos9 x86 and arm and having some luck; I want to do that formally and run some formal container-based tests (probably the rbd iscsi tests to catch "is the iscsi inclusion good as well), and then I think I'd like to propose integrating that into ceph.git rather than this. |
Red Hat's build system (OSBS) input is a single |
For debugging, I like to use the |
oh well that might be a good argument for dockerfiles I guess. crap. Maybe modern dockerfiles are less painful to do this ridiculous handstanding in. |
|
||
# Other labels are set automatically by container/build github action | ||
# See: https://github.com/opencontainers/image-spec/blob/main/annotations.md | ||
LABEL org.opencontainers.image.authors="Guillaume Abrioux <gabrioux@redhat.com>" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guits are you the maintainer of this image?
@dmick I was pointed at this conversation during the most recent infra meeting. I am also interested in the topic and am happy to discuss any challenges you are hitting wrt building images: esp. wrt "this ridiculous handstanding" might entail :-) I also am a fan of buildah, but as Ken notes it's not so widespread and more niche than Dockerfiles. |
Mostly it's "if somehow we end up with another 200-line continued bash statement I will descend upon the author with a mighty and terrible force" but I think multiple RUN statements and --squash are the way out of that horror. ARG and short conditional bash are probably maintainable/usable. |
lol :-)
That seems fine. I also dislike having a bunch of logic on a single RUN line. If the logic is necessary I'll often tend to farm the work out to a real script outside the {Dock,Contain}erfile. |
In our project we've run into several problems around CentOS Stream, causing us to look for Debian based Ceph container images. As we could not find suitable ones, we are now looking at building them ourselves and would like to use the code from this PR as base instead of the one based on Make (I arrived here through issue 2171). So I would like to ask what state this PR is in? I see some open questions / work-in-progress-notes while I do not find anything which, at least to me, looks like testing, e.g. compared with the previous target test.staging (though I only just started reading through that code, so no idea what it actually does/tests, yet). |
More work is needed; I don't know how far this is from final form yet. But container building isn't particularly magic, and if you end up at a slightly different place than we do (besides the obvious 'based on Debian' thing, which will involve a fair bit of package name munging as I'm sure you know), as long as you have the dependencies you need it'll probably be fine. I have a separate repo I've been working in, dmick/container.git, that might or might not be a better starting place. |
@@ -0,0 +1,213 @@ | |||
FROM quay.io/centos/centos:stream8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better use :stream9
as CentOS 8 Stream is EOL at the end of this month.
This is only an initial implementation. It is not expected to work for any branch except for Ceph 'main'. The build action is also disabled for any pull request except ones that modify the container build files themselves to make sure that any issues do not cause confusion with ongoing or future feature PRs.
This PR focuses on creating a good starting point for a "Containerfile" definition, tooling for building multi-architecture images using it, and pushing the image to a temporary repository: quay.io/ceph/ceph-staging.
There are many TODO items noted in the files that should be followed up on in future PRs.
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
x
between the brackets:[x]
. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e