Prepare 0.30.0 release #1731

Open
19 of 20 tasks
tkatila opened this issue May 7, 2024 · 8 comments

@tkatila
Contributor

tkatila commented May 7, 2024

Checklist:

  • run validation on main
    • QAT (generic)
    • GNR (IAA, DSA)
    • GNR-D (QAT)
    • FPGA
    • SPR (SGX, QAT, GPU, IAA, DSA)
  • Make sure kube-rbac-proxy is the latest version
  • create release-0.30 branch
  • release branch changes
    • edit default_labels.docker + make dockerfiles
    • make set-version TAG=0.30.0 + commit
    • update publish.yml to create docs for v0.30
  • draft release notes, review
  • publish release
  • main branch changes
    • update base README for supported versions and docs URL
    • update main branch's operator CRs to point to 0.30, also reconciler.go
  • update helm chart: PR
    • Make sure to update CRDs and README
  • update operatorhub.io bundle
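
The version-derived names in the checklist (the `release-0.30` branch, the `v0.30` docs, the `0.30.0` tag) follow one pattern. A minimal sketch of that pattern, assuming tags always have the form `MAJOR.MINOR.PATCH`; the helper `release_names` is hypothetical and not part of the repository's tooling:

```python
def release_names(tag: str) -> dict:
    """Derive the branch and docs names used in the checklist from a release tag.

    Assumes a three-part tag such as "0.30.0" (hypothetical helper, for illustration).
    """
    major, minor, _patch = tag.split(".")
    short = f"{major}.{minor}"
    return {
        "tag": tag,                    # value passed to: make set-version TAG=...
        "branch": f"release-{short}",  # release branch name
        "docs": f"v{short}",           # docs version referenced in publish.yml
    }


print(release_names("0.30.0"))
```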
@hj-johannes-lee
Contributor

hj-johannes-lee commented May 18, 2024

There is an error related to this commit.
https://github.com/k8s-operatorhub/community-operators/actions/runs/9137548241/job/25127696012?pr=4366#step:3:5083

Also, it seems that the CI/CD tests in operatorhub use a specific namespace, 'testeupgrade', which may mean that we cannot publish the bundle as it is now.

I also tested locally and got the same error messages.
In addition, when I test with the contents of the commit above removed, it runs successfully.

What do we need to do?

@mythi
Contributor

mythi commented May 20, 2024

What do we need to do?

Find out what the error is about and plan the fix accordingly. It's not clear to me why it fails. Did you check what the test case is about and what we are doing wrong?

@tkatila
Contributor Author

tkatila commented May 20, 2024

I wonder if it's some upgrade test where the changed labels cause confusion.

edit: nevermind, apparently I can't read.

The fix that is causing this was related to the operator bundle (or several of them), so reverting the fix would just re-introduce the issue. Kinda.

@hj-johannes-lee
Contributor

Thanks to @tkatila's help, I figured out that it is not possible to change the labels from the previous version.

We added one more label compared to the previous version, so it is not possible to upgrade from the previous version.
I found a similar case (https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/7331).

I found one source that talks about solving this problem.
https://olm.operatorframework.io/docs/troubleshooting/clusterserviceversion/

So, we may need to publish a 0.30.0 that does not cause a problem with the added label, and then a 0.30.1 that would be the 'real' version of the published operator.
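
The failure mode can be sketched in a few lines. This assumes the added label ended up in the Deployment's `spec.selector.matchLabels`, which Kubernetes treats as immutable for `apps/v1` Deployments. The label keys and values below are made up for illustration and are not taken from the actual bundle:

```python
def selector_change_blocks_upgrade(old_selector: dict, new_selector: dict) -> bool:
    """Return True if updating a Deployment in place would be rejected.

    apps/v1 Deployments do not allow spec.selector to change after creation,
    so any difference between the old and new matchLabels blocks the update.
    """
    return old_selector != new_selector


# Hypothetical label sets, for illustration only.
previous = {"control-plane": "controller-manager"}
candidate = {"control-plane": "controller-manager", "example/extra": "label"}

print(selector_change_blocks_upgrade(previous, previous))   # False
print(selector_change_blocks_upgrade(previous, candidate))  # True
```

Labels that only appear in metadata (and not in the selector) can change freely, so the check above applies to selector labels only.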

@hj-johannes-lee
Contributor

hj-johannes-lee commented May 20, 2024

k8s-operatorhub/community-operators#4375
I can see that all tests passed.
So I guess we have two options.

  1. Change the name of the deployment from inteldeviceplugins-controller-manager to something else permanently
  2. Change the name of the deployment from inteldeviceplugins-controller-manager to something else temporarily and change it back in version 0.30.1.

I am suggesting the second option because I do not know whether we need to 'keep' the current name inteldeviceplugins-controller-manager.

@hj-johannes-lee
Contributor

hj-johannes-lee commented May 21, 2024

@mythi @tkatila Let me know which way you think is better! :)

@tkatila
Contributor Author

tkatila commented May 21, 2024

I'm trying to think of a way that would not include bumping up the version number and creating a patch release.

If we update the name permanently, what are the downsides? Would some upgrade somewhere result in two copies of the operator?
Are we sure a 0.29.0->0.30.0->0.30.1 upgrade path would work (changing the name back and forth)? What if the user upgrades from 0.29.0 directly to 0.30.1, wouldn't they get the same error?
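
To reason about these paths, here is a rough model, not a description of how OLM actually behaves. It assumes the only rule that matters is that an existing Deployment cannot change its selector, and it treats "is the old Deployment removed when a release renames it?" as an open switch (`prune_old`). The temporary name and the label sets are hypothetical:

```python
class SelectorImmutable(Exception):
    """Raised when an in-place update would change a Deployment selector."""


def apply_release(cluster: dict, name: str, selector: dict) -> None:
    """Apply one release's Deployment to a simplified cluster model.

    cluster maps Deployment name -> selector labels. A new name creates a new
    Deployment; an existing name must keep its selector unchanged.
    """
    existing = cluster.get(name)
    if existing is not None and existing != selector:
        raise SelectorImmutable(f"selector of {name!r} cannot change")
    cluster[name] = dict(selector)


def try_path(releases: list, prune_old: bool) -> str:
    """Walk an upgrade path and report where (if anywhere) it fails."""
    cluster = {}
    for version, name, selector in releases:
        if prune_old:
            # Assumption: Deployments the new release no longer ships are deleted.
            cluster = {n: s for n, s in cluster.items() if n == name}
        try:
            apply_release(cluster, name, selector)
        except SelectorImmutable:
            return f"fails at {version}"
    return "ok"


OLD = {"control-plane": "controller-manager"}   # hypothetical labels
NEW = {**OLD, "example/extra": "label"}         # hypothetical added label
NAME = "inteldeviceplugins-controller-manager"
TEMP = "temporary-name"                         # hypothetical temporary name

stepwise = [("0.29.0", NAME, OLD), ("0.30.0", TEMP, NEW), ("0.30.1", NAME, NEW)]
direct = [("0.29.0", NAME, OLD), ("0.30.1", NAME, NEW)]

print(try_path(stepwise, prune_old=True))   # ok
print(try_path(stepwise, prune_old=False))  # fails at 0.30.1
print(try_path(direct, prune_old=True))     # fails at 0.30.1
```

In this model the back-and-forth rename only works if the old Deployment is removed along the way, and a direct 0.29.0 to 0.30.1 jump hits the same conflict. Whether such a direct jump is even possible would depend on how the bundle's upgrade graph is defined.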

@mythi
Contributor

mythi commented May 22, 2024

  1. Change the name of the deployment from inteldeviceplugins-controller-manager to something else permanently

What happens to the old deployment if you add a new (renamed) one as part of the OLM upgrade?
