
Create new labels that we can switch over to later #836

Open
consideRatio opened this issue Apr 28, 2024 · 7 comments

Comments

@consideRatio
Member

consideRatio commented Apr 28, 2024

Extracted from #835 (comment):

There's an ongoing discussion about how names (including pod labels) are normalised, and whether this can be changed in a backwards compatible manner:

If we want to do the above this might be a good opportunity to set the new labels to whatever the new normalisation scheme is, and leave the old labels unchanged, avoiding some backward compatibility complexity. It won't avoid all complexity since it's not just labels that are affected.

Extracted from #835 (comment)

I'm pro using this as an opportunity to fix our myriad escaping issues! Some variant of #744 would be pretty amazing.

I don't currently have capacity to push on that though :(


@manics I think I get what you mean now with "this might be a good opportunity": getting these new labels out at the same time as #835 makes us able to switch over with a single breaking change in the future, with the same pre-conditions (that users have run kubespawner for a while with duplicated sets of labels)?

@manics
Member

manics commented Apr 29, 2024

you mean that getting these new labels out at the same time as #835, makes us able to switchover with a single breaking change in the future with the same pre-conditions (that users have used kubespawner a while with duplicated sets of labels)?

Yes, subject to discussion about what we escape and how since it may affect more than just labels. Given this could be a breaking change that affects users I think we should discuss these together, and ensure there's a coherent plan that minimises pain for admins (how do they deal with the breaking changes) and for us as developers (will we maintain multiple code paths temporarily, or indefinitely)?

For many JupyterHub PRs we can merge without resolving all issues since the impact of the PR is limited and can be easily changed, but because the issues discussed here have a big impact on JupyterHub admins I think we should have a coherent plan. It might be that the consensus is that #835 is fine in isolation since the escaping issues will take too much time to fix! But I think it should be a deliberate decision.

@consideRatio
Member Author

Given this could be a breaking change that affects users I think we should discuss these together, and ensure there's a coherent plan that minimises pain for admins (how do they deal with the breaking changes) and for us as developers (will we maintain multiple code paths temporarily, or indefinitely)?

I agree on minimizing effort for users and maintainers, and if you think there is a coupling here I'll investigate further.

Tech summary

  1. k8s label values have constraints
    We have broken, and may still be breaking, these constraints in kubespawner for hub.jupyter.org/username and hub.jupyter.org/servername -- the k8s api-server refuses to create a resource if the constraints aren't met.
  2. the k8s label=value component=singleuser-server
    This label=value influences which pods KubeSpawner monitors via its k8s resource reflector - removing or changing it makes KubeSpawner lose track of user server pods, which is a breaking change forcing all user servers to be shut down before upgrading. This is why it's good to add labels alongside the old component label, as done in Add modern app.kubernetes.io labels alongside old #835.
  3. matchLabels for a k8s Deployment ...
    ... indicates what pods the Deployment monitors and controls. It is an immutable field, forcing re-creation of the Deployment resource if it is to be changed. In practice, to change matchLabels for a Deployment resource, we would first delete the Deployment without deleting the pods it controls via kubectl delete --cascade=orphan ..., and then re-create the Deployment resource (helm upgrade).
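For reference, the constraint in point 1 can be checked mechanically. A minimal sketch (the function name is mine, not kubespawner's) of the k8s label value rules - at most 63 characters and, if non-empty, starting and ending with an alphanumeric, with only alphanumerics, '-', '_', and '.' in between:

```python
import re

# Valid non-empty k8s label value: alphanumeric at both ends,
# with only alphanumerics, '-', '_', and '.' in the middle.
LABEL_VALUE_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?$")

def is_valid_label_value(value: str) -> bool:
    """Return True if `value` is acceptable to the k8s api-server as a label value."""
    if value == "":
        return True  # empty label values are allowed
    return len(value) <= 63 and bool(LABEL_VALUE_RE.match(value))

# Usernames that are fine as-is vs. ones that break the constraint:
print(is_valid_label_value("jovyan"))            # True
print(is_valid_label_value("user@example.com"))  # False ('@' is not allowed)
print(is_valid_label_value("a" * 64))            # False (too long)
```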

KubeSpawner labels to consider

  • hub.jupyter.org/username and hub.jupyter.org/servername aren't used by KubeSpawner itself. They are instead only relevant to external consumers, for example jupyterhub/grafana-dashboards or similar.
  • component is used to decide what k8s pods to monitor via the PodReflector

KubeSpawner functions to consider

  • The functions to create/delete k8s pods/pvcs rely on knowledge of a pod/pvc name - not on labels etc.
  • The functions to create/(delete) k8s secrets (for internal ssl) also don't rely on label info; the secrets are deleted automatically when their ownerReference (the user server pod) is deleted.

My conclusions

  • Third-party things may break if hub.jupyter.org/username or hub.jupyter.org/servername change
    Changing the values of hub.jupyter.org/username and hub.jupyter.org/servername doesn't break kubespawner, z2jh, binderhub, or jupyterhub/grafana-dashboards (https://github.com/jupyterhub/grafana-dashboards/blob/main/docs/tutorials/deploy.md#pre-requisites). Renaming them would be breaking for users of grafana-dashboards, who would then benefit from a period of time with both old and new names available.
  • Additions of labels are not a problem
    I think adding labels alongside the old ones should be non-breaking for anything, really.
  • Changing the value of hub.jupyter.org/username or hub.jupyter.org/servername can be done now or later
    If we want to bundle this change with the label changes, we need it bundled with the next major release of KubeSpawner/Z2JH rather than with z2jh 4's addition of new labels - because adding labels isn't breaking, only removing them is.
  • It's nice to do any breaking change in KubeSpawner now, so it gets flushed out with z2jh 4

My takeaway

I think it's good to address #737, #744, #791 now, as doing so is breaking for software outside JupyterHub's official projects that references those label values, and getting a breaking change to land in kubespawner ~now is great as it allows us to flush it into z2jh 4 together with other breaking changes demanding users' attention.

I see no reason to wait with #737, #744, #791 until we maybe, in the future, remove the component label etc. in favor of app.kubernetes.io/component in z2jh 5 or later.

@consideRatio
Member Author

consideRatio commented May 4, 2024

It seems #744 is making a change that would influence self.pod_name, self.dns_name, self.secret_name, self.pvc_name, self.namespace, self.working_dir. A change to pod_name requires user servers to be restarted I think, and a change to pvc_name and namespace are probably very breaking changes as they are persistent.

        if self.enable_user_namespaces:
            self.namespace = self._expand_user_properties(self.user_namespace_template)
        self.pod_name = self._expand_user_properties(self.pod_name_template)
        self.dns_name = self.dns_name_template.format(namespace=self.namespace, name=self.pod_name)
        self.secret_name = self._expand_user_properties(self.secret_name_template)
        self.pvc_name = self._expand_user_properties(self.pvc_name_template)
        if self.working_dir:
            self.working_dir = self._expand_user_properties(self.working_dir)

Overall, this makes #744 far more complicated to land than if it just touched the labels =/
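To make the coupling concrete, here is a hypothetical sketch (both functions and their exact behavior are illustrative, not kubespawner's actual code) of why changing the expansion scheme is breaking for persistent resources like PVCs: the same user gets a different pvc_name under each scheme.

```python
import hashlib
import string

SAFE_CHARS = set(string.ascii_lowercase + string.digits)

def legacy_escape(name: str) -> str:
    # Rough sketch of an old-style scheme: unsafe characters are replaced,
    # so distinct usernames can collide (e.g. "user@a" and "user-a").
    return "".join(c if c in SAFE_CHARS else "-" for c in name.lower())

def safe_escape(name: str) -> str:
    # Sketch of a collision-free scheme in the spirit of #744: keep a
    # readable prefix and append a short hash of the exact original name.
    prefix = legacy_escape(name)[:32].strip("-")
    digest = hashlib.sha256(name.encode()).hexdigest()[:8]
    return f"{prefix}---{digest}"

# Switching schemes renames the PVC, orphaning the user's existing data:
pvc_template = "claim-{username}"
print(pvc_template.format(username=legacy_escape("User@Example.com")))
print(pvc_template.format(username=safe_escape("User@Example.com")))
```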

@manics
Member

manics commented May 7, 2024

My feeling is that breaking changes are fine as long as there's no risk of data loss. Restarting users' pods is disruptive, but it's very rare that a Z2JH release requires it. Kubernetes upgrades require all pods to restart, and Kubernetes upgrades are much more frequent!

Changes to pvc_name and namespace are obviously more problematic, perhaps we should store more information in the state similar to jupyterhub/dockerspawner@05f5433 from jupyterhub/dockerspawner#386 ?

@consideRatio
Member Author

Kubernetes upgrades require all pods to restart, and Kubernetes upgrades are much more frequent!

Not at a single point in time though! When upgrading in 2i2c clusters, I rotate out old node pools and rotate in new ones. K8s upgrades are almost entirely non-disruptive this way.

Changes to pvc_name and namespace are obviously more problematic, perhaps we should store more information in the state similar to jupyterhub/dockerspawner@05f5433 from jupyterhub/dockerspawner#386 ?

Hmmm, a common situation is that the username expansion is also used to declare a subPath on an NFS mount, so even changing how the username expands can be breaking. I've not looked at the linked PR yet (I'm on mobile atm), but I'm hesitant to try to handle the complexity of a transition, and more optimistic about doing a switch where new z2jh deploys get new defaults and old deploys retain the previous behavior unless they opt in - including, then, to the breaking changes.

@consideRatio
Member Author

Ah hmm, okay, so we have built-in functionality in the Spawner base class to retain state, as documented in https://jupyterhub.readthedocs.io/en/stable/reference/spawners.html#spawner-state - nice, but is it reasonable?

So we would then ask "is this a new user-server that hasn't been spawned before - then let's do X instead of Y" for each user server? Hmmm...

I think overall, this is also too complicated and fragile:

  • we would have some users conditionally expanding their username in one way and others in another way, and anything touching this that we haven't developed ourselves would have to handle both at the same time
  • we would have to persist data in the jupyterhub database for each user/server combination, and I think it could be breaking if we, for example, expand the username old-school style for the user's default server but not for a new named server - and we couldn't know
  • persisting data in jupyterhub's database means additional state that we need to keep track of indefinitely, which could cause issues
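For reference, the per-server state approach being weighed here would look roughly like this - a hypothetical sketch using the load_state/get_state hooks JupyterHub's Spawner documentation describes; the class, attribute, and scheme names are all illustrative:

```python
class SchemeAwareSpawner:  # stand-in for a KubeSpawner subclass
    """Sketch: remember which expansion scheme a server was first spawned with."""

    def __init__(self):
        self.expansion_scheme = None  # unknown until load_state/start

    def load_state(self, state):
        # JupyterHub calls this with state persisted in its database.
        self.expansion_scheme = state.get("expansion_scheme")

    def get_state(self):
        # Persist the scheme so an existing server keeps its old names forever.
        state = {}
        if self.expansion_scheme:
            state["expansion_scheme"] = self.expansion_scheme
        return state

    def start(self):
        # First spawn of this server: adopt the new scheme.
        # Previously-spawned servers keep whatever scheme they were born with.
        if self.expansion_scheme is None:
            self.expansion_scheme = "safe"
        return self.expansion_scheme
```

The fragility called out above shows up directly: every code path that builds a name now has to branch on `expansion_scheme`, and the extra per-user/server state lives in the database indefinitely.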

@consideRatio
Member Author

consideRatio commented May 8, 2024

Current take

KubeSpawner

  1. I think the label values should match how we do the username/servername expansion
  2. I think changing the username/servername expansion breaks things in ways that are hard to determine and test for, as it can tie into users' setups, and due to this:
  • a) KubeSpawner shouldn't attempt to operate in a way that does different username/servername expansion on a per-server or per-user basis, because it would become too complicated for project maintainers and users to handle well enough
  • b) KubeSpawner should provide a new, more robust username/servername expansion that can be opted in to

Z2JH

  1. With a KubeSpawner opt-in available and tested a bit, z2jh should consider trying to make the behavior default for new installations - something that could probably be done somewhat robustly
