When already-expired node is set to "Never Expire" (expiry is NULL), it does not go back to logged-in status. #1851

Open
1 of 2 tasks
benmehlman opened this issue Mar 28, 2024 · 2 comments
Labels
bug Something isn't working

Comments


benmehlman commented Mar 28, 2024

Bug description

Using Tailscale's control server: when a node expires, it remains connected to the control server, although it no longer passes tailnet traffic. The node can be restored to operation by selecting "Disable key expiry" in the Tailscale admin UI.
It will start passing traffic again, without having to re-authenticate or take any other action on the node machine itself.

This does not work on headscale.

Environment

  • OS: Debian 12.4
  • Headscale version: v0.23.0-alpha5
  • Tailscale version: 1.62.0
  • Headscale is behind a (reverse) proxy
    Yes, nginx. I need it in order to host the UI. I've asked in the Discord, but nobody there knows much about expiration behavior; I seem to be the only active person who is interested in it right now.

The reverse proxy seems to be working fine as everything else related to node-server communication is working perfectly and it's been very stable.

  • Headscale runs in a container

To Reproduce

These steps assume OIDC is in use...

  1. In config.yaml, set the OIDC expiry to a short time so that expiration can be easily observed (e.g. "5m"), and restart the service.
  2. Run "tailscale up" on the node with the appropriate parameters to connect to the headscale instance.
  3. Complete OIDC login.
  4. Observe that the node is connected to the tailnet as normal.

On the headscale server:

  1. Wait for the node to expire.
  2. Observe that headscale nodes list indicates that the node is connected but expired, as expected.
  3. Test the node connectivity to confirm that it has stopped passing traffic as expected.
  4. Set the node to "Disable key expiry" by using sqlite3 to execute: UPDATE node SET expiry = NULL WHERE id = the_node_id;
  5. Observe that headscale nodes list indicates that the node is "online" and expired is "no".
  6. Observe that, even after some time is allowed for polling (if necessary), the node does not resume passing traffic, and tailscale status on the node remains "Logged out".
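For reference, the database change in step 4 can be sketched in isolation. This is a minimal sketch against an in-memory SQLite database, assuming a simplified `node` table with only `id`, `hostname` and `expiry` columns (the real headscale schema has more columns) and a hypothetical `is_expired` helper that treats a NULL expiry as "never expires". It only shows that the stored state flips as described in step 5; it says nothing about what the client is told.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical, simplified stand-in for the node table; the real headscale
# schema has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, hostname TEXT, expiry TEXT)")

# Insert a node whose expiry is already five minutes in the past.
past = (datetime.now(timezone.utc) - timedelta(minutes=5)).isoformat()
conn.execute("INSERT INTO node (id, hostname, expiry) VALUES (1, 'demo-node', ?)", (past,))


def is_expired(conn, node_id):
    """Return True if the stored expiry is set and lies in the past."""
    (expiry,) = conn.execute(
        "SELECT expiry FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    if expiry is None:
        return False  # assumption: NULL expiry means "never expires"
    return datetime.fromisoformat(expiry) < datetime.now(timezone.utc)


expired_before = is_expired(conn, 1)

# Step 4 from the list above, using a bound parameter instead of a literal id.
conn.execute("UPDATE node SET expiry = NULL WHERE id = ?", (1,))
conn.commit()

expired_after = is_expired(conn, 1)
print(expired_before, expired_after)
```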

Logs and attachments

netmap_recover_after_expiry.json

@benmehlman benmehlman added the bug Something isn't working label Mar 28, 2024
@kradalby kradalby added this to the v0.23.0 milestone Apr 2, 2024
@kradalby
Collaborator

kradalby commented May 1, 2024

So this will not really work, since changing the database directly will not trigger any of the mechanisms that update the clients. I would think that if you change the database and restart headscale, it might work.

I think essentially what we need is a new command, set-expiry or something similar, which sets a new expiry and makes sure the nodes are updated appropriately.
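To illustrate the difference between the two paths, here is a minimal sketch. It is not headscale's actual code: `NodeStore`, `raw_write`, `set_expiry` and `subscribe` are hypothetical names chosen for the example. A direct write changes the stored value silently, while a command-style update persists the value and then notifies whoever is listening.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List, Optional

Listener = Callable[[int, Optional[datetime]], None]


@dataclass
class NodeStore:
    """Hypothetical in-memory store; not headscale's actual implementation."""

    expiries: Dict[int, Optional[datetime]] = field(default_factory=dict)
    listeners: List[Listener] = field(default_factory=list)

    def subscribe(self, callback: Listener) -> None:
        self.listeners.append(callback)

    def raw_write(self, node_id: int, expiry: Optional[datetime]) -> None:
        # Comparable to editing the database by hand: the state changes,
        # but no listener is told about it.
        self.expiries[node_id] = expiry

    def set_expiry(self, node_id: int, expiry: Optional[datetime]) -> None:
        # What a set-expiry style command could do: persist, then notify.
        self.expiries[node_id] = expiry
        for callback in self.listeners:
            callback(node_id, expiry)


notified = []
store = NodeStore()
store.subscribe(lambda node_id, expiry: notified.append((node_id, expiry)))

store.raw_write(1, None)
after_raw = list(notified)      # nothing was sent

store.set_expiry(1, None)
after_command = list(notified)  # one update was sent
```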

I'm going to remove this from 0.23.0. It is important, but it is not a regression and should be tackled afterwards.

@kradalby kradalby removed this from the v0.23.0 milestone May 1, 2024
@benmehlman
Author

I did try restarting headscale, and it didn't cause the node to come back online, so there is some other detail that is not quite right.

May I suggest that, rather than adding a separate API for set-expiry, you implement PATCH, so that as new columns are added in the future it would be easy to expose them through the API without adding more endpoints?
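A minimal sketch of the PATCH idea, independent of any web framework. The names `PATCHABLE_FIELDS` and `apply_patch` and the field names in the example node are hypothetical, not part of headscale's API. Only fields present in the request body are changed, and supporting a new column later means adding one entry to the allowed set.

```python
# Hypothetical set of fields a PATCH endpoint would accept.
PATCHABLE_FIELDS = {"expiry", "given_name"}


def apply_patch(node: dict, patch: dict) -> dict:
    """Return a copy of node with only the fields present in patch changed."""
    unknown = set(patch) - PATCHABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    updated = dict(node)
    updated.update(patch)
    return updated


node = {"id": 1, "given_name": "demo-node", "expiry": "2024-03-28T12:00:00+00:00"}

# Clearing the expiry touches nothing else on the node.
patched = apply_patch(node, {"expiry": None})
```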

I also suggest adding a separate boolean column for "never_expire". This removes the ambiguity with a node that has never authenticated and therefore also has expiry = NULL.
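A minimal sketch of how the extra column would separate the cases. The `Node` class and `status` function are hypothetical, and the status strings are only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Node:
    expiry: Optional[datetime]
    never_expire: bool = False  # the proposed (hypothetical) column


def status(node: Node, now: datetime) -> str:
    """Classify a node without overloading the meaning of expiry = NULL."""
    if node.never_expire:
        return "never expires"
    if node.expiry is None:
        return "not yet authenticated"
    return "expired" if node.expiry <= now else "valid"


now = datetime(2024, 3, 28, 12, 0, tzinfo=timezone.utc)

print(status(Node(expiry=None, never_expire=True), now))
print(status(Node(expiry=None), now))
print(status(Node(expiry=now - timedelta(minutes=5)), now))
print(status(Node(expiry=now + timedelta(minutes=5)), now))
```

With a NULL expiry alone, the first two nodes would be indistinguishable.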
