Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify map session management #1931

Merged
merged 25 commits into from May 24, 2024

Conversation

kradalby
Copy link
Collaborator

@kradalby kradalby commented May 8, 2024

This PR removes the complicated session management introduced in #1791 which kept track of the sessions in a map, in addition to the channel already kept track of in the notifier.

Instead of trying to close the mapsession, it will now be replaced by the new one and closed after so all new updates goes to the right place.

The map session serve function is also split into a streaming and a non-streaming version for better readability.

RemoveNode in the notifier will not remove a node if the channel is not matching the one that has been passed (e.g. it has been replaced with a new one).

A new tuning parameter has been added to added to set timeout before the notifier gives up to send an update to a node.

Add a keep alive resetter so we wait with sending keep alives if a node has just received an update.

In addition it adds a bunch of env debug flags that can be set:

  • HEADSCALE_DEBUG_HIGH_CARDINALITY_METRICS: make certain metrics include per node.id, not recommended to use in prod.
  • HEADSCALE_DEBUG_PROFILING_ENABLED: activate tracing
  • HEADSCALE_DEBUG_PROFILING_PATH: where to store traces
  • HEADSCALE_DEBUG_DUMP_CONFIG: calls spew.Dump on the config object startup
  • HEADSCALE_DEBUG_DEADLOCK: enable go-deadlock to dump goroutines if it looks like a deadlock has occured, enabled in integration tests.

Signed-off-by: Kristoffer Dalby kristoffer@tailscale.com

@kradalby kradalby force-pushed the kradalby/close-via-notifier branch from 1979589 to 567b34b Compare May 17, 2024 13:08
@kradalby kradalby changed the title test Simpify map session management May 17, 2024
@kradalby kradalby changed the title Simpify map session management Simplify map session management May 17, 2024
@kradalby kradalby marked this pull request as ready for review May 17, 2024 20:03
@kradalby kradalby requested a review from juanfont as a code owner May 17, 2024 20:03
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
add gauge metrics for waiting for notifier lock
add gauge metrics for waiting for notifier batcher lock
add gauge metrics for current pending updates
add metric and count to String method
count mapresps ended
add metric for tracking mapresp.close returns

add timeout to sendall

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
this commit removes the mapresponse map as it complicated
and caused more problems than it solved.

instead, notifier is now changed to always use the last
connected nodes channel and to ensure it is not remove until
that client is disconnected.

as part of this, offline/route change is a bit more concervative
and only get sent if the channel was removed from the notifier
and not if the clients longpolling session ended (there might be
more than one at the time, particularly if the client is connecting.)

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
@kradalby kradalby force-pushed the kradalby/close-via-notifier branch from bd7d904 to 00aa63c Compare May 20, 2024 15:39
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
@kradalby
Copy link
Collaborator Author

This not also should fix the two test failures we saw on most PRs

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
@kradalby kradalby force-pushed the kradalby/close-via-notifier branch from 3c1b83f to a49755e Compare May 21, 2024 08:48
@kradalby kradalby enabled auto-merge (squash) May 21, 2024 09:40
@casdr
Copy link
Contributor

casdr commented May 21, 2024

This also seems to fix #1942 and #1930. Thank you @kradalby!

@kradalby
Copy link
Collaborator Author

Great, will release a new alpha ones its reviewed and merged.

RUN apt-get update \
&& apt-get install -y dnsutils git iptables ssh ca-certificates \
&& rm -rf /var/lib/apt/lists/*
FROM golang:1.22-alpine AS build-env
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please never use alpine images. Musl C is so broken.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please never use alpine images. Musl C is so broken.

Why in general agree with you, this particular case will reproduce the head image so it's more or less the same as all the other ones which we pull from dockerhub and is built by that file.

So in this case this is fixing the odd one out.

@kradalby kradalby merged commit c8ebbed into juanfont:main May 24, 2024
103 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants