
Redeploys or restarts never complete if ngrok sessions are limited to 1 #266

Open
jrobsonchase opened this issue Jul 3, 2023 · 3 comments
Labels: area/controller (Issues dealing with the controller), bug (Something isn't working), priority/medium

@jrobsonchase (Collaborator)

What happened

A deploy or rollout restart causes a new ReplicaSet to be created with the new pod spec in parallel with the old one. Once the new pod is marked "Ready", the old one is terminated.

If agent sessions are limited to 1, as they are for free ngrok accounts today, this never actually happens, since the ingress controller pod's readiness is contingent on its agent session starting successfully, and the old pod is still holding the only allowed session.

There are also no logs to indicate why the new pods get stuck.

What you think should happen instead

Hard to say. Maybe we should detect the "too many sessions" error and allow the pod to ready up if it persists for a certain amount of time? Forcing the old one to close before the new one can start will result in a blip in uptime, so we still want to wait for the session to fully start if we can, but we shouldn't block forever in the event that it never will.
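
A minimal sketch of that idea, assuming the limit error can be recognized by ngrok's ERR_NGROK_108 code appearing in the error text; the names here (isSessionLimitErr, shouldReportReady) are hypothetical and not existing controller code:

```go
package readiness

import (
	"strings"
	"time"
)

// isSessionLimitErr naively assumes the "too many sessions" failure surfaces
// with ngrok's ERR_NGROK_108 code somewhere in the error text.
func isSessionLimitErr(err error) bool {
	return err != nil && strings.Contains(err.Error(), "ERR_NGROK_108")
}

// shouldReportReady is a hypothetical readiness gate: normally the pod is
// Ready only once the agent session is up, but if the only thing blocking it
// is a session-limit error that has persisted past the grace period, report
// Ready anyway so the old pod can be torn down and release its session.
func shouldReportReady(sessionUp bool, lastErr error, failingSince time.Time, grace time.Duration) bool {
	if sessionUp {
		return true
	}
	return isSessionLimitErr(lastErr) && time.Since(failingSince) > grace
}
```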

How to reproduce

make deploy with a free account is sufficient, since it does a rollout restart to force a re-pull. You'll see two ingress controller pods and two ReplicaSets; the deploy will list one as "old" and the other as "new", and the "new" one's pod will be stuck unready.

jrobsonchase added the bug, area/controller, and needs-triage labels on Jul 3, 2023
@alex-bezek (Collaborator)

> Forcing the old one to close before the new one can start will result in a blip in uptime, so we still want to wait for the session to fully start if we can, but we shouldn't block forever in the event that it never will.

Is it possible to wait for the new session to start before closing the old one if there is a limit of 1 agent session for free accounts? If you are limited to a single active session, it seems we'll always have to tear one down before creating the new one, which means a small blip of downtime for free accounts.

We could technically add a Helm configuration that lets free accounts specify a Recreate rollout strategy instead of a RollingUpdate, but the DevX of free accounts having to remember to set a specific Helm override isn't the greatest.

@jrobsonchase (Collaborator, Author)

> Is it possible to wait for the new session to start before closing the old one if there is a limit of 1 agent session for free accounts? If you are limited to a single active session, it seems we'll always have to tear one down before creating the new one, which means a small blip of downtime for free accounts.

That's basically what we're doing already, but there's no way to tell the first session that it's time to shut down while the new one is coming up. The new instance can start, but it'll never be marked Ready until its tunnel session starts, which means the original will never be told to shut down.

We could go ahead and mark it "Ready" without waiting for tunnel session readiness, which would shut down the original and cause the unavoidable blip for free accounts, plus a racy blip for accounts with higher limits but only a single controller instance. If there are multiple controller pods and they don't all go out simultaneously, it should be fine. That's probably the simplest fix.

The other is to check the error that comes back when the tunnel session fails to start, and go ahead and mark the pod as Ready if it's a limit error.

@devpro commented Jan 9, 2024

Is it possible to update the Helm chart so the default spec.strategy.type can be overridden (RollingUpdate -> Recreate)?
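
For reference, this is roughly what such an override would amount to at the Deployment API level, sketched with the Go API types for illustration (the chart itself would template the equivalent YAML, and per the earlier comment no such chart value exists today):

```go
package strategy

import appsv1 "k8s.io/api/apps/v1"

// recreateStrategy returns the spec.strategy a user would opt into instead of
// the default RollingUpdate: the old pod is deleted before the new one is
// created, so only one agent session ever exists at a time, at the cost of a
// brief gap in uptime.
func recreateStrategy() appsv1.DeploymentStrategy {
	return appsv1.DeploymentStrategy{
		Type: appsv1.RecreateDeploymentStrategyType,
	}
}
```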

alex-bezek added the priority/medium label and removed the needs-triage label on Mar 4, 2024