Unable to upgrade an API-Platform/FrankenPHP/Mercure Docker Swarm service without downtime #898

toby-griffiths opened this issue Apr 16, 2024 · 3 comments


I think that this is a Mercure issue, but please correct me if I'm wrong…

We have just deployed an API-Platform based project to a Docker Swarm and it's working nicely. However, when we attempt to update the services, the first update attempt always seems to fail, with the following error appearing in the logs…

```
Error: loading initial config: loading new config: loading http app module: provision http: server srv0: setting up route handlers: route 0: loading handler modules: position 0: loading module 'subroute': provision http.handlers.subroute: setting up subroutes: route 0: loading handler modules: position 4: loading module 'mercure': provision http.handlers.mercure: "bolt:///data/mercure.db?subscriptions=1": invalid transport: timeout
```

If we re-run the same `docker stack update` command, the existing service appears to stop, the API goes offline briefly while the new service starts up, and then everything works again.

Is this caused by some form of locking on the Mercure data store? Is there a way around this?

I've briefly looked at the High Availability docs today, including how you can build a custom transport, but I'm not very familiar with Go, so I wouldn't know where to start with this. Any pointers, if that would help resolve this issue, would be very much appreciated.

Thanks for all your great work on this project.

@toby-griffiths (Author)

Is anyone able to give me any pointers on this one? We're now approaching a production launch, and I'd prefer not to have to do all our deploys out of hours, when a brief outage for the update is acceptable.

Any pointers/thoughts/ideas are very welcome. Thank you.

@dunglas (Owner) commented May 16, 2024

I guess that Docker starts a new container before stopping the existing one. This is an issue when using the Bolt transport because BoltDB relies on a lock. The first container must release the lock for the second one to take it.
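If so, a possible workaround (a sketch, not something verified on your stack; assuming a standard Compose v3 stack file, and the service and image names below are placeholders) is to tell Swarm to stop the old task before starting the replacement, and to give it time to shut down cleanly:

```yaml
services:
  php:                       # placeholder service name
    image: my-app:latest     # placeholder image
    stop_grace_period: 30s   # give FrankenPHP/Mercure time to close the Bolt DB
    deploy:
      update_config:
        order: stop-first    # don't start the replacement until the old task stops
```

The trade-off is a short gap during the update instead of a failed first attempt.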

An option is to upgrade to the (paid) on-premise version, whose Redis transport doesn't have this issue because, unlike Bolt, Redis supports concurrent connections.

Another option would be to check whether Docker sends a signal to the existing container before starting the new one, catch this signal in the Bolt transport, and close the connection to the Bolt DB immediately (that would release the lock).
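Here is a minimal sketch of that idea, assuming the process holds a `*bbolt.DB` handle; the wiring is illustrative only, not Mercure's actual code:

```go
// Sketch: release the BoltDB file lock as soon as SIGTERM arrives,
// so a replacement container can open the database immediately.
package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"

	bolt "go.etcd.io/bbolt"
)

func main() {
	db, err := bolt.Open("/data/mercure.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}

	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)

	go func() {
		<-sigs
		// Closing the DB releases the exclusive file lock held by this process.
		if err := db.Close(); err != nil {
			log.Printf("closing bolt db: %v", err)
		}
		os.Exit(0)
	}()

	select {} // stand-in for the server's normal work
}
```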

@dunglas (Owner) commented May 16, 2024

This issue seems to confirm this theory: influxdata/influxdb#24320
