Description
There have been issues where a service's channel to RabbitMQ has been closed while the queue continues to grow. When the container is restarted, it tries to consume the backlog before reporting ready status in Kubernetes. This causes Kubernetes to restart the container while it is still draining the queue. Because the service must also acknowledge each message, the restarts prolong the drain. Currently I drain the queue faster by scaling up the number of replicas processing the queue.
The desired functionality is for the service to connect to RabbitMQ and report ready status once connected, before draining the queue.
Solution Proposal
Start with the audit-log and snapshot-service, as these are where I've seen the issue. I suspect other services are affected as well.
```js
mainServer.startListening() // sets up queues and starts draining them
// before
mainServer.setupRoutes()    // sets up the "/healthcheck" endpoint
```
I think a large enough number of messages will clog the event loop and prevent the web server from sending responses even if the RabbitMQ connection is set up after the web server starts.
I wonder if we could prevent this by setting a prefetch value for the channel. If only 10 messages can be awaiting acknowledgement at a time, then any incoming GET request to "/healthcheck" will only have to wait for those 10 messages to be processed, not 10,000 or whatever the count in the queue is.
I have not tried this out yet. It's also worth considering what effect this would have on performance.
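A runnable sketch of the idea, with no real broker involved (`drainQueue` and its arguments are illustrative names, not amqplib API): a prefetch limit caps how many unacknowledged messages are in flight at once, and the consumer yields to the event loop between batches.

```javascript
// Drain `queue`, never letting more than `prefetch` messages be
// unacknowledged at the same time.
async function drainQueue(queue, prefetch, handle) {
  let inFlight = 0;
  let maxInFlight = 0;
  while (queue.length > 0 || inFlight > 0) {
    while (inFlight < prefetch && queue.length > 0) {
      const msg = queue.shift();
      inFlight++;
      maxInFlight = Math.max(maxInFlight, inFlight);
      handle(msg).then(() => { inFlight--; }); // "ack" on completion
    }
    // Yield to the event loop so other work (e.g. a "/healthcheck"
    // response) can run between batches.
    await new Promise((resolve) => setImmediate(resolve));
  }
  return maxInFlight;
}

// 10,000 queued messages, but never more than 10 awaiting ack.
const backlog = Array.from({ length: 10000 }, (_, i) => i);
drainQueue(backlog, 10, async () => {}).then((maxInFlight) => {
  console.log(maxInFlight); // stays at the prefetch limit of 10
});
```

This is the behavior a `channel.prefetch(10)` call would give against a real broker: a healthcheck request only has to wait behind the handful of in-flight messages, not the whole backlog.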
This issue is caused by the RabbitMQ prefetch count setting. It looks like it was added for the components in #1479, but maybe not for all of the services.