
Services should be ready before consuming messages from the queue #1332

Open
winklerj opened this issue Oct 12, 2021 · 2 comments

Comments

@winklerj
Contributor

Description
There have been issues where a service's channel to RabbitMQ was closed while the queue continued to grow. When the container restarts, it tries to consume the messages before it reaches ready status in Kubernetes. This causes Kubernetes to restart the container while it is still draining the queue. The service also needs to acknowledge the messages, so the restarts prolong draining the queue. Currently I drain the queue faster by scaling up the number of replicas processing it.

The desired behavior is to connect to RabbitMQ and go to ready status as soon as the connection is established, before draining the queue.
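
A minimal sketch of that startup order, using only Node's built-in `http` module. The names `createReadiness`, `healthHandler`, and `start`, and the `broker` object with `connect()` and `startConsuming()` methods, are hypothetical stand-ins for illustration and are not taken from the actual services. Returning 503 before the broker connection exists is also an assumption about how the readiness probe should behave.

```javascript
const http = require("http");

// Hypothetical readiness flag shared by the HTTP handler and the startup code.
function createReadiness() {
  let ready = false;
  return {
    markReady() { ready = true; },
    isReady() { return ready; },
  };
}

// Handler for "/healthcheck": 503 until the service is marked ready, 200 after.
function healthHandler(readiness) {
  return (req, res) => {
    if (req.url !== "/healthcheck") {
      res.statusCode = 404;
      return res.end();
    }
    const ready = readiness.isReady();
    res.statusCode = ready ? 200 : 503;
    res.end(ready ? "ok" : "starting");
  };
}

// Startup order: health endpoint first, then the broker connection,
// then ready status, and only then start draining the queue.
// `broker` is a hypothetical wrapper around the RabbitMQ client.
async function start(broker, readiness, port) {
  const server = http.createServer(healthHandler(readiness));
  await new Promise((resolve) => server.listen(port, resolve));
  await broker.connect();
  readiness.markReady();
  await broker.startConsuming();
  return server;
}
```

With this ordering the readiness probe can succeed as soon as the connection is up, regardless of how many messages are waiting in the queue.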

Solution Proposal

  • Start with the audit-log and snapshot-service, as this is where I've seen the issue. I suspect it affects other services as well.
@BirdHighway
Contributor

I think the problem might not just be the order of operations within /services/audit-log/app/index.js, where we have:

mainServer.startListening() // sets up queues and starts draining them
// before
mainServer.setupRoutes() // sets up "/healthcheck" endpoint

I think a large enough number of messages will clog the event loop and prevent the web server from sending responses even if the RabbitMQ connection is set up after the web server starts.

I wonder if we could prevent this by setting a prefetch value for the channel. If only 10 messages can be awaiting acknowledgement at a time, then any incoming GET request to "/healthcheck" will only have to wait for those 10 messages to be processed, not 10,000 or whatever the count in the queue is.

I have not tried this out yet. It's also worth considering what effect this would have on performance.
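
As a rough sketch of the prefetch idea, assuming the services use an amqplib-style channel (with `prefetch`, `consume`, `ack`, and `nack`): the helper name `consumeWithPrefetch`, the `handleMessage` callback, and the default of 10 are illustrative choices, not code from the repository. Requeueing on failure is also an assumption; a real service may prefer a dead-letter queue.

```javascript
// Assumes `channel` behaves like an amqplib channel. `consumeWithPrefetch`
// and `handleMessage` are hypothetical names used only for this sketch.
async function consumeWithPrefetch(channel, queueName, handleMessage, prefetchCount = 10) {
  // Cap the number of unacknowledged messages the broker will deliver
  // to this channel at once. Must be set before consuming starts.
  await channel.prefetch(prefetchCount);

  // Manual acknowledgement (noAck: false) is what makes the prefetch limit apply.
  return channel.consume(
    queueName,
    async (msg) => {
      if (msg === null) return; // consumer was cancelled by the broker
      try {
        await handleMessage(msg);
        channel.ack(msg);
      } catch (err) {
        // Assumption: put the message back on the queue when handling fails.
        channel.nack(msg, false, true);
      }
    },
    { noAck: false }
  );
}
```

With a limit in place, at most `prefetchCount` messages are in flight on the channel at a time, so a "/healthcheck" request queues behind a small batch rather than the whole backlog. A lower value trades some throughput for responsiveness.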

@winklerj
Contributor Author

This issue is caused by the RabbitMQ prefetch count setting. It looks like a prefetch count has been added for the components (#1479), but maybe not for all of the services.
