High availability (HA) #57

Open
micdenny opened this issue Sep 20, 2021 · 2 comments
micdenny commented Sep 20, 2021

First of all, thank you for the amazing effort on this great tool for AMQP.

I was wondering what you do at Bloomberg to protect against a possible failure of the system hosting amqpprox. Do you have more than one instance running on different servers? If one fails, how do you direct the traffic to another instance? Do you have some sort of clustering over amqpprox? Any plans for that?

I am thinking of adopting amqpprox in an enterprise scenario, and I try to make every single point of failure highly available.

@alaric alaric self-assigned this Sep 20, 2021
Contributor

alaric commented Sep 21, 2021

Yes, internally we do run multiple instances to provide HA on the proxy instances (and indeed also the brokers as clusters). Items such as the ports and control sockets can be parameterised for when you do need to run multiple independent versions of amqpprox on a single host, but we would also recommend having multiple hosts running amqpprox with the same configuration. We may add SO_REUSEPORT support to allow multiple instances listening on the same port on a single host, but that's likely more useful for scaling rather than HA.
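For illustration only, here is a minimal Python sketch of what `SO_REUSEPORT` permits at the socket level on Linux and the BSDs: two independent listeners bound to the same port. It is not amqpprox code, since per the comment above that support is only a possibility, and the helper name `make_reuseport_listener` is hypothetical.

```python
import socket


def make_reuseport_listener(host: str, port: int) -> socket.socket:
    """Create a TCP listener that can share its port with other listeners.

    Hypothetical helper for illustration. Every socket that shares the
    port must set SO_REUSEPORT before calling bind().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    # Port 0 asks the kernel for a free port; the second listener reuses it.
    first = make_reuseport_listener("127.0.0.1", 0)
    shared_port = first.getsockname()[1]
    second = make_reuseport_listener("127.0.0.1", shared_port)
    print("two listeners share port", shared_port)
    first.close()
    second.close()
```

The kernel distributes incoming connections across the listeners, which is why this helps with scaling on one host but does nothing if the host itself fails.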

Unfortunately, our layer above the amqpprox binary that coordinates its configuration and orchestrates vhost migrations across multiple amqpprox hosts is quite tied to Bloomberg-specific infrastructure, so it's unlikely to be open sourced.

As you mention, directing traffic to a set of machines running the amqpprox is an issue, but not one that for us needs to be vhost-specific, so we can use generic service discovery infrastructure via DNS in any client library to find all the amqpprox instances, then any can service the clients. You can think of this as similar to how Consul works for service discovery with health checking. I'd imagine using keepalived or BGP/ECMP would also work just fine to do this at the network layer, similar to how something like HAProxy could be set up. Which approach would work best would depend more on how you do the rest of your infrastructure.
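A minimal sketch of the client-side idea, assuming a single DNS name that resolves to every amqpprox host. The function names and the idea of a plain TCP connect as the health check are assumptions made for illustration; this is not Bloomberg's client library, and real service discovery would typically do richer health checking.

```python
import socket
from typing import Callable, List, Optional, Tuple

Endpoint = Tuple[str, int]


def discover_endpoints(
    hostname: str,
    port: int,
    resolver: Callable = socket.getaddrinfo,
) -> List[Endpoint]:
    """Return every distinct (address, port) the DNS name resolves to."""
    seen: List[Endpoint] = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
        hostname, port, type=socket.SOCK_STREAM
    ):
        endpoint = (sockaddr[0], sockaddr[1])
        if endpoint not in seen:
            seen.append(endpoint)
    return seen


def first_healthy(
    endpoints: List[Endpoint], timeout: float = 1.0
) -> Optional[Endpoint]:
    """Return the first endpoint that accepts a TCP connection, else None."""
    for endpoint in endpoints:
        try:
            with socket.create_connection(endpoint, timeout=timeout):
                return endpoint
        except OSError:
            continue  # instance is down or unreachable; try the next one
    return None
```

A client would call `discover_endpoints` with its own proxy name and port, optionally shuffle the result to spread load, and hand the endpoint chosen by `first_healthy` to its AMQP library. Because any instance can serve any client, failover is just a matter of repeating the lookup.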

We don't currently have any plans to have the amqpprox instances communicate amongst themselves, as we'd prefer to leave that to a layer above and generally keep amqpprox relatively single purpose.

@micdenny
Author

Thank you very much for the explanation, that clarifies some of my doubts.

Now I need to work out whether a plain TCP balancer would be enough for my purposes, because amqpprox would add another layer on top of a balancer. As you said, simple service discovery might be enough, but we don't have that in place, so my network guy will push me to use our existing TCP balancer to distribute traffic across the multiple amqpprox instances. The next question he will ask me is "why use amqpprox and not just our balancer?", and for our purposes I don't know the answer right now 😄. At the moment I'm only thinking about putting a balancer in front of my RabbitMQ clusters to manage node maintenance better and to avoid having all connections land on one node.

As far as I understand, amqpprox is convenient if you want to scale out per vhost, but I was trying to understand whether it could also be a better balancer, because it actually understands AMQP.
