High availability (HA) #57

micdenny · 2021-09-20T13:53:25Z

First of all, thank you for the amazing effort on this great tool for AMQP.

I was wondering what you do at Bloomberg to protect against a possible failure of the system hosting amqpprox. Do you have more than one instance running on different servers? If one fails, how do you direct the traffic to another instance? Do you have some sort of clustering over amqpprox? Any plans for that?

I am thinking of adopting amqpprox in an enterprise scenario, and I try to make every single point of failure highly available.

alaric · 2021-09-21T16:54:25Z

Yes, internally we do run multiple instances to provide HA on the proxy instances (and indeed also the brokers as clusters). Items such as the ports and control sockets can be parameterised for when you do need to run multiple independent versions of amqpprox on a single host, but we would also recommend having multiple hosts running amqpprox with the same configuration. We may add SO_REUSEPORT support to allow multiple instances listening on the same port on a single host, but that's likely more useful for scaling rather than HA.
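As a generic illustration of the SO_REUSEPORT socket option mentioned above, here is a minimal Python sketch. It is not amqpprox code, and per the comment above amqpprox does not currently support this. The helper name `listen_with_reuseport` is hypothetical, and the option is only available on platforms that provide it, such as Linux and the BSDs.

```python
import socket


def listen_with_reuseport(port, host="127.0.0.1"):
    """Open a listening TCP socket that allows other sockets to share its port.

    Every socket sharing the port must set SO_REUSEPORT before bind().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: the platform defines SO_REUSEPORT (Linux, BSDs, macOS).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```

With this option set, the kernel distributes incoming connections among the listeners on the port, which is why it helps more with scaling than with surviving the loss of a host.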

Unfortunately our layer above the amqpprox binary that coordinates its configuration and orchestrates vhost migrations across multiple amqpprox hosts is quite tied to Bloomberg specific infrastructure, so it's unlikely to be open sourced.

As you mention, directing traffic to a set of machines running the amqpprox is an issue, but not one that for us needs to be vhost-specific, so we can use generic service discovery infrastructure via DNS in any client library to find all the amqpprox instances, then any can service the clients. You can think of this as similar to how Consul works for service discovery with health checking. I'd imagine using keepalived or BGP/ECMP would also work just fine to do this at the network layer, similar to how something like HAProxy could be set up. Which approach would work best would depend more on how you do the rest of your infrastructure.
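The DNS-based approach described above could be sketched on the client side roughly as follows. This is a minimal sketch and not Bloomberg's implementation: it assumes a single DNS name resolves to every amqpprox host, and the function names, the hostname in the usage note, and the timeout are all hypothetical.

```python
import random
import socket


def resolve_endpoints(host, port):
    """Return every distinct (address, port) pair that name resolution reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    endpoints = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        endpoint = (sockaddr[0], sockaddr[1])
        if endpoint not in endpoints:
            endpoints.append(endpoint)
    return endpoints


def connect_first_available(endpoints, timeout=2.0):
    """Try the endpoints in random order and return (socket, endpoint) for the first that accepts."""
    candidates = list(endpoints)
    random.shuffle(candidates)  # spread clients across the proxy instances
    last_error = None
    for endpoint in candidates:
        try:
            return socket.create_connection(endpoint, timeout=timeout), endpoint
        except OSError as exc:
            last_error = exc  # this instance is unreachable, try the next one
    raise ConnectionError(f"no proxy instance reachable: {last_error}")
```

A client might call `connect_first_available(resolve_endpoints("amqpprox.example.internal", 5672))`, where the hostname is a placeholder, and then hand the chosen address to its AMQP library. Unlike Consul-style discovery, this sketch does no health checking beyond the connection attempt itself.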

We don't currently have any plans to have the amqpprox instances communicate amongst themselves, as we'd prefer to leave that to a layer above and generally keep amqpprox relatively single purpose.

micdenny · 2021-09-21T17:42:09Z

Thank you very much for the explanation, that clarifies some of my doubts.

Now I have to work out whether a plain TCP balancer would serve my purposes just as well, because amqpprox would add another layer on top of a balancer. As you said, simple service discovery might be enough, but we don't have that in place yet, so my network guy will push me to simply use our existing TCP balancer to distribute traffic across the multiple amqpprox instances. The next question he will ask me is "why use amqpprox and not just our balancer?", and for our purposes I probably don't know the answer right now 😄. At the moment I'm only thinking about putting a balancer in front of my RabbitMQ clusters to better manage node maintenance and to avoid having all connections land on one node.

As far as I understand, amqpprox is convenient if you want to scale out per vhost, but I was trying to understand whether it could also be a better balancer, because it actually understands AMQP.

alaric self-assigned this Sep 20, 2021

micdenny mentioned this issue Sep 21, 2021

Configuration persistence #58

Closed
