
Initial node connection should be influenced by node priority #25

Open

mariano opened this issue Mar 7, 2016 · 3 comments

Comments

mariano (Owner) commented Mar 7, 2016

The documentation of the HELLO command states that a priority is given for each node in the cluster, and that this priority is a number where lower is better, meaning the node is more available.

This can be used to influence the initial node connection.
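
For illustration only, a minimal sketch of how those priorities could drive the choice of node (the reply layout follows the Disque documentation; the function name is hypothetical and not part of this library):

// Hypothetical helper, not this library's API: pick the most available node
// from a HELLO reply, whose layout per the Disque docs is
// [version, myNodeId, [nodeId, host, port, priority], [nodeId, host, port, priority], ...]
function pickBestNode(array $hello)
{
    $nodes = array_slice($hello, 2);
    usort($nodes, function (array $a, array $b) {
        return $a[3] - $b[3]; // index 3 holds the node priority; lower is better
    });
    return $nodes[0];
}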

Revisor commented Mar 7, 2016

Just to recap, the first connection right now occurs as follows:

  1. Disque\Connection\Manager::connect() calls
  2. Manager::findAvailableConnection(), which shuffles all known nodes randomly and tries to connect to the first working one by calling
  3. Disque\Connection\Node\Node::connect(), which connects to the node, authenticates with a password, if any, and says HELLO in
  4. Node::sayHello()
  5. Back in the Manager::connect(), we call
  6. Manager::switchToNode(), which calls
  7. Manager::revealClusterFromHello(). This is the point where we know the cluster for the first time.

In order to switch to the best node at this moment, I would call Manager::switchNodeIfNeeded() at the end of Manager::connect(). If there is a better node according to the currently set node prioritizer strategy, the connection will switch to the best node.
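
Roughly like this, as a condensed sketch of the flow above plus the proposed call (details and argument shapes are guesses, not the actual Manager source):

// Condensed sketch only; not the actual Manager source.
public function connect()
{
    // Steps 1-4: shuffle the known nodes, connect to the first one that
    // answers, authenticate if needed, and say HELLO
    $node = $this->findAvailableConnection();

    // Steps 5-7: make that node current; switchToNode() reveals the
    // cluster from the node's HELLO response
    $this->switchToNode($node);

    // Proposed addition: if the prioritizer strategy now knows a better
    // node, switch to it right away
    $this->switchNodeIfNeeded();
}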

I see a few points that need consideration:

  • In a short-lived program, like a typical PHP request, this could force the request to wait for two connections to Disque if the node we happened to connect to first is not the best one, even though it might be quite fine. (Can there be a node that is "quite fine"? Disque uses only three priorities right now: 1 - OK, 10 - possible failure and 100 - failure.)
  • On the other hand, if the client connects to a failing node, that might also cause problems, e.g. with ADDJOB.
  • If we don't do the initial switch, the connection will switch automatically after the first GETJOB command anyway (more pertinent to long-running PHP programs that read many jobs).

In the light of this, I suggest the following:

  • Connect to the first working node, chosen at random
  • Reveal the cluster from the HELLO response
  • Only switch if the current node is failing (its priority is higher than 10).
  • If Disque implements priorities between 1 and 9 in the future, don't switch if the current node's priority is lower than 10, i.e. it is working but may just be slower.

This could be done by writing a new node prioritizer strategy, perhaps called HealthyNodePrioritizer, and using it for this first check.
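
A rough sketch of what such a strategy could look like (the interface name, the sort() signature and Node::getPriority() are assumptions used only to illustrate the idea, not the library's actual contracts):

// Illustrative only; interface and method names are assumptions.
class HealthyNodePrioritizer implements NodePrioritizerInterface
{
    const FAILING_PRIORITY = 10;

    /**
     * Keep the current node unless Disque reports it as failing
     * (priority above 10); only then prefer other, healthier nodes.
     */
    public function sort(array $nodes, $currentNodeId)
    {
        $current = $nodes[$currentNodeId];
        if ($current->getPriority() <= self::FAILING_PRIORITY) {
            // Current node is OK or at worst "possible failure": don't switch
            return [$currentNodeId => $current];
        }

        // Current node is failing: order the candidates by ascending priority
        uasort($nodes, function ($a, $b) {
            return $a->getPriority() - $b->getPriority();
        });
        return $nodes;
    }
}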

What do you think?

mariano (Owner, Author) commented Mar 7, 2016

I like your suggestion; it goes in line with: connect ASAP, but if Disque is telling us the node is no good (maybe it's leaving the cluster), then and only then switch away.

While we modify the connection code we might as well add support for #28: when a specific command tells us that the node we sent it to is actually leaving the cluster, issue a reconnect to another node (which in our case, if we implement priorities as you mention, would simply mean triggering a reconnect).
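
Something along these lines, as a hedged sketch only (the exception type and the exact hook point are placeholders, not the library's actual API):

// Placeholder names throughout; only illustrates "reconnect and retry once
// when the node answers -LEAVING", as discussed in #28.
try {
    $response = $this->execute($command);
} catch (\Exception $e) {
    if (strpos($e->getMessage(), 'LEAVING') === 0) {
        // The node we were talking to is leaving the cluster: pick another
        // node (honoring priorities) and retry the command once
        $this->switchNodeIfNeeded();
        $response = $this->execute($command);
    } else {
        throw $e;
    }
}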

Revisor commented Mar 7, 2016

I think it is not possible to connect to a leaving node. If the connection, authentication, or HELLO returns an error (and thus throws an exception), the node is skipped and we try to connect to the next one.

The initial connection logic is in Disque\Connection\Manager::findAvailableConnection()
The switch connection logic is in Disque\Connection\Manager::switchNodeIfNeeded()

Both use a similar pattern:

foreach ($nodes as $node) {
    try {
        $node->connect(); // connect, authenticate, say HELLO
    } catch (\Exception $e) {
        continue; // node unreachable or failing; try the next one
    }
    break; // connected successfully
}

However, we do need to support -LEAVING for other situations like you describe in #28, that's true.

mariano mentioned this issue Mar 7, 2017