Add a new failover option prioritising nodes #1896

aolley · 2020-12-21T04:45:39Z

This makes it possible to define a list of nodes you'd prefer to communicate
with in cluster mode.

This behaves similarly to FAILOVER_DISTRIBUTE, which randomises the list of
nodes before trying them in sequence. The difference is that it takes that
random list and sorts any preferred nodes to the top. All candidate nodes are
still in the list, but the preferred ones get tried first.

This is extremely helpful for setups where you know which nodes are closer
(i.e. not crossing an AZ boundary).
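
As a rough illustration of the ordering described above (shuffle everything, then float the preferred nodes to the top), here is a minimal Python sketch. It is not the code from this PR: the function name `order_candidates` and the `host:port` string representation of nodes are assumptions made purely for the example.

```python
import random


def order_candidates(nodes, preferred, rng=random):
    """Return every candidate node, preferred ones first.

    Hypothetical helper: `nodes` and `preferred` are iterables of
    "host:port" strings (an assumption for this sketch).
    """
    shuffled = list(nodes)
    # Randomise the whole list first, as FAILOVER_DISTRIBUTE is described to do.
    rng.shuffle(shuffled)
    preferred_set = set(preferred)
    # sorted() is stable, so the random order inside each group is kept.
    # False sorts before True, so preferred nodes end up at the front.
    return sorted(shuffled, key=lambda node: node not in preferred_set)
```

With an empty `preferred` list this degrades to a plain shuffle, which is also the case a real implementation has to handle without crashing.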

aolley · 2020-12-21T04:46:21Z

This is for issue #1194.

aolley · 2020-12-21T04:49:30Z

I just noticed you get a segfault when you run a command with distribute mode set but no preferred nodes set... (fixed in subsequent commits)


Redis can be in a state where a replica responds with MOVED for a slot even though that replica should be able to serve that slot. The reason is immaterial: the Redis spec says you must perform your command on the host:port the MOVED response specifies. If we always sort our preferred nodes higher than the host:port it says we should read from, we end up never reading from it. This change short-circuits that by disabling the FAILOVER mode entirely when this situation is encountered, ensuring the command then goes to the primary. This has a slight benefit over FAILOVER_DISTRIBUTE_SLAVES mode: that mode also works, but it randomly picks a node out of the list (primary + replicas) each time, so you only continue once you happen to pick the primary. By switching to NONE, we immediately talk to the primary.
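
The MOVED handling described above can be sketched roughly as follows. This is an illustrative Python model only, not the extension's code: the constants, the `pick_target` function, and the `got_moved` flag are hypothetical names introduced for the example.

```python
# Hypothetical mode constants for this sketch only.
FAILOVER_NONE = "none"
FAILOVER_PREFERRED = "preferred"


def pick_target(primary, ordered_candidates, mode, got_moved):
    """Choose the node to send a read-only command to.

    `ordered_candidates` is assumed to already have preferred nodes first.
    `got_moved` is True once a node has answered MOVED for this command.
    """
    if got_moved:
        # After a MOVED reply, stop applying the failover ordering so the
        # retry is not steered back to a preferred replica.
        mode = FAILOVER_NONE
    if mode == FAILOVER_NONE or not ordered_candidates:
        # No failover (or nothing to choose from): talk to the primary.
        return primary
    return ordered_candidates[0]
```

Falling back to the primary when the candidate list is empty is a choice made for this sketch; it keeps the example safe when no preferred nodes are configured.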

aolley mentioned this pull request Dec 21, 2020

[Improvement] RedisCluster::OPT_SLAVE_FAILOVER run readonly commands on the selected node #1194

Open

aolley force-pushed the feature/failover_preferred branch 3 times, most recently from 5f8117f to 84f44c6 Compare March 4, 2021 13:18

michael-grunder self-assigned this Mar 26, 2021

aolley added 2 commits September 27, 2021 16:52

aolley force-pushed the feature/failover_preferred branch from 84f44c6 to c004a75 Compare September 27, 2021 07:22
