auto_eject_drop as an alternative to auto_eject_hosts #213

Open

mezzatto opened this issue Apr 4, 2014 · 2 comments

@mezzatto (Contributor) commented Apr 4, 2014

Background:

  • modula distribution
  • using twemproxy in front of redis servers that are acting as databases, not as caches
  • data is sharded between these redis servers
  • master / slave configuration
  • application is 99% reads, almost no writes
  • client balances the load between the master and slave instances. For every request, the client chooses whether it goes to the master pool or the slave pool in twemproxy (50/50 weight)

If a redis server goes offline and the client happens to choose the twemproxy pool that contains it, every request whose key hashes to that server has to wait for the timeout before getting a reply and retrying on the other pool. This is bad, and I cannot use auto_eject_hosts, since ejecting a host changes the shards / hash ring.

I thought about implementing a new config option, auto_eject_drop, with values true (current behavior, default) and false (my proposal), which would:

  • when a server fails, not remove it from the continuum, so the number of live servers and therefore the shards stay the same
  • immediately reply with "ERR timeout" to any request whose key hashes to the failing server

One way to do that might be to change req_forward() so that, when the selected server is failing, the message is not enqueued and an error reply is sent back immediately. I don't know if this is the best solution, but it is the one with the smallest code change.
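
For illustration, a pool using the proposed option could look like this (a sketch only: auto_eject_drop is the new setting proposed above, not an existing twemproxy option; the other keys are standard twemproxy pool settings):

    beta:
      listen: 127.0.0.1:22122
      hash: fnv1a_64
      distribution: modula
      redis: true
      timeout: 400
      auto_eject_hosts: true
      server_retry_timeout: 30000
      server_failure_limit: 3
      auto_eject_drop: false    # proposed: keep failing hosts in the continuum,
                                # but answer their requests with "ERR timeout" immediately
      servers:
       - 127.0.0.1:6380:1 server1
       - 127.0.0.1:6381:1 server2

With auto_eject_drop: false, a failing server stays in the continuum (so keys keep hashing to the same shards), but requests destined for it get an immediate error instead of blocking until the configured timeout.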

Any thoughts?

mezzatto pushed a commit to mezzatto/twemproxy that referenced this issue Apr 4, 2014
A boolean value that controls whether auto-ejected hosts should be dropped from the hash ring. If set to false, requests to failing hosts immediately get a timeout error reply. Defaults to true.

See twitter#213 for more information
@sherman commented Dec 23, 2014

It seems that the problem with mget and del commands still exists :-(
Sometimes I have to wait for the timeout in the case of mget, and sometimes the response is ERR connection refused. I can't figure out what it depends on.

Config:

supercluster:
  listen: 127.0.0.1:22121
  hash: fnv1a_64
  distribution: modula
  auto_eject_hosts: true
  auto_eject_drop: false
  redis: true
  server_retry_timeout: 2000
  server_failure_limit: 1
  preconnect: true
  timeout: 400
  servers:
   - 127.0.0.1:7001:1 node1
   - 127.0.0.1:7002:1 node2
@TysonAndre changed the title from "auto_eject_hosts mode design" to "auto_eject_drop as an alternative to auto_eject_hosts" on Jul 1, 2021
@TysonAndre (Collaborator) commented
Not exactly the same as your use case, but see #608

An upcoming 0.6.0 is planned with the following:

  • Support for redis sentinel, redis's official mechanism for host failover
  • A failover pool for memcache: instead of ejecting failing hosts or repeatedly trying to reconnect to them, requests for keys belonging to a failing host are sent to the failover pool
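
For context, the failover pool idea could be wired up roughly like this (a hypothetical sketch based on the description above; the pool names and the failover key are assumptions about the 0.6.0 design and may not match the final syntax):

    main:
      listen: 127.0.0.1:11211
      hash: fnv1a_64
      distribution: ketama
      timeout: 400
      failover: backup        # assumed key: route requests for failing hosts to the "backup" pool
      servers:
       - 127.0.0.1:11212:1
       - 127.0.0.1:11213:1

    backup:
      listen: 127.0.0.1:11311
      hash: fnv1a_64
      distribution: ketama
      timeout: 400
      servers:
       - 127.0.0.1:11214:1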
