storage/redis: reimplement using scripts #472

Open · wants to merge 1 commit into main

Conversation

mrd0ll4r (Member)

This possibly fixes #471, and may also improve performance, but that hasn't been measured.

@mrd0ll4r (Member Author)

PTAL


// SCAN through all of our swarm keys.
for {
values, err := redis.Values(conn.Do("SCAN", cursor, "MATCH", "swarm_*", "COUNT", 50))
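
For orientation, a cursor loop like this is typically completed along the following lines with redigo. This is only a sketch: the function name, the `visit` callback, and the error handling are illustrative, not the PR's exact code.

```go
package main

import "github.com/gomodule/redigo/redis"

// scanSwarmKeys walks every key matching swarm_* and hands each one to visit.
// Sketch only; names and error handling are illustrative.
func scanSwarmKeys(conn redis.Conn, visit func(key string) error) error {
	cursor := int64(0)
	for {
		values, err := redis.Values(conn.Do("SCAN", cursor, "MATCH", "swarm_*", "COUNT", 50))
		if err != nil {
			return err
		}

		// The reply is a two-element array: the next cursor and a batch of keys.
		cursor, err = redis.Int64(values[0], nil)
		if err != nil {
			return err
		}
		keys, err := redis.Strings(values[1], nil)
		if err != nil {
			return err
		}

		for _, key := range keys {
			if err := visit(key); err != nil {
				return err
			}
		}

		// SCAN signals a complete iteration by returning cursor 0.
		if cursor == 0 {
			return nil
		}
	}
}
```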
Member

This is okay as long as there's only one instance of Chihaya. Once multiple instances try to GC the same redis instance/cluster, this strategy is non-ideal. I think calculating a shard size based on the total size of the database, then choosing one shard at random and GCing all of its keys, might be better.

Member

On another note -- is there a reason we don't do this part in the script as well to avoid round trips?

mrd0ll4r (Member Author) · Mar 18, 2020

SCAN is nondeterministic, and Redis doesn't allow scripts to run write commands (such as deletions) after anything nondeterministic, so as not to break replication (see the Redis scripting documentation).
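
Concretely, the split looks roughly like this: the client drives the SCAN loop (as in the excerpt above) and hands each batch of keys to a small, deterministic script that only performs the writes. This is a sketch using redigo's `redis.NewScript`; the script body and helper names are illustrative, not the contents of `scripts.go`.

```go
package main

import "github.com/gomodule/redigo/redis"

// deleteKeysScript deletes exactly the keys it is handed. Because the script
// never calls SCAN itself, it stays deterministic and replication-safe; the
// nondeterministic SCAN runs on the client. Illustrative, not scripts.go.
var deleteKeysScript = redis.NewScript(-1, `
for _, key in ipairs(KEYS) do
	redis.call("DEL", key)
end
return #KEYS
`)

// deleteBatch sends one batch of SCAN results to the script in a single round trip.
func deleteBatch(conn redis.Conn, keys []string) error {
	if len(keys) == 0 {
		return nil
	}
	// With a negative keyCount, redigo expects the key count as the first argument.
	args := make([]interface{}, 0, len(keys)+1)
	args = append(args, len(keys))
	for _, k := range keys {
		args = append(args, k)
	}
	_, err := deleteKeysScript.Do(conn, args...)
	return err
}
```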

I mostly get what you're saying in your first comment: instead of processing everything with a swarm_ prefix, pick one swarm_<single byte> shard at random and GC that. (The first byte is the largest shard we can get; a sketch of such a shard pattern follows the list below.)
I don't like it, because:

  1. As the largest shards are 1/256th of the whole keyspace, we would need to GC 256 times as often, on average, to get the same rate of garbage collection as before (or cover multiple shards in one GC pass).
  2. This rate changes with the number of chihaya instances running.
  3. We could try a counter, volatile keys, or some other mechanism for counting the number of active chihaya instances, but this is non-trivial. (This is an interesting problem and I kind of want to work on it, but not for this PR.)
  4. The added randomness is good for distributed systems like this, but I would argue we don't need it: The SCAN command is already nondeterministic, i.e. it depends on some internal state of the redis instance. There are a bunch of properties around what keys are returned by SCAN, which make me think that this is generally a very elegant solution to our problem.
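
For concreteness, the alternative described above would boil down to something like the following sketch. The swarm_ key layout is assumed from the snippet above, and escaping is needed because a binary shard byte may collide with the glob metacharacters used by MATCH.

```go
package main

import "math/rand"

// randomShardPattern picks one of the 256 possible first infohash bytes at
// random and builds a SCAN MATCH pattern covering only that 1/256th shard.
// Key layout is assumed; *, ?, [ and \ are glob metacharacters and must be
// escaped when the shard byte collides with them.
func randomShardPattern() string {
	b := byte(rand.Intn(256))
	switch b {
	case '*', '?', '[', '\\':
		return "swarm_\\" + string([]byte{b}) + "*"
	default:
		return "swarm_" + string([]byte{b}) + "*"
	}
}
```

The GC pass would then run the same cursor loop as above with this pattern instead of swarm_*, which is exactly why it would have to run roughly 256 times as often (or cover several shards per pass) to keep the collection rate mentioned in point 1.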

Member

> The added randomness is good for distributed systems like this, but I would argue we don't need it: The SCAN command is already nondeterministic, i.e. it depends on some internal state of the redis instance. There are a bunch of properties around what keys are returned by SCAN, which make me think that this is generally a very elegant solution to our problem.

SCAN is nondeterministic because it's implemented as a cursor in a database that isn't MVCC, so parallel operations can change values out from under it while you're iterating. That's all it means; it still iterates in the same order across clients. I've both read the code and tested this locally to confirm that this is the case. The guarantees listed in that doc are exactly the same ones our keyspace iteration in the memory implementation provides.
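
One way to check that locally (a sketch; the address and COUNT are placeholders) is to compare the first SCAN batch seen by two independent connections against a quiescent database:

```go
package main

import (
	"fmt"
	"reflect"

	"github.com/gomodule/redigo/redis"
)

// firstBatch returns the keys from a single SCAN call starting at cursor 0.
func firstBatch(addr string) ([]string, error) {
	conn, err := redis.Dial("tcp", addr)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	values, err := redis.Values(conn.Do("SCAN", 0, "MATCH", "swarm_*", "COUNT", 50))
	if err != nil {
		return nil, err
	}
	return redis.Strings(values[1], nil)
}

func main() {
	a, err := firstBatch("localhost:6379")
	if err != nil {
		panic(err)
	}
	b, err := firstBatch("localhost:6379")
	if err != nil {
		panic(err)
	}
	// On a quiescent database, both clients see the same batch in the same order.
	fmt.Println("same first batch:", reflect.DeepEqual(a, b))
}
```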

> This rate changes with the number of chihaya instances running.

This is already the case: if I run two instances of chihaya, garbage collection on each instance runs at the same interval, just at different offsets based on when the instance was started. If both were started around the same time, they will basically iterate over the whole database one right after the other, which makes the second instance functionally useless while also being expensive on the database.

In order to maximize the usefulness of each instance's garbage collection, the goal should be a cluster-wide solution. A solution could roughly take one of two shapes: elect a single garbage-collection process to run across the deployment, or devise a strategy for collaboration across instances. If we're clever, that collaboration need not require coordination, which is complicated to implement.

In an unrelated project, we have a means of coordination-free work for backfilling database values. You can keep spinning up more workers and the progress will get faster and faster. For this implementation, the requirements are idempotent work and a uniform keyspace indexed by integers that you can use as bounds.
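
As a rough illustration of that pattern (all names, the keyspace size, and the chunk size here are assumptions, since the project itself isn't described further): each worker repeatedly grabs a random integer range and processes it; because the work is idempotent, overlap between workers is wasted effort rather than a correctness problem, so adding workers only increases throughput.

```go
package main

import "math/rand"

const (
	keyspaceSize = 1 << 20 // assumed size of the integer-indexed keyspace
	chunkSize    = 1024    // assumed number of ids covered per pass
)

// backfillPass processes one randomly chosen chunk of the keyspace.
// processItem must be idempotent, so overlapping passes from other workers
// are harmless and no coordination is needed.
func backfillPass(processItem func(id int) error) error {
	lo := rand.Intn(keyspaceSize - chunkSize)
	for id := lo; id < lo+chunkSize; id++ {
		if err := processItem(id); err != nil {
			return err
		}
	}
	return nil
}
```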

While this is not directly applicable, I believe that thinking about a strategy along these lines could provide a more ideal solution. Thus, I'd rather discuss and explore this train of thought before compromising and doing something like staggering when GC starts or using a global lock to elect the next instance to GC.

@jzelinskie (Member)

Hey, I don't think we should leave this blocked on the GC design.
We can always make it more efficient in a follow-up.

Development

Successfully merging this pull request may close these issues.

infohashes_count goes into negative values