key pool can cause memtier to hang when there are multiple shards and request count is below pool size #204

Closed
filipecosta90 opened this issue Feb 6, 2023 · 0 comments · Fixed by #117 · May be fixed by #205
@filipecosta90
Collaborator

This can be easily reproduced on master (or any previous stable version) with a multi-shard scenario on an OSS cluster.
With a simple >1 shard setup:

$ redis-cli
127.0.0.1:6379> cluster slots
1) 1) (integer) 0
   2) (integer) 5461
   3) 1) "127.0.0.1"
      2) (integer) 6379
      3) "ae7607ffdb3519473d83a9420f3a00c162820a8a"
      4) (empty array)
2) 1) (integer) 5462
   2) (integer) 10923
   3) 1) "127.0.0.1"
      2) (integer) 6381
      3) "ffa17c93a302f721c18377c7256053ecb8175e97"
      4) (empty array)
3) 1) (integer) 10924
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 6383
      3) "eafe2ee085ca92723662dd18661088f70962c197"
      4) (empty array)

You can see that there is no assurance that the 4 generated requests are spread across the connections that should be sending them (given we have 3 connections).

$ memtier_benchmark  --threads 1 --clients 1 -n 4 --ratio 0:1 --cluster-mode --hide-histogram -D
Writing results to stdout
[RUN #1] Preparing benchmark client...
client.cpp:111: new client 0x55a181879400 successfully set up.
[RUN #1] Launching threads now...
shard_connection.cpp:372: sending cluster slots command.
shard_connection.cpp:425: cluster slot command successful
shard_connection.cpp:587: server 127.0.0.1:6379: GET key=[memtier-1168204]
shard_connection.cpp:587: server 127.0.0.1:6383: GET key=[memtier-6263819]
shard_connection.cpp:438: server 127.0.0.1:6379: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:587: server 127.0.0.1:6379: GET key=[memtier-3771236]
shard_connection.cpp:438: server 127.0.0.1:6383: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:587: server 127.0.0.1:6383: GET key=[memtier-5586315]
shard_connection.cpp:438: server 127.0.0.1:6379: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:438: server 127.0.0.1:6383: handled response (first line): $-1, 0 hits, 1 misses
client.cpp:219: nothing else to do, test is finished.
[RUN #1 100%,   0 secs]  1 threads:           4 ops,    7782 (avg:    7782) ops/sec, 303.99KB/sec (avg: 303.99KB/sec),  0.20 (avg:  0.20) msec latency

This is due to the following condition

m_config->requests > 0 && m_reqs_processed >= m_config->requests

combined with the following code at https://github.com/RedisLabs/memtier_benchmark/blob/master/cluster_client.cpp#L352

// store key for other connection, if queue is not full
key_index_pool* key_idx_pool = m_key_index_pools[other_conn_id];
if (key_idx_pool->size() < KEY_INDEX_QUEUE_MAX_SIZE) {
    key_idx_pool->push(*key_index);
    m_reqs_generated++;   // counted as generated even though this connection will not send it
}

This can be solved by not pushing keys to other connections' pools when there are not enough remaining requests to fill the pools of all shards; a minimal sketch of such a guard is shown below.
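For illustration only (this is not the patch from the linked PR): the sketch assumes m_key_index_pools is an indexable container with one pool per shard connection, so num_shard_conns and the exact bookkeeping are illustrative names layered on top of the snippet above.

// Sketch: only hand a key to another connection's pool when enough requests
// remain for every shard connection to still receive work of its own.
// num_shard_conns is a hypothetical name; the real member layout may differ.
size_t num_shard_conns = m_key_index_pools.size();
bool unlimited = (m_config->requests == 0);
bool enough_left = unlimited ||
                   (m_config->requests - m_reqs_generated > num_shard_conns);

key_index_pool* key_idx_pool = m_key_index_pools[other_conn_id];
if (enough_left && key_idx_pool->size() < KEY_INDEX_QUEUE_MAX_SIZE) {
    key_idx_pool->push(*key_index);
    m_reqs_generated++;
}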

@filipecosta90 filipecosta90 self-assigned this Feb 6, 2023
@filipecosta90 filipecosta90 linked a pull request Feb 6, 2023 that will close this issue
YaacovHazan added a commit to YaacovHazan/memtier_benchmark that referenced this issue May 17, 2023
In cluster mode, when there are only a few requests spread over many shards, some of the shards might not send any requests.

Currently, memtier_benchmark hangs, because the connections to these shards never disable their read/write events.
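Assuming the shard connections are driven by libevent bufferevents (as shard_connection.cpp is), the described behavior roughly corresponds to a sketch like the following; m_bev and has_pending_work() are illustrative names, not the committed fix.

// Sketch only: when a shard connection has no requests left to send for
// this run, stop listening for read/write events so the event loop can
// finish instead of waiting forever.
if (!has_pending_work()) {
    bufferevent_disable(m_bev, EV_READ | EV_WRITE);
}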
YaacovHazan added a commit that referenced this issue May 17, 2023
Co-authored-by: YaacovHazan <yaacov.hazan@redis.com>