TS.MGET / TS.MRANGE - Multi-shard command failed. #1553

Open
Skoucail opened this issue Dec 19, 2023 · 4 comments

Comments

@Skoucail

When executing the commands TS.MGET/TS.MRANGE, I keep getting the error: "Multi-shard command failed. This may happen if a shard needs to process too much data. Try to apply strict filters, if possible."

The strange thing is that even if I make my TS.MGET/TS.MRANGE FILTER so specific that it should return only one time series (so basically applying a strict filter), the same error is returned.

I tried with and without executing 'timeseries.REFRESHCLUSTER' before the TS.MGET/TS.MRANGE commands, but I get the same result (the error) either way.
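
For readers who want to reproduce the narrowly filtered query, here is a minimal Python sketch of how such a TS.MGET call could be built. It is not taken from the issue: the label names `site` and `sensor` are hypothetical, and `client` is assumed to be any object that exposes an `execute_command` method in the style of redis-py.

```python
from typing import Any, List, Mapping


def build_mget_args(labels: Mapping[str, str], with_labels: bool = False) -> List[str]:
    """Build the TS.MGET argument list with one label=value filter per entry."""
    if not labels:
        raise ValueError("TS.MGET needs at least one label=value filter")
    args = ["TS.MGET"]
    if with_labels:
        args.append("WITHLABELS")
    args.append("FILTER")
    # Each additional label=value expression narrows the set of matching series.
    args.extend(f"{name}={value}" for name, value in labels.items())
    return args


def mget(client: Any, labels: Mapping[str, str], with_labels: bool = False) -> Any:
    """Send the command through a client assumed to offer execute_command()."""
    return client.execute_command(*build_mget_args(labels, with_labels))
```

Combining several label filters like this should match a single series when the labels identify it uniquely, which is the situation described above.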

Example: [three screenshots]

Setup:
6-node Redis Stack cluster (3 masters, 3 slaves)
Timeseries version v1.10.04
Number of time series: approximately 800 (split over the 3 master nodes)

@LiorKogan
Member

LiorKogan commented Dec 19, 2023

You'll get this error when libMR reaches a timeout while waiting for results from all the shards. This means that at least one shard needs to process too much data or there is some communication problem/slowdown. It can also happen if one of the shards crashes or is otherwise not available.
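
One of the causes listed above, an unavailable shard, can be partly ruled out by inspecting the output of `CLUSTER NODES`. The sketch below is an illustration only: it parses that output and reports nodes flagged `fail` or `fail?`, or whose link state is not `connected`. It checks the Redis cluster bus, not libMR's own connections between shards, so a clean result does not exclude a libMR communication problem.

```python
from typing import Dict, List


def unhealthy_nodes(cluster_nodes_output: str) -> List[Dict[str, str]]:
    """Return nodes that CLUSTER NODES reports as failing or disconnected."""
    problems = []
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete node line
        node_id, address, flags, link_state = fields[0], fields[1], fields[2], fields[7]
        flag_set = set(flags.split(","))
        # "fail?" marks a suspected failure (PFAIL), "fail" a confirmed one.
        if flag_set & {"fail", "fail?"} or link_state != "connected":
            problems.append({
                "id": node_id,
                "address": address.split("@")[0],  # drop the cluster bus port
                "flags": flags,
                "link_state": link_state,
            })
    return problems
```

The text to pass in is the raw reply of `CLUSTER NODES` from any node in the cluster.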

I'm not sure why it is happening in your specific case. MGET shouldn't generate a large reply for fewer than 800 time series. You are also not using too many labels, so the processing is expected to be fast.

Unrelated: it seems that you are using a label named DUPLICATE_POLICY. Is it intentional?

@tezc
Collaborator

tezc commented Dec 19, 2023

Do you have access to redis log files? If so, we might see some log lines that indicate the error.

@Skoucail
Author

Update:

It seems I'm only getting the error when sending the command to SLAVE nodes.
[screenshot]
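
As a possible stopgap while this is investigated, a client could check a node's role before sending multi-shard queries to it. The sketch below is an assumption-laden illustration, not a confirmed fix: whether libMR is meant to work on slave nodes is still an open question in this thread. `client` is again assumed to be any object with a redis-py style `execute_command` method.

```python
from typing import Any


def node_role(client: Any) -> str:
    """Return the first element of the ROLE reply: 'master', 'slave' or 'sentinel'."""
    reply = client.execute_command("ROLE")
    role = reply[0]
    # Clients that do not decode responses return bytes.
    if isinstance(role, bytes):
        role = role.decode()
    return role


def is_master(client: Any) -> bool:
    """True when the connected node reports itself as a master."""
    return node_role(client) == "master"
```

A caller could skip any connection for which `is_master` returns False and send TS.MGET/TS.MRANGE only to the remaining nodes.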

The nodes run in Docker, so I have access to the log files.
But they basically say the same thing LiorKogan guessed.

Logging from the node running on port 6383

9:S 19 Dec 2023 12:05:47.040 * <timeseries> Got cluster refresh command
9:S 19 Dec 2023 12:05:56.659 # <timeseries> message was not sent because status is not connected
9:S 19 Dec 2023 12:05:56.660 # <timeseries> message was not sent because status is not connected
9:S 19 Dec 2023 12:05:56.660 # <timeseries> message was not sent because status is not connected
9:S 19 Dec 2023 12:05:56.660 * <timeseries> connected : xxx.xxx.xxx.xxx:6381, status = 0
9:S 19 Dec 2023 12:05:56.660 * <timeseries> connected : xxx.xxx.xxx.xxx:6379, status = 0
9:S 19 Dec 2023 12:05:56.660 * <timeseries> connected : xxx.xxx.xxx.xxx:6380, status = 0
9:S 19 Dec 2023 12:06:01.663 # <timeseries> got libmr error:
9:S 19 Dec 2023 12:06:01.663 # <timeseries> execution max idle reached
9:S 19 Dec 2023 12:11:03.273 # <timeseries> got libmr error:
9:S 19 Dec 2023 12:11:03.273 # <timeseries> execution max idle reached
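
The log lines above follow the standard Redis log layout: process ID, a role character (`S` for a replica, `M` for a master), a timestamp, a level character (`#` for warnings) and the message. As a rough aid for reading longer logs, the following sketch counts warning messages per role for one module. The function name and the default module name are choices made for this example only.

```python
import re
from collections import Counter
from typing import Iterable

LOG_LINE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[XCSM]) "
    r"(?P<timestamp>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) "
    r"(?:<(?P<module>[^>]+)> )?(?P<message>.*)$"
)


def summarize_warnings(lines: Iterable[str], module: str = "timeseries") -> Counter:
    """Count warning-level messages from one module, keyed by (role, message)."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # line does not follow the expected layout
        if match.group("level") == "#" and match.group("module") == module:
            counts[(match.group("role"), match.group("message"))] += 1
    return counts
```

Run against the excerpt above, every counted warning carries the role `S`, which matches the observation that the error appears only on slave nodes.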

@LiorKogan
Member

@MeirShpilraien is it possible to use libMR with slave nodes?
