RIOT not working when migrating data from 1 redis cluster to other #127

Open
iamPrakhar opened this issue Aug 10, 2023 · 12 comments


@iamPrakhar

Hi,

I am not sure if RIOT supports this, but I was trying to migrate one Redis cluster to a different Redis cluster using the command below.
Command:

root@migration-riot-redis-cluster-migration:/home/# ./riot-redis/bin/riot --info --timeout=600 -h 10.x.x.x -p 6379 -a '123456' -c -n 0 replicate --timeout=600 -h 10.x.x.x -p 6379 -a '12345678'  -c -n 0 --mode live

It is able to migrate the data on the first pass, but it fails on live migration. Is that expected in RIOT?

Executing step: [live]
Executing step: [scan]
Scanning   0% [                                                                                                                                                                             ]  0/13 (0:00:00 / ?) ?/s
Listening  ? % [=                                       ] 0/? (0:00:00 / ?) ?/s
Encountered an error executing step live in job replicate
Listening  ? % [=                                       ] 0/? (0:00:00 / ?) ?/s
Step: [live] executed in 303ms
Job: [SimpleJob: [name=scan-reader]] launched with the following parameters: [{}]
Executing step: [scan-reader]
Step: [scan-reader] executed in 292ms
Scanning 100% [=================================] 13/13 (0:00:00 / 0:00:00) ?/s{}] and the following status: [COMPLETED] in 310ms
Step: [scan] executed in 861ms
Job: [FlowJob: [name=replicate]] completed with the following parameters: [{}] and the following status: [FAILED] in 900ms
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
@iamPrakhar
Author

iamPrakhar commented Aug 10, 2023

@jruaux can you please help me with this? Am I doing something wrong in the command, or is this the expected behaviour in RIOT?
My source Redis cluster version is 6.0.8 and the destination Redis cluster version is 6.2.6.
My understanding was that a single RIOT process would be able to read data from all 3 masters and push that data to the destination Redis cluster.

@szy345

szy345 commented Aug 24, 2023

In live mode, this config must be set on the source Redis nodes. Did you do that?
config set notify-keyspace-events KA

@szy345

szy345 commented Aug 24, 2023

Maybe you can try this argument: replicate --type ds

@iamPrakhar
Author

In live mode, the Source of Redis nodes must modify this config. did you ? config set notify-keyspace-events KA

I set it on only one of the cluster's masters. That should be fine, right?
Or do I need to set it on all 3 masters?
Even with it set on just one master, I get all the data on the destination cluster, but live mode is not working.

@iamPrakhar
Author

@szy345 does RIOT not work with Redis cluster migration?

@szy345

szy345 commented Aug 24, 2023

That parameter does not work the way you think. You should set it on every Redis node that is a master.
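
For reference, applying the setting to every source master can be sketched as a small loop. This is only an illustration: the addresses in `MASTERS` and the helper name `enable_notifications_cmd` are made up for the example, and the script prints the commands by default instead of running them. If the nodes require a password, `-a` would need to be added to the generated command.

```shell
#!/usr/bin/env bash
# Hypothetical master addresses: replace with the source cluster's masters.
MASTERS="10.0.0.1:6379 10.0.0.2:6379 10.0.0.3:6379"

# Build the redis-cli command for one host:port pair.
enable_notifications_cmd() {
  local host="${1%%:*}" port="${1##*:}"
  echo "redis-cli -h $host -p $port config set notify-keyspace-events KA"
}

# Dry run by default: print each command. Set APPLY=1 to execute them.
for master in $MASTERS; do
  cmd="$(enable_notifications_cmd "$master")"
  if [ "${APPLY:-0}" = "1" ]; then
    $cmd
  else
    echo "$cmd"
  fi
done
```

Note that `CONFIG SET` changes the running server only; the value is lost on restart unless it is also written to `redis.conf` (for example with `CONFIG REWRITE`).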

@iamPrakhar
Author

I set config set notify-keyspace-events KA on all 3 masters, but live replication still fails as soon as I add one new key on the source master node.
Error log:

Listening  ? % [ =                                     ] 1/? (0:00:03 / ?) .3/st not served
Step: [live] executed in 3s661ms
Job: [FlowJob: [name=replicate]] completed with the following parameters: [{}] and the following status: [FAILED] in 3s713ms
Parent JobExecution is stopped, so passing message on to StepExecution
Step interrupted through StepExecution
Encountered interruption executing step live-reader in job live-reader : Job interrupted status detected.
Step: [live-reader] executed in 3s363ms
Parent JobExecution is stopped, so passing message on to StepExecution
Exception while closing step execution resources in step live-reader in job live-reader
Encountered interruption executing job: Job interrupted by step execution
Job: [SimpleJob: [name=live-reader]] completed with the following parameters: [{}] and the following status: [STOPPED] in 3s382ms

@szy345

szy345 commented Aug 24, 2023

Are both the source and destination self-built clusters? Or Alibaba Redis Cloud services? Or something else?

@iamPrakhar
Author

Self-built clusters.

@iamPrakhar
Author

Actually, I previously forgot to add -c to the riot command. Now that I have added it, the initial replication runs fine. But because I am running live replication, RIOT fails with this error as soon as I add a new key:

Listening  ? % [ =                                     ] 1/? (0:00:04 / ?) .2/sartition for slot 866.
Step: [live] executed in 4s940ms
Job: [FlowJob: [name=replicate]] completed with the following parameters: [{}] and the following status: [FAILED] in 4s972ms
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
Connection is already closed
Parent JobExecution is stopped, so passing message on to StepExecution
Step interrupted through StepExecution
Encountered interruption executing step live-reader in job live-reader : Job interrupted status detected.
Step: [live-reader] executed in 4s633ms
Parent JobExecution is stopped, so passing message on to StepExecution
Exception while closing step execution resources in step live-reader in job live-reader
Encountered interruption executing job: Job interrupted by step execution

@szy345

szy345 commented Aug 24, 2023

I tested it and it works for me.
In my case there is no password and the versions are the same. I use Docker.
The command, for example:
docker run -d --name=riot fieldengineering/riot -h 172.31.0.22 -p 6382 --cluster replicate -h 172.31.0.22 -p 6385 --cluster --mode live --progress=log

The logs :

Listening ? % [= ] 0/? (0:00:00 / ?) ?/s
Scanning 0% [ ] 0/5 (0:00:00 / ?) ?/s
Scanning 100% [====================================] 5/5 (0:00:00 / 0:00:00) ?/s
Scanning 100% [====================================] 5/5 (0:00:00 / 0:00:00) ?/s

Listening ? % [ = ] 1/? (0:00:36 / ?) .0/s | 0 queued # This one is the new key

The keys on both sides are the same.

@iamPrakhar
Author

iamPrakhar commented Aug 24, 2023

My command looks like this

 ./latest/riot-3.1.5/bin/riot --debug --info  --timeout=600 -h 10.x.x.x -p 6379 -c  replicate  --timeout=600 -h 10.x.x.x -p 6379  -a '123456' -c  --mode live

And the logs are

Scanning   0% [                                         ] 0/4 (0:00:00 / ?) ?/s
15:14:25.664 INFO org.springframework.batch.core.launch.support.SimpleJobLauncher	: Job: [SimpleJob: [name=live-reader]] launched with the following parameters: [{}]
15:14:25.687 INFO org.springframework.batch.core.launch.support.SimpleJobLauncher	: Job: [SimpleJob: [name=scan-reader]] launched with the following parameters: [{}]
15:14:25.693 INFO org.springframework.batch.core.job.SimpleStepHandler	: Executing step: [live-reader]
15:14:25.700 INFO org.springframework.batch.core.job.SimpleStepHandler	: Executing step: [scan-reader]
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by io.lettuce.core.internal.DefaultMethods$MethodHandleLookup (file:/home/gojek/latest/riot-3.1.5/lib/lettuce-core-6.2.6.RELEASE.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class)
WARNING: Please consider reporting this to the maintainers of io.lettuce.core.internal.DefaultMethods$MethodHandleLookup
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
15:14:26.177 INFO org.springframework.batch.core.step.AbstractStep	: Step: [scan-reader] executed in 459ms
15:14:26.194 INFO org.springframework.batch.core.launch.support.SimpleJobLauncher	: Job: [SimpleJob: [name=scan-reader]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 503ms
15:14:26.334 SEVERE org.springframework.batch.core.step.AbstractStep	: Encountered an error executing step scan in job replicate
org.springframework.retry.ExhaustedRetryException: Retry exhausted after last attempt in recovery path, but exception is not skippable.; nested exception is java.util.concurrent.ExecutionException: io.lettuce.core.cluster.PartitionSelectorException: Cannot determine a partition for slot 11613.
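
The `PartitionSelectorException: Cannot determine a partition for slot ...` message comes from Lettuce and indicates that the client's view of the cluster has no node for that slot. The thread does not establish the cause. One thing that can be ruled out cheaply is incomplete slot coverage on the source or destination cluster. The sketch below is a hypothetical helper (the name `count_covered_slots` is not part of RIOT or redis-cli) that reads `CLUSTER NODES` output on stdin and counts the slots assigned to masters that are not flagged as failing. A fully covered cluster should report 16384.

```shell
#!/usr/bin/env bash
# Count hash slots assigned to healthy masters.
# Input: the text output of `CLUSTER NODES` on stdin.
count_covered_slots() {
  awk '
    # Field 3 holds the node flags, fields 9 and up hold slots or slot ranges.
    $3 ~ /master/ && $3 !~ /fail/ {
      for (i = 9; i <= NF; i++) {
        if ($i ~ /^\[/) continue          # skip slots being migrated or imported
        n = split($i, r, "-")
        covered += (n == 2) ? r[2] - r[1] + 1 : 1
      }
    }
    END { print covered + 0 }
  '
}
```

Usage, assuming redis-cli can reach a node of the cluster in question: `redis-cli -h 10.x.x.x -p 6379 cluster nodes | count_covered_slots` (add `-a` if the cluster requires a password). `redis-cli --cluster check <host>:<port>` performs a more thorough check of the same condition.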
