MirrorMaker migration documentation #12

Open
OneCricketeer opened this issue Oct 8, 2018 · 7 comments
@OneCricketeer

OneCricketeer commented Oct 8, 2018

Regarding the Medium post

Mirus completely replaced Mirror Maker across all production data-centers at Salesforce in April 2018. Since then our data volumes have continued to grow.

For those who are running Mirror Maker with active consumer group offsets and would prefer not to get duplicates after starting Mirus, is there migration documentation available, or a run-book that Salesforce followed for the replacement?

@pdavidson100
Contributor

No documentation available yet, but I will put something together based on our experience at Salesforce.

@mtrienis

+1 :-)

@pdavidson100
Contributor

@mtrienis Still on my todo list. The short version is that we shut down Mirror Maker, grabbed the Mirror Maker offsets using kafka-consumer-groups.sh, then used bin/mirus-offset-tool.sh with the --reset-offsets and --from-file flags to initialize the Mirus connector offsets. When Mirus started, it was able to pick up where Mirror Maker left off with no duplicates.

For the first few clusters we actually left Mirror Maker running in parallel for a few minutes, and accepted the duplicates, just to guarantee everything was running as expected. We still used mirus-offset-tool.sh to initialize our offsets to avoid a flood of duplicates.
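The conversion step between the two tools could be sketched roughly as below. This is only an illustration, not the procedure Salesforce used: the function name `mm_offsets_to_mirus` is hypothetical, it assumes the `--describe` output of kafka-consumer-groups.sh has a header row containing TOPIC, PARTITION and CURRENT-OFFSET columns (the exact layout differs between Kafka versions), and it assumes the committed offset can be used unchanged as the Mirus offset, which is worth checking on a test topic first.

```shell
# Hypothetical helper: read `kafka-consumer-groups.sh --describe` output on stdin
# and write one Mirus offset JSON object per partition to stdout.
# Usage: mm_offsets_to_mirus <connector-id>
mm_offsets_to_mirus() {
  awk -v cid="$1" '
    # Until the header row is found, look for the columns by name,
    # because their positions vary between Kafka versions.
    !have_header {
      t = p = o = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "TOPIC") t = i
        else if ($i == "PARTITION") p = i
        else if ($i == "CURRENT-OFFSET") o = i
      }
      if (t && p && o) have_header = 1
      next
    }
    # Data rows: keep only partitions with a numeric committed offset.
    # Partitions without one are printed as "-" and are skipped here.
    $p ~ /^[0-9]+$/ && $o ~ /^[0-9]+$/ {
      printf "{\"connectorId\":\"%s\",\"topic\":\"%s\",\"partition\":%s,\"offset\":%s}\n", cid, $t, $p, $o
    }
  '
}
```

With Mirror Maker stopped, the output of something like `bin/kafka-consumer-groups.sh --bootstrap-server <broker> --describe --group <mirror-maker-group>` could be piped through `mm_offsets_to_mirus <connector-id>` into a file, and that file passed to bin/mirus-offset-tool.sh with the --reset-offsets and --from-file flags. The broker address, group name and connector ID are placeholders.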

@pdavidson100 pdavidson100 self-assigned this Nov 26, 2018
@OneCricketeer
Author

Idea:
Could MirusOffsetTool be extended to capture the offset listing functionality of ConsumerGroupCommand so that two scripts wouldn't be needed?

@Hari4AMQ

Hari4AMQ commented Jun 10, 2019

@pdavidson100 @Cricket007 Can you please share a sample file, or the format of the file that we supply to MirusOffsetTool with the --from-file flag for resetting offsets?
When I try to reset offsets, it fails with a "not a valid Long value" exception.

@pdavidson100
Contributor

@Hari4AMQ The --from-file format is identical to the output format generated by --describe, and supports both CSV and JSON (recommended for setting offsets). For example, if you're setting the offsets of a 4-partition topic to 100, the file might look like this:

{"connectorId":"connector-id","topic":"topic-name","partition":0,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":1,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":2,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":3,"offset":100}
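A file like this can also be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: `gen_mirus_offsets` is a hypothetical helper, not part of Mirus, and it simply prints one JSON line per partition in the format shown above. It rejects non-numeric values up front, on the assumption that a non-integer offset in the file is one way to end up with a "not a valid Long value" error.

```shell
# Hypothetical helper: print one offset line per partition, all set to the same offset.
# Usage: gen_mirus_offsets <connector-id> <topic> <partition-count> <offset>
# The body runs in a subshell, so its variables do not leak into the caller.
gen_mirus_offsets() (
  connector_id="$1"; topic="$2"; partitions="$3"; offset="$4"
  # Both numeric arguments must be plain non-negative integers.
  for value in "$partitions" "$offset"; do
    case "$value" in
      ''|*[!0-9]*) echo "not a non-negative integer: '$value'" >&2; exit 1 ;;
    esac
  done
  p=0
  while [ "$p" -lt "$partitions" ]; do
    printf '{"connectorId":"%s","topic":"%s","partition":%s,"offset":%s}\n' \
      "$connector_id" "$topic" "$p" "$offset"
    p=$((p + 1))
  done
)
```

For instance, `gen_mirus_offsets connector-id topic-name 4 100 > offsets.json` would reproduce the four lines above.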

@dalassi1
Contributor

dalassi1 commented Jun 10, 2019

As @pdavidson100 mentioned, you should run the --describe option first and then edit the output file, setting the desired offset for each partition you want to change. This is the command I would use to get the offsets for topic t1:

bin/mirus-offset-tool.sh --properties-file config/<worker.properties> --describe --format json | grep "\"topic\":\"t1\"" > t1-offsets.json

then edit the file t1-offsets.json with the desired offsets.
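If every partition in the file should get the same offset, the edit can be scripted instead of done by hand. This is only a sketch: `set_all_offsets` is a hypothetical helper, and it assumes the one-object-per-line JSON layout shown earlier in this thread, with a plain integer after `"offset":`.

```shell
# Hypothetical helper: rewrite the "offset" value on every line read from stdin.
# Usage: set_all_offsets <new-offset> < t1-offsets.json > t1-offsets-new.json
set_all_offsets() (
  # Refuse anything that is not a plain non-negative integer.
  case "$1" in
    ''|*[!0-9]*) echo "offset must be a non-negative integer" >&2; exit 1 ;;
  esac
  # Replace the digits after "offset": and leave every other field untouched.
  sed -E "s/\"offset\":[[:space:]]*[0-9]+/\"offset\":$1/"
)
```

The rewritten file can then be supplied to bin/mirus-offset-tool.sh with the --reset-offsets and --from-file flags mentioned above.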
