
Taking dkron backup for large setups #1320

Open
nikunj-badjatya opened this issue Apr 19, 2023 · 4 comments
@nikunj-badjatya

Is your feature request related to a problem? Please describe.
https://gist.github.com/pjz/94f4bd81a0897fd64db44593078e2156
Shows how to take a backup.
Our dkron setup has >100K schedules in it.
When we execute a curl like this, the request takes minutes to complete, and the output file is often tens, sometimes hundreds, of MB.

Describe the solution you'd like
What are other ways to take a backup efficiently? Please advise.

Describe alternatives you've considered
Disk snapshot

Additional context
None.

@nikunj-badjatya
Author

nikunj-badjatya commented Apr 19, 2023

cc: @vcastellm , @yvanoers

@nikunj-badjatya
Author

Any suggestions on this, anyone?

@vcastellm
Member

Hey @nikunj-badjatya, that's a high-volume use case. Your script looks good:

10s sometimes 100s of MBs

What does "s" mean here? Are you talking about time or space?

Taking a disk snapshot can be a good alternative in this case. Currently there is no other way of taking a backup from Dkron.

  • Have you thought about splitting the jobs across several clusters?
  • What's the source of truth for the jobs? How are they being created?

@nikunj-badjatya
Copy link
Author

What does "s" mean here? Are you talking about time or space?

Space. Hundreds of MB.

Taking a disk snapshot can be a good alternative in this case. Currently there is no other way of taking a backup from Dkron.

Okay.

  • Have you thought about splitting the jobs across several clusters?

We are running a single-pod StatefulSet, backed by a PVC, deployed in a Kubernetes cluster. There are some 150K schedules in it.
We haven't thought about splitting into several clusters as of now.

  • What's the source of truth for the jobs? How are they being created?

Jobs are created via API. Source of truth is data stored in MongoDB.
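Since MongoDB holds the source of truth and jobs are created via the API, a restore can amount to replaying the job definitions against `POST /v1/jobs`, which creates or updates a job by name. A minimal sketch of that replay, assuming a Dkron agent on the default port; `DKRON_URL`, `job_request`, and `restore` are illustrative names, and the endpoint behavior should be verified against the Dkron version in use:

```python
import json
import urllib.request

# Assumption: Dkron's default HTTP address; adjust for your deployment.
DKRON_URL = "http://localhost:8080"

def job_request(job, base_url=DKRON_URL):
    """Build the POST /v1/jobs request that (re)creates one job.

    Dkron creates or updates by job name, so replaying every job
    definition from the source of truth doubles as a restore.
    """
    data = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/jobs",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def restore(jobs, base_url=DKRON_URL):
    """Replay an iterable of job dicts against a running Dkron agent."""
    for job in jobs:
        with urllib.request.urlopen(job_request(job, base_url)) as resp:
            resp.read()
```

With 150K schedules, running the replay with a pool of workers would shorten the restore considerably, since each request is small and independent.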
