
Return code question/request #8061

Open
zaphod80013 opened this issue Jan 28, 2024 · 1 comment


@zaphod80013

This may already exist but I've not found anything in the quick searches I've done.

I've just set up a Borgbase account for offsite backups. I've looked at Borgmatic and Vorta, but neither fits the way I currently create backups. I'm using bash scripts under cron that create archives named in the form YYYYMMDD-HHMMSS within the repository. For local backups to NAS this works well. My home directory is currently just over 1 TB (most of which is media files). My scripts are failing on dropped connections (most, but not all, are remote closed connections). Running locally, I simply abort the backup, wait for the next cron run (which creates a new archive, abandoning any checkpoint), and rely on prune/compact to clean up the failed backups. For remote backups, however, this isn't working, as I never manage to complete an initial backup.

I've been manually running the initial backup of an LVM snapshot since 17:30 MST on Jan 12th, 2024 (15 days as of the time of posting), and it is still only about 70% complete.

Is there, or could you add, a return code that is unique to network failures? When the connection drops I get a Python exception, but the bash return code of 2 seems to be a generic 'create failed' error. As I already have code in place to detect a backup in progress on a given repo, being able to explicitly detect a network failure in a while loop and resume the current backup would be a huge benefit.
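The resume-in-a-loop idea could be sketched roughly as below. Here run_backup is a stand-in for the real borg create call (this mock fails twice and then succeeds, so the loop logic can be exercised); the retry limit, sleep time, and temp-file path are all hypothetical. Note that borg 1.x exits 0 (success), 1 (warning), or 2 (error), with no network-specific code yet, which is what this issue requests.

```shell
#!/usr/bin/env bash
# Bounded retry loop around a backup command.

MAX_RETRIES=5
attempt=0
count_file=/tmp/borg_retry_demo_count

run_backup() {
    # Stand-in: replace with something like
    #   borg create "$REPO::$(date +%Y%m%d-%H%M%S)" "$SRC"
    # This mock fails on its first two calls, then succeeds.
    n=$(cat "$count_file" 2>/dev/null || echo 0)
    echo $((n + 1)) > "$count_file"
    [ "$n" -ge 2 ]
}

rm -f "$count_file"
until run_backup; do
    rc=$?
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$MAX_RETRIES" ]; then
        echo "giving up after $attempt attempts (rc=$rc)" >&2
        exit "$rc"
    fi
    echo "attempt $attempt failed (rc=$rc); retrying" >&2
    sleep 1   # a real script would back off longer, e.g. 60s
done
echo "backup succeeded after $attempt failed attempts"
```

Because plain borg 1.x returns 2 for many kinds of failure, this loop has to retry on any error; a network-specific return code would let it retry only on connection loss and fail fast on everything else.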

Over the course of this manual backup my snapshot has grown to 600 GB, requiring me to back up and delete another LVM partition to create space for the snapshot to grow so I can complete the backup. This approach is untenable long term.

I tried generating the archive name in Borgmatic, but there was no way (that I could find) to pass it into the pre/post scripts that create/delete the LVM snapshot. I also have a btrfs-over-LVM partition, which requires that I create the snapshot at the btrfs rather than the LVM level. My existing scripts handle everything well except the remote connection stability, so I'm more inclined to address the dropped connections in bash rather than replace about 750 lines of bash code.

Alternate approach ideas are welcome, but as I said, I'd prefer a solution that will work with my existing bash scripts.

@ThomasWaldmann
Member

ThomasWaldmann commented Jan 28, 2024

borg has placeholders, in case you want to precisely reproduce these archive names: see borg help placeholders.

BTW, the {now} placeholder has a quite similar, but more ISO 8601-like, datetime format.
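Side by side, the scripted name and the placeholder look like this (REPO and SRC are placeholders for the actual repo URL and source path):

```shell
# The archive name the issue's scripts build in bash:
archive_name=$(date +%Y%m%d-%H%M%S)   # e.g. 20240128-173000
echo "$archive_name"

# Equivalent borg invocations:
#   borg create "$REPO::$archive_name" "$SRC"   # name built in bash
#   borg create "$REPO::{now}" "$SRC"           # borg substitutes an
#                                               # ISO 8601-like timestamp
```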

Dropped connections are sometimes caused by bad networking hardware, e.g. routers. There's not much borg can do about that, but you can use a shorter --checkpoint-interval. As long as the number of chunks in the repo keeps increasing, you will converge toward backup completion, even if you need to restart the backup often.
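As an illustration (REPO and SRC are placeholders; in borg 1.x the default checkpoint interval is 1800 seconds):

```shell
# Checkpoint every 5 minutes instead of the default 30, so a dropped
# connection costs at most roughly 5 minutes of re-upload:
borg create --checkpoint-interval 300 "$REPO::$(date +%Y%m%d-%H%M%S)" "$SRC"
```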

borg 1.4 (currently in beta) will add more specific return codes. Do not use it for production yet, but check if it solves your scripting problem.

600 GB snapshot size is a lot, do you really have that much changed data?

Instead of making more space for the snapshot, you could also delete the snapshot, make a fresh one, and start a new backup, hoping that there is significant overlap and deduplication still covers most of the data.
