Failed backup stated as succeeded #910

Open
xoxys opened this issue Nov 19, 2023 · 2 comments
Labels
bug Something isn't working

Comments

xoxys commented Nov 19, 2023

Description

Backups are reported as Succeeded even when the backup command fails. The log below is just an example; the cause of that particular failure has already been fixed. The problem is that a defective backup is reported as successful.

Additional Context

No response

Logs

❯ kubectl get backups.k8up.io -n authelia-public 
NAME                    SCHEDULE REF   COMPLETION   PREBACKUP   AGE
postgres                               Succeeded    Finished    12h
postgres-backup-qbq6q   postgres       Succeeded    Finished    12h
2023-11-18T23:02:31Z	INFO	k8up	Starting k8up…	{"version": "2.7.2", "date": "2023-10-09T10:13:29Z", "commit": "45d99dd90dbb2a080e6832c34e96b371216a3e0b", "go_os": "linux", "go_arch": "amd64", "go_version": "go1.19.13", "uid": 65532, "gid": 0}
2023-11-18T23:02:31Z	INFO	k8up.restic	initializing
2023-11-18T23:02:31Z	INFO	k8up.restic	setting up a signal handler
2023-11-18T23:02:31Z	INFO	k8up.restic.restic	using the following restic options	{"options": [""]}
2023-11-18T23:02:31Z	INFO	k8up.restic.restic.RepoInit.command	restic command	{"path": "/usr/local/bin/restic", "args": ["init", "--option", ""]}
2023-11-18T23:02:31Z	INFO	k8up.restic.restic.RepoInit.command	Defining RESTIC_PROGRESS_FPS	{"frequency": 0.016666666666666666}
2023-11-18T23:02:32Z	INFO	k8up.restic.restic.unlock	unlocking repository	{"all": false}
2023-11-18T23:02:32Z	INFO	k8up.restic.restic.unlock.command	restic command	{"path": "/usr/local/bin/restic", "args": ["unlock", "--option", ""]}
2023-11-18T23:02:32Z	INFO	k8up.restic.restic.unlock.command	Defining RESTIC_PROGRESS_FPS	{"frequency": 0.016666666666666666}
2023-11-18T23:02:36Z	INFO	k8up.restic.restic.snapshots	getting list of snapshots
2023-11-18T23:02:36Z	INFO	k8up.restic.restic.snapshots.command	restic command	{"path": "/usr/local/bin/restic", "args": ["snapshots", "--option", "", "--json"]}
2023-11-18T23:02:36Z	INFO	k8up.restic.restic.snapshots.command	Defining RESTIC_PROGRESS_FPS	{"frequency": 0.016666666666666666}
2023-11-18T23:02:43Z	INFO	k8up.restic.k8sClient	listing all pods	{"annotation": "k8up.io/backupcommand", "namespace": "authelia-public"}
2023-11-18T23:02:43Z	INFO	k8up.restic.k8sClient	adding to backup list	{"namespace": "authelia-public", "pod": "pgdump-77788b7db9-n4tp6"}
2023-11-18T23:02:43Z	INFO	k8up.restic.k8sExec	executing command	{"command": "sh, -c, chmod 600 /var/lib/postgresql/.pgpass && pg_dump --clean", "namespace": "authelia-public", "pod": "pgdump-77788b7db9-n4tp6"}
2023-11-18T23:02:43Z	INFO	k8up.restic.restic.stdinBackup	starting stdin backup	{"filename": "/authelia-public-pgdump", "extension": ".sql"}
2023-11-18T23:02:43Z	INFO	k8up.restic.restic.stdinBackup.command	restic command	{"path": "/usr/local/bin/restic", "args": ["backup", "--option", "", "--stdin-filename", "/authelia-public-pgdump.sql", "--host", "authelia-public", "--json", "--stdin"]}
2023-11-18T23:02:43Z	INFO	k8up.restic.restic.stdinBackup.command	Defining RESTIC_PROGRESS_FPS	{"frequency": 0.016666666666666666}
2023-11-18T23:02:43Z	INFO	k8up.restic.pgdump-77788b7db9-n4tp6.stderr	chmod: changing permissions of '/var/lib/postgresql/.pgpass': Read-only file system
2023-11-18T23:02:43Z	ERROR	k8up.restic.k8sExec	streaming data failed	{"namespace": "authelia-public", "pod": "pgdump-77788b7db9-n4tp6", "error": "command terminated with exit code 1"}
github.com/k8up-io/k8up/v2/restic/kubernetes.PodExec.func1
	/home/runner/work/k8up/k8up/restic/kubernetes/pod_exec.go:74
2023-11-18T23:02:48Z	INFO	k8up.restic.restic.stdinBackup.progress	restic output	{"msg": "{\"message_type\":\"error\",\"error\":{\"Op\":\"read\",\"Path\":\"/authelia-public-pgdump.sql\",\"Err\":{}},\"during\":\"archival\",\"item\":\"/authelia-public-pgdump.sql\"}"}
2023-11-18T23:02:48Z	ERROR	k8up.restic.restic.stdinBackup.progress	/authelia-public-pgdump.sql during archival read	{"error": "error occurred during backup"}
github.com/k8up-io/k8up/v2/restic/logging.(*BackupOutputParser).out
	/home/runner/work/k8up/k8up/restic/logging/logging.go:156
github.com/k8up-io/k8up/v2/restic/logging.writer.Write
	/home/runner/work/k8up/k8up/restic/logging/logging.go:103
io.copyBuffer
	/opt/hostedtoolcache/go/1.19.13/x64/src/io/io.go:429
io.Copy
	/opt/hostedtoolcache/go/1.19.13/x64/src/io/io.go:386
os/exec.(*Cmd).writerDescriptor.func1
	/opt/hostedtoolcache/go/1.19.13/x64/src/os/exec/exec.go:407
os/exec.(*Cmd).Start.func1
	/opt/hostedtoolcache/go/1.19.13/x64/src/os/exec/exec.go:544
2023-11-18T23:02:48Z	INFO	k8up.restic.restic.stdinBackup.progress	backup finished	{"new files": 0, "changed files": 0, "errors": 1}
2023-11-18T23:02:48Z	INFO	k8up.restic.restic.stdinBackup.progress	stats	{"time": 2.627027521, "bytes added": 0, "bytes processed": 0}
2023-11-18T23:02:48Z	INFO	k8up.restic.restic.MountCollector	stats mount dir doesn't exist, skipping stats	{"dir": "/data"}
2023-11-18T23:02:49Z	INFO	k8up.restic.restic.stdinBackup.progress	restic output	{"msg": "Warning: at least one source file could not be read"}
2023-11-18T23:02:49Z	INFO	k8up.restic	backups of annotated jobs have finished successfully
2023-11-18T23:02:49Z	INFO	k8up.restic.restic.backup	starting backup
2023-11-18T23:02:49Z	INFO	k8up.restic.restic.backup	backupdir does not exist, skipping. Sending snapshot list	{"dirname": "/data"}
2023-11-18T23:02:49Z	INFO	k8up.restic.restic.snapshots	getting list of snapshots
2023-11-18T23:02:49Z	INFO	k8up.restic.restic.snapshots.command	restic command	{"path": "/usr/local/bin/restic", "args": ["snapshots", "--option", "", "--json"]}
2023-11-18T23:02:49Z	INFO	k8up.restic.restic.snapshots.command	Defining RESTIC_PROGRESS_FPS	{"frequency": 0.016666666666666666}


Expected Behavior

Failed backups should be stated as failed instead.

Steps To Reproduce

No response

Version of K8up

v2.7.2

Version of Kubernetes

v1.27.7+k3s1

Distribution of Kubernetes

K3s

xoxys added the bug label Nov 19, 2023

poyaz (Contributor) commented Dec 11, 2023

Hi

I looked into this and, after testing the situation, found that the problem is caused by how the restic command behaves.

When the backupcommand annotation is executed, the output of the command is piped into the stdin of the restic command, and restic stores the streamed data in the snapshot.
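To illustrate why a pipe like this can hide a failure, here is a minimal Python sketch. It is a generic demonstration, not k8up or restic code: both processes are stand-ins started with the Python interpreter, and the function name `run_pipeline` is made up for this example. The producer fails before writing anything, while the consumer happily reads an empty stream and exits cleanly.

```python
import subprocess
import sys


def run_pipeline():
    """Pipe a failing producer into a consumer and report what each side saw."""
    # Stand-in for the annotated backup command: exits 1 without any output.
    producer_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
    # Stand-in for a tool that archives stdin: prints how many bytes it read.
    consumer_cmd = [
        sys.executable,
        "-c",
        "import sys; print(len(sys.stdin.buffer.read()))",
    ]

    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close our copy of the pipe so the consumer sees EOF once the producer exits.
    producer.stdout.close()
    consumer_out, _ = consumer.communicate()
    producer_rc = producer.wait()

    bytes_read = int(consumer_out.decode().strip())
    return producer_rc, consumer.returncode, bytes_read


producer_rc, consumer_rc, bytes_read = run_pipeline()
print(f"producer exit={producer_rc}, consumer exit={consumer_rc}, bytes={bytes_read}")
```

The consumer's exit code alone says nothing about the producer, so whoever orchestrates the pipe has to check both.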

Unfortunately, this can't be fixed directly because of restic. However, k8up produces a backup summary that can be used to detect the status of the backup. You can use the webhook or Prometheus to get that status.

I also have a proposal for fixing this problem:
We could add an annotation that handles errors in the backup command and deletes the snapshot when the backup command fails. This would be backward compatible.
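A rough Python sketch of that idea, under stated assumptions: `FakeRepo`, `backup_from_command`, and the `fail_on_command_error` flag are hypothetical names invented for illustration and do not exist in k8up or restic. The flag plays the role of the proposed opt-in annotation, so leaving it off keeps today's behavior.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRepo:
    """Hypothetical in-memory stand-in for a backup repository."""

    snapshots: list = field(default_factory=list)

    def save(self, name, data):
        # Store the streamed data and return the new snapshot's index.
        self.snapshots.append((name, data))
        return len(self.snapshots) - 1

    def forget(self, snapshot_id):
        # Drop a snapshot again, e.g. because its source command failed.
        self.snapshots.pop(snapshot_id)


def backup_from_command(repo, name, run_command, fail_on_command_error=False):
    """Store the command output; optionally fail when the command failed.

    run_command() must return (exit_code, output_bytes).
    """
    exit_code, output = run_command()
    snapshot_id = repo.save(name, output)
    if exit_code != 0 and fail_on_command_error:
        # Opt-in behavior: remove the useless snapshot and report failure.
        repo.forget(snapshot_id)
        return "Failed"
    # Default behavior (unchanged): the snapshot is kept and reported as fine.
    return "Succeeded"


def failing_command():
    # Simulates a backup command that exits 1 without producing data.
    return 1, b""


repo = FakeRepo()
print(backup_from_command(repo, "pgdump.sql", failing_command, fail_on_command_error=True))
print(len(repo.snapshots))
```

Because the check is opt-in, existing setups that rely on the current behavior would not change.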

@xoxys @Kidswiss

roobre commented May 13, 2024

Just wanted to +1 this. I ran into this problem with two different workloads, for different reasons (xz not being available, and a wrong env syntax for postgres).

If I hadn't checked manually with restic, I wouldn't have noticed this! I think having failed backups marked as such would be a great UX improvement.
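Until failed backups are marked as such, one possible stopgap is to scan the backup pod's log instead of checking manually. The Python sketch below assumes the log format shown in the original report (tab-separated timestamp, level, logger, message, and an optional JSON payload); the function name `find_backup_problems` is made up for this example, and the format may differ between k8up versions.

```python
import json


def find_backup_problems(log_text):
    """Return messages hinting that a backup did not really succeed."""
    problems = []
    for line in log_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # stack-trace lines and blank lines have fewer fields
        level, message = fields[1], fields[3]
        if level == "ERROR":
            problems.append(message)
        elif message == "backup finished" and len(fields) > 4:
            try:
                summary = json.loads(fields[4])
            except json.JSONDecodeError:
                continue  # payload is not valid JSON, nothing to inspect
            if summary.get("errors", 0) > 0:
                problems.append(f"backup finished with {summary['errors']} error(s)")
    return problems


# A few lines taken from the log in the original report.
SAMPLE_LOG = "\n".join([
    '2023-11-18T23:02:43Z\tERROR\tk8up.restic.k8sExec\tstreaming data failed\t'
    '{"namespace": "authelia-public", "pod": "pgdump-77788b7db9-n4tp6", '
    '"error": "command terminated with exit code 1"}',
    "github.com/k8up-io/k8up/v2/restic/kubernetes.PodExec.func1",
    '2023-11-18T23:02:48Z\tINFO\tk8up.restic.restic.stdinBackup.progress\t'
    'backup finished\t{"new files": 0, "changed files": 0, "errors": 1}',
    "2023-11-18T23:02:49Z\tINFO\tk8up.restic\t"
    "backups of annotated jobs have finished successfully",
])

for problem in find_backup_problems(SAMPLE_LOG):
    print(problem)
```

The log text could come from something like `kubectl logs` on the backup pod; a non-empty result means the "Succeeded" status deserves a closer look.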
