Restic 0.10.0 causes read errors on mounted samba share #2968

Closed
Mikescher opened this issue Oct 2, 2020 · 9 comments

Comments

@Mikescher

Summary

After upgrading to restic 0.10.0 I get read errors when backing up a mounted samba share.
After downgrading to 0.9.6 the errors are once again gone.

Output of restic version

restic 0.10.0 compiled with go1.15.2 on linux/amd64   (before)
restic 0.9.6 compiled with go1.13.4 on linux/amd64    (after downgrade)

How did you run restic exactly?

I have a local restic repository and a mounted samba/cifs share (from my NAS) that I want to backup.
I simply run restic -r {local_repo} backup /mnt/nas_hektor --cleanup-cache --exclude-file {exclusions}.txt

Output:

repository ac08473e opened successfully, password is correct
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: lstat /mnt/nas_hektor/{...}: interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call

Files:         834 new,     0 changed, 17209 unmodified
Dirs:           46 new,  1191 changed,     0 unmodified
Added to the repo: 251.044 GiB

processed 18043 files, 8.624 TiB in 3:40:12
snapshot db1a5445 saved
Warning: failed to read all source data during backup

The /var/log/syslog is sprinkled with many of the following (seemingly relevant) errors:

Oct  2 01:42:04 omv-{name} kernel: [21401.666794] CIFS VFS: Send error in read = -4
Oct  2 01:05:45 omv-{name} kernel: [19223.167367] CIFS VFS: \\{nas}.fritz.box\Hektor Close interrupted close

What backend/server/service did you use to store the repository?

The repository is simply stored on the local filesystem, only the actual to-be-backed-up data is mounted via cifs

Expected behavior

Well, no errors..

Actual behavior

restic displays errors while running, ends with an error message and actually skips these files in the snapshot.
Calling restic again backs the files up without problems (I can also stat/list the files/directories in my terminal without problems).
It seems to just fail at random and after 3 invocations of restic backup I had all files backed up (after that I downgraded for the other backups I wanted to do)

Steps to reproduce the behavior

Not really sure if it's reproducible for everyone who backs up (a lot of) data from a cifs mountpoint.
I tried to eliminate other problem sources (see below) but I don't know how much it depends on my setup/network/whatever

Do you have any idea what may have caused this?

No idea.
I can say that I tried a lot of other things before downgrading restic because it really sounds like a samba problem.
I did a full software update on my machine, restarted my synology nas, multiple times restarted my machine, tried tweaking samba mount options etc.
But because a simple restic downgrade eliminated all problems it seems like restic at least started triggering some cifs bug since 0.10.0

Btw both machines are in the same local network, connected by a Gbit switch and stand only a meter from each other, so the network should be very stable.

Do you have an idea how to solve the issue?

Nope, sorry.

Did restic help you today? Did it make you happy in any way?

Well not happy but I'm still convinced that you simply managed to trigger some stupid bug in mount.cifs and because I know how annoying such stuff is I feel kinda sorry to report it 😨

@greatroar
Contributor

greatroar commented Oct 2, 2020

Oct  2 01:42:04 omv-{name} kernel: [21401.666794] CIFS VFS: Send error in read = -4

That's an EINTR.

This might be the new async preemption in the Go 1.14 runtime. The best way to tell would be to compile restic 0.10.0 with Go 1.13.4 and try again. But before you go installing compilers, maybe try setting GOMAXPROCS=1 GODEBUG=asyncpreemptoff=1:

$ export GODEBUG=asyncpreemptoff=1
$ restic backup /mnt/nas_hektor --cleanup-cache --exclude-file exclusions.txt

@greatroar
Contributor

greatroar commented Oct 2, 2020

Related: golang/go#39237, golang/go#40870 (especially this comment).

I think we could put an EINTR loop around xattr, or maybe get the maintainers of pkg/xattr to do that. As for the Readdirnames failures, the Go people have decided that's a kernel bug that they're not going to work around, unless someone shows that it isn't.

@MichaelEischer
Member

This looks like #2659 to me. @Mikescher could you merge your bug report into that issue?

@rbolog

rbolog commented Oct 2, 2020

Hi,
I can confirm that. I was unable to make a 400GB backup on a Samba share.
It's reproducible (I tried three times).
When I do a check it finds errors, and if I do a rebuild-index after the check it generates pages of errors. I thought it was my Samba configuration or the destination disk, so I copied large amounts of data without any problems. And now I use the rest-server and it works. So there is a problem around Samba. Note: I tried with and without export GODEBUG=asyncpreemptoff=1; without it, it crashes after a second. I also don't see any errors in journalctl on the server or the client.

go version go1.15.2 linux/amd64
restic 0.10.0 compiled with go1.15.2 on linux/amd64

@Mikescher
Author

Hmm yeah, I guess that pretty much sounds like my problem.

You can close this issue in favor of #2659

@MichaelEischer
Member

@Mikescher Please add your detailed bug report to #2659, it's pretty useful to have a problem report with the latest restic version there.

@MichaelEischer
Member

I've copied most comments over to #2659

greatroar pushed a commit to greatroar/restic that referenced this issue Nov 8, 2020
Updates restic#2659. This is one of the cases where the stdlib will not handle
EINTR for us, even with Go 1.16. That xattr calls are directly affected
can be seen in the report for issue restic#2968.
greatroar pushed a commit to greatroar/restic that referenced this issue Nov 8, 2020
Updates restic#2659. This is a case where the stdlib will not handle EINTR for
us, even with Go 1.16. That xattr calls are directly affected can be
seen in the report for issue restic#2968.
@imkyaky

imkyaky commented Nov 21, 2021

Can confirm I still have this problem with version: restic 0.12.0 compiled with go1.15.5 on linux/amd64

@rbolog

rbolog commented Nov 21, 2021

Hi,
I did a successful test using:
restic 0.12.1 compiled with go1.17.3 on linux/amd64

Result:

Files:       1618061 new,     0 changed,     0 unmodified
Dirs:        195675 new,     0 changed,     0 unmodified
Added to the repo: 20.788 GiB

processed 1618061 files, 74.381 GiB in 43:23
snapshot afce1bee saved

mfrischknecht pushed a commit to mfrischknecht/restic that referenced this issue Jun 14, 2022
Updates restic#2659. This is a case where the stdlib will not handle EINTR for
us, even with Go 1.16. That xattr calls are directly affected can be
seen in the report for issue restic#2968.