restic fails on repo mounted via CIFS/samba on Linux using go 1.14 build #2659

Open
MichaelEischer opened this issue Mar 21, 2020 · 39 comments

@MichaelEischer
Member

MichaelEischer commented Mar 21, 2020

This issue is a summary of https://forum.restic.net/t/prune-fails-on-cifs-repo-using-go-1-14-build/2579, intended as a reference for the underlying problem.

tl;dr Workaround: Setting the environment variable GODEBUG to asyncpreemptoff=1 restores the pre-Go 1.14 behavior and fixes the problem.

Output of restic version

restic 0.9.6 (v0.9.6-137-gc542a509) compiled with go1.14 on linux/amd64
Linux Kernel version: 5.5.9

How did you run restic exactly?

restic prune -r /path/to/repository/on/CIFS/share

Relevant log excerpt:

Load(<data/216ea5f2d2>, 591, 4564169) returned error, retrying after 720.254544ms: open /mnt/nas/redacted/reponame/data/21/216ea5f2d21b458a7b913609ddef2a6ac4788b4bad5481b2916558d2ce1bef04: interrupted system call

Prune failed in the end

Further relevant log excerpts:

Load(<data/2e9db0642e>, 591, 4758136) returned error, retrying after 552.330144ms: read /mnt/nas/redacted/reponame/data/2e/2e9db0642e0fb67b959aa1d91c0d70daa8331ad246c5eeb8582ba2a14f24680f: interrupted system call
List(data) returned error, retrying after 282.818509ms: lstat /mnt/nas/redacted/reponame/data/64: interrupted system call
List(data) returned error, retrying after 492.389441ms: readdirent: interrupted system call
Save(<data/f0f5102554>) returned error, retrying after 552.330144ms: chmod /mnt/nas/redacted/reponame/data/f0/f0f51025542c0287943ef3816e642586be46ae10dc9efbcfa7b305d9e093dbd4: interrupted system call

What backend/server/service did you use to store the repository?

Local backend stored on a CIFS share

Expected behavior

No warnings, prune should complete.

Actual behavior

Prune failed.

Steps to reproduce the behavior

Build restic using Go 1.14 and store the backup repository on a CIFS share.

Do you have any idea what may have caused this?

This issue is a side effect of asynchronous preemption in Go 1.14. The release notes (https://golang.org/doc/go1.14#runtime) state the following:

This means that programs that use packages like syscall or golang.org/x/sys/unix will see more slow system calls fail with EINTR errors. Those programs will have to handle those errors in some way, most likely looping to try the system call again.

Go configures signal handlers to restart syscalls if possible, and the standard library also retries syscalls when necessary. That is, there should only be issues when directly calling low-level syscalls, and in that case one should simply handle the retries properly. However, restic only uses Go standard library functions, which should already handle EINTR where necessary.

The first prune error message points to an os.Open call (via fs.Open) in the Load function of the local backend. So it looks like a Go standard library call fails. However, the manpage for signal (man 7 signal) states that the underlying open syscall is always restarted when SA_RESTART is set, as Go does. So this seems to be a bug in the Linux kernel. Adding a loop around the call to fs.Open to repeat it as long as EINTR is returned fixes that one call, but fixing all problematic calls would mean adding lots of ugly loops and playing whack-a-mole.
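For illustration, here is a minimal sketch of such a retry loop (openRetryEINTR and the package name are made up for this write-up, not restic's actual code):

package fsutil // hypothetical package name for this sketch

import (
	"errors"
	"os"
	"syscall"
)

// openRetryEINTR repeats os.Open as long as the kernel reports EINTR.
// This is the kind of loop described above, applied to a single call site.
func openRetryEINTR(name string) (*os.File, error) {
	for {
		f, err := os.Open(name)
		if err != nil && errors.Is(err, syscall.EINTR) {
			continue // interrupted system call, just try again
		}
		return f, err
	}
}

Every other affected call site (lstat, chmod, readdirent, ...) would need the same treatment, which is why this approach does not scale well.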

The manpages of lstat, readdir and chmod don't even list EINTR as a possible errno.

Do you have an idea how to solve the issue?

Setting the environment variable GODEBUG to asyncpreemptoff=1 restores the pre-Go 1.14 behavior and fixes the problem.

Go relies on the assumption that the kernel properly restarts syscalls when told to do so. As that is obviously not the case here, the proper fix would be to submit a bug report to the Linux kernel.

A short-term solution would be to add a note to the restic documentation that mentions the compatibility problem with CIFS mounts.

MichaelEischer added a commit to MichaelEischer/restic that referenced this issue Mar 26, 2020
On Linux CIFS (SMB) seems to be incompatible with the async preemption
implementation of Go 1.14. CIFS seems not to restart syscalls (open,
read, chmod, readdir, ...) as expected by Go, which sets SA_RESTART for
its signal handler to have syscalls restarted automatically. This leads
to Go passing up lots of EINTR return codes to restic.

See restic#2659 for a detailed explanation.
seqizz pushed a commit to seqizz/restic that referenced this issue May 28, 2020
On Linux CIFS (SMB) seems to be incompatible with the async preemption
implementation of Go 1.14. CIFS seems not to restart syscalls (open,
read, chmod, readdir, ...) as expected by Go, which sets SA_RESTART for
its signal handler to have syscalls restarted automatically. This leads
to Go passing up lots of EINTR return codes to restic.

See restic#2659 for a detailed explanation.
@rbolog

rbolog commented Oct 4, 2020

Hi,
I can confirm this; I was unable to make a 400 GB backup on a Samba share.
It's reproducible (I tried three times).
When I run a check it finds errors; if I run rebuild-index after the check, it produces pages of errors and the repository ends up broken. I thought it was my Samba configuration or the destination disk, so I copied large amounts of data without any problems. Now I use rest-server and it works, so there is a problem around Samba. Note that I tried with and without export GODEBUG=asyncpreemptoff=1; without it, it crashes after a second. I also don't see any errors in journalctl on either the server or the client.

In that configuration, the data and restic are on a local disk and the restic destination repository is on a Samba share; something like "Samba write".

go version go1.15.2 linux/amd64
restic 0.10.0 compiled with go1.15.2 on linux/amd64


I did another backup from a music server (Bluesound) where the data are remote and restic and the repo are local; something like "Samba read". Without GODEBUG=asyncpreemptoff=1 I got a few warnings, but the repo check was OK. With GODEBUG=asyncpreemptoff=1 I got no problems.

restic 0.10.0 compiled with go1.15.2 on linux/arm64
go version go1.15.2 linux/arm64

@MichaelEischer
Member Author

Reported by @Mikescher in #2968

Summary

After upgrading to restic 0.10.0 I get read errors when backing up a mounted samba share.
After downgrading to 0.9.6 the errors are once again gone.

Output of restic version

restic 0.10.0 compiled with go1.15.2 on linux/amd64   (before)
restic 0.9.6 compiled with go1.13.4 on linux/amd64    (after downgrade)

How did you run restic exactly?

I have a local restic repository and a mounted samba/cifs share (from my NAS) that I want to back up.
I simply run restic -r {local_repo} backup /mnt/nas_hektor --cleanup-cache --exclude-file {exclusions}.txt

Output:

repository ac08473e opened successfully, password is correct
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: lstat /mnt/nas_hektor/{...}: interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: Readdirnames /mnt/nas_hektor/{...} failed: readdirent: no such file or directory
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call
error: NodeFromFileInfo: Listxattr: xattr.list /mnt/nas_hektor/{...} : interrupted system call

Files:         834 new,     0 changed, 17209 unmodified
Dirs:           46 new,  1191 changed,     0 unmodified
Added to the repo: 251.044 GiB

processed 18043 files, 8.624 TiB in 3:40:12
snapshot db1a5445 saved
Warning: failed to read all source data during backup

The /var/log/syslog is sprinkled many of the following (seemingly relevant) errors:

Oct  2 01:42:04 omv-{name} kernel: [21401.666794] CIFS VFS: Send error in read = -4
Oct  2 01:05:45 omv-{name} kernel: [19223.167367] CIFS VFS: \\{nas}.fritz.box\Hektor Close interrupted close

What backend/server/service did you use to store the repository?

The repository is simply stored on the local filesystem; only the actual to-be-backed-up data is mounted via CIFS.

Expected behavior

Well, no errors.

Actual behavior

restic displays errors while running, ends with an error message, and actually skips these files in the snapshot.
Calling restic again backs the files up without problem (I can also stat/list the files/directories in my terminal without problems).
It seems to just fail at random, and after 3 invocations of restic backup I had all files backed up (after that I downgraded for the other backups I wanted to do).

Steps to reproduce the behavior

Not really sure if it's reproducible for everyone who backs up (a lot of) data from a CIFS mountpoint.
I tried to eliminate other problem sources (see below), but I don't know how much it depends on my setup/network/whatever.

Do you have any idea what may have caused this?

No idea.
I can say that I tried a lot of other things before downgrading restic, because it really sounds like a Samba problem.
I did a full software update on my machine, restarted my Synology NAS, restarted my machine multiple times, tried tweaking Samba mount options, etc.
But because a simple restic downgrade eliminated all the problems, it seems that restic at least started triggering some CIFS bug as of 0.10.0.

Btw both machines are in the same local network, connected by a Gbit switch, and stand only a meter from each other, so the network should be very stable.

Do you have an idea how to solve the issue?

Nope, sorry.

Did restic help you today? Did it make you happy in any way?

Well, not happy, but I'm still convinced that you simply managed to trigger some stupid bug in mount.cifs, and because I know how annoying such stuff is I feel kinda sorry to report it 😨

@MichaelEischer
Member Author

Analysis by @greatroar in #2968:

Oct  2 01:42:04 omv-{name} kernel: [21401.666794] CIFS VFS: Send error in read = -4

That's an EINTR.

This might be the new async preemption in the Go 1.14 runtime. The best way to tell would be to compile restic 0.10.0 with Go 1.13.4 and try again. But before you go installing compilers, maybe try setting GOMAXPROCS=1 GODEBUG=asyncpreemptoff=1:

$ export GODEBUG=asyncpreemptoff=1
$ restic backup /mnt/nas_hektor --cleanup-cache --exclude-file exclusions.txt

Related: golang/go#39237, golang/go#40870 (especially this comment).

I think we could put an EINTR retry loop around the xattr calls, or maybe get the maintainers of pkg/xattr to do that. As for the Readdirnames failures, the Go people have decided that's a kernel bug that they're not going to work around, unless someone shows that it isn't.
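A rough sketch of what such an EINTR wrapper around pkg/xattr could look like (a hypothetical helper and package name, assuming the error returned by xattr.List wraps the underlying errno so errors.Is can detect EINTR):

package xattrutil // hypothetical package name for this sketch

import (
	"errors"
	"syscall"

	"github.com/pkg/xattr"
)

// listxattrRetry repeats xattr.List until it no longer fails with EINTR.
func listxattrRetry(path string) ([]string, error) {
	for {
		attrs, err := xattr.List(path)
		if err != nil && errors.Is(err, syscall.EINTR) {
			continue // interrupted system call, retry
		}
		return attrs, err
	}
}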

@greatroar
Contributor

The consensus among maintainers (expressed at #3061) is to wait until Go 1.16, which will include a thorough fix in the stdlib. If that's right, may I suggest adding a summary of/link to the workaround at the top and pinning this issue? It comes up regularly and will continue to do so until ca. February.

@fd0
Member

fd0 commented Nov 6, 2020

I'll pin this issue for now

@fd0 fd0 pinned this issue Nov 6, 2020
@fd0 fd0 changed the title Prune fails on CIFS repo using go 1.14 build restic fails on repo mounted via CIFS/samba using go 1.14 build Nov 6, 2020
greatroar pushed a commit to greatroar/restic that referenced this issue Nov 8, 2020
Updates restic#2659. This is one of the cases where the stdlib will not handle
EINTR for us, even with Go 1.16. That xattr calls are directly affected
can be seen in the report for issue restic#2968.
greatroar pushed a commit to greatroar/restic that referenced this issue Nov 8, 2020
Updates restic#2659. This is a case where the stdlib will not handle EINTR for
us, even with Go 1.16. That xattr calls are directly affected can be
seen in the report for issue restic#2968.

@MichaelEischer MichaelEischer changed the title restic fails on repo mounted via CIFS/samba using go 1.14 build restic fails on repo mounted via CIFS/samba on Linux using go 1.14 build Nov 30, 2020
@uok

uok commented Dec 6, 2020

Using GODEBUG and the related nouser_xattr option from #1800 helped me fix the problems in my script (https://forum.restic.net/t/error-lstat-errno-524-during-backup/3272)

@CyberKiller40

The GODEBUG workaround helped me fix my backup. I use restic to back up my NAS to object storage, and this problem kept me from making backups for months.
export GODEBUG=asyncpreemptoff=1

@MichaelEischer
Member Author

Go 1.16 has been released in the meantime. Could anyone try whether this finally solves the compatibility problems with CIFS on Linux?

@MichaelEischer
Member Author

@csss1234 No. This issue only applies when the host running restic uses Linux.

@MichaelEischer
Member Author

@underhillian Which Kernel version are you using?

Did anyone else notice problems with restic + CIFS + go 1.16?

We could add a check to prevent using a repository stored on a CIFS mount on Linux. That would basically enforce the remark that currently exists in the documentation. The big downside of this workaround is that we'd have no clue at all whether that option will still be necessary in the future or not.

diff --git a/internal/backend/local/local.go b/internal/backend/local/local.go
index 0410e51b..d58d4880 100644
--- a/internal/backend/local/local.go
+++ b/internal/backend/local/local.go
@@ -5,6 +5,8 @@ import (
        "io"
        "os"
        "path/filepath"
+       "runtime"
+       "strings"
        "syscall"

        "github.com/restic/restic/internal/errors"
@@ -31,6 +33,24 @@ const defaultLayout = "default"
 // Open opens the local backend as specified by config.
 func Open(ctx context.Context, cfg Config) (*Local, error) {
        debug.Log("open local backend at %v (layout %q)", cfg.Path, cfg.Layout)
+
+       if runtime.GOOS == "linux" {
+               dbg, _ := os.LookupEnv("GODEBUG")
+               hasAsyncPreempt := !strings.Contains(dbg, "asyncpreemptoff=1")
+               if hasAsyncPreempt {
+                       var stat syscall.Statfs_t
+                       err := syscall.Statfs(cfg.Path, &stat)
+                       if err != nil {
+                               return nil, err
+                       }
+                       const CIFS_MAGIC_NUMBER = 0xff534d42
+                       if stat.Type == CIFS_MAGIC_NUMBER {
+                       if stat.Type == CIFS_MAGIC_NUMBER {
+                               return nil, errors.Fatal("Storing a repository on CIFS requires disabling " +
+                                       "asynchronous preemption by setting the environment variable GODEBUG to 'asyncpreemptoff=1'.")
+                       }
+               }
+       }
+
        l, err := backend.ParseLayout(ctx, &backend.LocalFilesystem{}, cfg.Layout, defaultLayout, cfg.Path)
        if err != nil {
                return nil, err

@underhillian

Which Kernel version are you using?

I use Arch Linux (rolling updates) and update regularly so my kernel is almost never more than a couple of weeks behind linux-stable.

As best as I can tell from my trusty restic backups 😉, at the time of my last test as reported above (Mar 07), I was running 5.11.2 (although it might have been a minor release or two later than that...can't be more precise at this point).

@thomasf

thomasf commented Aug 7, 2021

I just upgraded a server at home from Ubuntu LTS 18.04 to 20.04 and now I can't run backups anymore on files that are on my NAS via my CIFS-mounted path.

I also upgraded restic before trying so I don't know which upgrade caused the problems.

I don't have the repository stored on the CIFS-mounted volume; I am only trying to create a backup of files that are mounted on a CIFS filesystem.

This is all over the restic backup output now:

error: NodeFromFileInfo: Listxattr: xattr.list ...redacted... : input/output error

If I run stat on one of the failed files I don't get any errors.

$ stat "...redacted..."
  File: ...redacted...
  Size: 5493072         Blocks: 10736      IO Block: 1048576 regular file
Device: 35h/53d Inode: 399240      Links: 1
Access: (0755/-rwxr-xr-x)  Uid: ( 1000/ thomasf)   Gid: (    0/    root)
Access: 2021-06-04 15:31:45.395788000 +0000
Modify: 2008-01-27 19:17:46.000000000 +0000
Change: 2008-01-27 19:17:46.000000000 +0000
 Birth: -

My kernel is

Linux gems 5.4.0-80-generic #90-Ubuntu SMP Fri Jul 9 22:49:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

@rawtaz
Contributor

rawtaz commented Aug 7, 2021

I also upgraded restic before trying so I don't know which upgrade caused the problems.

Please specify the restic version. Lots of Linux distros ship outdated restic versions; your best course of action is to simply use the official restic binaries published here on GitHub.

@thomasf

thomasf commented Aug 7, 2021

0.12.1 via the self-update subcommand


@thomasf

thomasf commented Aug 8, 2021

Hmm, when I write a test specifically targeting the same file that fails with an I/O error inside restic, it does not cause an I/O error. I might have missed something.

restic debug log:

2021/08/08 08:56:38 archiver/archiver.go:353	archiver.(*Archiver).Save	56	/media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw41.jpg target "/media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw41.jpg", previous <nil>
2021/08/08 08:56:38 archiver/archiver.go:384	archiver.(*Archiver).Save	56	  /media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw41.jpg regular file
2021/08/08 08:56:38 restic/node.go:622	restic.(*Node).fillExtendedAttributes	61	fillExtendedAttributes(/media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw40.jpg) [] Listxattr: xattr.list /media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw40.jpg : input/output error
2021/08/08 08:56:38 archiver/file_saver.go:145	archiver.(*FileSaver).saveFile	60	/media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw41.jpg
2021/08/08 08:56:38 archiver/archiver.go:492	archiver.(*Archiver).Save	56	return after 0.003

simple test:

package restic

import (
	"os"
	"syscall"
	"testing"

	"github.com/davecgh/go-spew/spew"
	"github.com/restic/restic/internal/restic"
)

const (
	fn = "/media/dubcube/homes/thomasf/Pictures/_VERY_RANDOM/1999/08/17/jw41.jpg"
)

func TestNodeFromFileInfo(t *testing.T) {

	fi, err := os.Stat(fn)
	if err != nil {
		t.Fatal(err)
	}

	stat, ok := toStatT(fi.Sys())
	t.Log(ok, stat)
	if !ok {
		t.Fail()
	}

	node, err := restic.NodeFromFileInfo(fn, fi)
	if err != nil {
		t.Fatal(err)
	}
	t.Log(spew.Sdump(node))
}

type statT syscall.Stat_t

func toStatT(i interface{}) (*statT, bool) {
	s, ok := i.(*syscall.Stat_t)
	if ok && s != nil {
		return (*statT)(s), true
	}
	return nil, false
}

test output:

$ ssh gems ./foo.test -test.v
=== RUN   TestNodeFromFileInfo
    random_test.go:24: true &{49 18317912 1 33261 1000 0 0 0 34898 1048576 72 {1622819926 309775400} {1283468376 0} {1283468376 0} [0 0 0]}
    random_test.go:33: (*restic.Node)(0xc0000d4640)(-rwxr-xr-x  1000     0  34898 2010-09-02 22:59:36 +0000 UTC jw41.jpg)

--- PASS: TestNodeFromFileInfo (0.02s)
PASS

@MichaelEischer
Member Author

Is it always the same file that is affected, or does the error affect different files each time? The testcase probably only works in the former case.

Another major difference from the situation in restic is that the overall load is completely different. restic usually keeps the system quite busy, whereas the testcase is a single command that puts no noteworthy stress on the system. As we know that the CIFS problems are triggered by the async preemption feature of Go, your testcase would have to run the problematic syscall at the exact time when the preemption signal arrives. For that, the testcase would at least have to repeat the operation over and over again, probably on different files (see the sketch below).
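For example, a reproduction attempt along those lines could look roughly like the following sketch (the mount path is a placeholder and the repeat count is arbitrary; the idea is just to issue the syscall often enough that an async preemption signal has a chance to arrive while it is in flight):

package main

import (
	"fmt"
	"io/fs"
	"path/filepath"

	"golang.org/x/sys/unix"
)

func main() {
	root := "/mnt/nas_hektor" // placeholder: a directory on the CIFS mount
	buf := make([]byte, 64*1024)
	calls, failures := 0, 0
	_ = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return nil
		}
		// Repeat the syscall many times per file to raise the odds that a
		// preemption signal interrupts it.
		for i := 0; i < 100; i++ {
			calls++
			if _, err := unix.Listxattr(path, buf); err != nil {
				failures++
				fmt.Printf("listxattr %s: %v\n", path, err)
			}
		}
		return nil
	})
	fmt.Printf("%d listxattr calls, %d failures\n", calls, failures)
}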

Please test whether setting GODEBUG=asyncpreemptoff=1 solves the problem for you.

@thomasf

thomasf commented Aug 8, 2021

GODEBUG=asyncpreemptoff=1 does not seem to help.

It looks like it's different files every run.

It is a pretty slow home server (Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz) so it is probably more vulnerable to timing than the average modern machine.

I also tried setting all the archiver concurrencies in the code to 1 and adding a few time.Sleep(200ms) here and there around the areas where the errors appear. With the sleep, it seems like the error rate went down a bit, but it was far from disappearing.

I will try to set up a smaller test repo where I don't have tens of thousands of files for every run and see if I can narrow anything down.

@thomasf

thomasf commented Aug 9, 2021

I have tried a bunch of different cifs mount settings now without any luck.

A flag that enabled you to just skip extended attributes would also work (at least for me). It is only the listxattr syscall that fails, and I don't think I use any of that for my CIFS mounts.

I just did this to see what would happen, and now I can at least run my backups without many thousands of errors:

func (node *Node) fillExtendedAttributes(path string) error {
	return nil
}

So this is probably not the same fault that this issue is really about, but it is somewhat related. As far as I can understand, the syscalls do not time out at all; they actually fail with an I/O error, which is what strace also says.

Update:

After changing the code to not do the listxattr syscall, I could run my ~60k-file, ~3 TB backups from the CIFS mount, with or without GODEBUG=asyncpreemptoff=1, without any unexpected errors.

@MichaelEischer
Member Author

Did you try the nouser_xattr mount option?

It is a pretty slow home server (Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz) so it is probably more vulnerable to timing than the average modern machine.

The symptoms sound like a serious bug in either the CIFS server or the client, or maybe it is a network problem. Although the CPU of the server isn't the fastest, it isn't the slowest either.

@thomasf

thomasf commented Aug 10, 2021

Did you try the nouser_xattr mount option?

I missed that one, and it also seems to work; I have been running through a few tens of thousands of files already and it should definitely have failed by now.

It is a pretty slow home server (Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz) so it is probably more vulnerable to timing than the average modern machine.

The symptoms sound like a serious bug in either the CIFS server or client. Or maybe it is also a network problem. Although the CPU of the server isn't the fastest, it's also not the slowest one either.

I ran a few network load tests and even upgraded the switch firmware, so it's probably not that. It's probably some interaction between the Samba/kernel version of my QNAP NAS and the CIFS client in the latest Ubuntu LTS kernel.

In any case, nouser_xattr seems to work, so it's all good for now.

@uok

uok commented Aug 10, 2021

After many months of using nouser_xattr, this problem still occurs from time to time. Same dataset, roughly the same changes every day - most days no errors, some days there are. I still cannot see a pattern as to why this is happening :-(

@imkyaky

imkyaky commented Nov 21, 2021

Same issue here. I'm trying to back up 1.5 TB of data and the process is interrupted all the time, on different files each time.
version: restic 0.12.0 compiled with go1.15.5 on linux/amd64

@MichaelEischer
Member Author

@imkyaky As mentioned in one of the earlier posts here, Go 1.16 includes some additional fixes to handle the interruptions. Can you try whether restic 0.12.0 or 0.12.1 built using Go 1.16 (or, even better, 1.17) resolves the problem? Otherwise, also try setting the environment variable GODEBUG=asyncpreemptoff=1.

@greatroar
Contributor

Relevant: rclone/rclone#2042. If rclone gets SMB support, this issue can be worked around, and it will work on all platforms. I'm not volunteering, but if anyone needs a summer project...

mfrischknecht pushed a commit to mfrischknecht/restic that referenced this issue Jun 14, 2022
Updates restic#2659. This is a case where the stdlib will not handle EINTR for
us, even with Go 1.16. That xattr calls are directly affected can be
seen in the report for issue restic#2968.
@MichaelEischer
Member Author

Is this issue still relevant with recent restic versions, that is, restic 0.14.0?

@MichaelEischer MichaelEischer added the state: need feedback waiting for feedback, e.g. from the submitter label Oct 30, 2022
@underhillian

I created a small (~25GB) repo as a test case and went through a series of backup, forget, and prune operations similar to those where I saw errors originally. Everything worked perfectly.

I performed the testing using

restic 0.14.0 compiled with go1.19 on linux/amd64

under Linux 6.06. I did not set GODEBUG.

This wasn't an exhaustive test by any means, but it was extensive enough that (based on my previous experience) I would have expected to see multiple errors if the issue was still present. So the issue has very likely been resolved.

@greatroar greatroar mentioned this issue Nov 27, 2022
@MrM40

MrM40 commented Dec 13, 2022

The manual doesn't mention that this is also an issue when reading data from a CIFS share.
Please correct the text to also say that you should set GODEBUG to asyncpreemptoff=1 when reading from a CIFS share.
URL: https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html?highlight=cifs#local

@DanielGibson
Contributor

due to compatibility issues in older Linux kernels

Does this mean it's fixed in newer kernels? It would make sense to document which kernel version fixed it.

@MichaelEischer
Member Author

due to compatibility issues in older Linux kernels

does this mean it's fixed in newer kernels? would make sense to document what kernel version fixed it

We unfortunately have no real idea which versions are affected; I'm not actually sure how exactly the problem can be reproduced efficiently. Recent Linux (kernel) versions seem to be unproblematic.

@anderspitman

Relevant: rclone/rclone#2042. If rclone gets SMB support, this issue can be worked around, and it will work on all platforms. I'm not volunteering, but if anyone needs a summer project...

Just wanted to note that rclone has merged support for SMB.
