Windows - Reparse points / cloud files creating a lot of errors in logs #4155

Open
deajan opened this issue Jan 14, 2023 · 7 comments
Labels
category: backup platform: windows state: need direction need key decisions or input from core developers state: need feedback waiting for feedback, e.g. from the submitter type: feature suggestion suggesting a new feature

Comments

@deajan

deajan commented Jan 14, 2023

Output of restic version

restic 0.14.0 compiled with go1.19 on windows/amd64

How did you run restic exactly?

Target machine: Windows 10 21H2 x64 with local only NTFS filesystem.

"restic.exe" backup "c:\Users" --iexclude-file excludes/generic_excluded_extensions --iexclude-file excludes/generic_excludes --iexclude-file excludes/windows_excludes --exclude-caches --use-fs-snapshot

I've run restic as Administrator and as System; both produced the same results.

I've filtered the logs since there are thousands of files, and left a couple of lines per error type:

error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\DOSSIERS ARCHIVES UTILE\réglement intérieur\projet RI.DOC: The media is write protected.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\DOSSIERS ARCHIVES UTILE\réglement intérieur\réglement intérieur modèle.DOC: The media is write protected.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1981.doc: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1982- 66.doc: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1983 - .pdf: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1984- LIFT.doc: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1985_FRUITS.doc: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1986-.doc: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1988-.doc: The cloud operation is not supported on a read-only volume.
error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\Documents\foo\bar\FINANCIIER\factures\Fact 1989_.doc: The cloud operation is not supported on a read-only volume.

What backend/server/service did you use to store the repository?

rest-server 0.11

Expected behavior

The machine I'm backing up runs Windows 10 x64 with the Nextcloud desktop client.
Nextcloud uses a virtual file system, just as OneDrive does.
Some files are not present on the disk and are represented by NTFS reparse points.

Those files can obviously not be backed up, and should thus be automatically excluded by restic.

Actual behavior

restic tries to back up those files and fails with one of these error messages:
"The cloud operation is not supported on a read-only volume." or "The media is write protected."

Steps to reproduce the behavior

  • Install nextcloud desktop client
  • Synchronize a folder
  • Enable virtual file system support, see fig 1
  • Right click on any synchronized folder, untick "nextcloud/available offline"
    Fig 1: [image]
  • Run restic backup for that file

You can also reproduce this with OneDrive files, and probably with Google Drive or Dropbox too.

You can check whether a file is a reparse point with fsutil utility:

fsutil reparsepoint query "c:\Users\<user>\OneDrive - <org> FRANCE"

Which would output something like:

C:\Users\<user>>fsutil reparsepoint query "c:\Users\<user>\OneDrive - <org> FRANCE"
Valeur de la balise d’analyse : 0x9000701a
Valeur de balise : Microsoft
Valeur de balise : répertoire

Analyser la longueur des données : 0x00000064
Données d’analyse :
0000:  01 00 64 00 46 65 52 70  bf 73 95 b5 60 00 00 00  ..d.FeRp.s..`...
0010:  02 00 09 00 07 00 01 00  58 00 00 00 0a 00 04 00  ........X.......
0020:  5c 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  \...............
0030:  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0040:  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0050:  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  ................
0060:  76 00 00 00      

"Valeur de la balise d'analyse" translates to "Reparse Tag Value", which is the reparse point attribute.
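
The "Microsoft" and "répertoire" (directory) lines in that output correspond to flag bits inside the tag value itself. Here is a minimal Go sketch of how those bits could be decoded. It assumes the bit layout used by the IsReparseTagMicrosoft, IsReparseTagNameSurrogate and IsReparseTagDirectory macros in the Windows SDK headers; the type and function names are made up for illustration and are not part of restic.

```go
package main

import "fmt"

// Flag bits of a reparse tag (assumed layout, taken from the
// IsReparseTag* macros in the Windows SDK headers).
const (
	tagBitMicrosoft     uint32 = 0x80000000 // tag is owned by Microsoft
	tagBitNameSurrogate uint32 = 0x20000000 // entry stands in for another name (symlink, junction)
	tagBitDirectory     uint32 = 0x10000000 // tagged directory may have children
)

// TagFlags is a hypothetical helper type holding the decoded flag bits.
type TagFlags struct {
	Microsoft     bool
	NameSurrogate bool
	Directory     bool
}

// DecodeTagFlags extracts the flag bits from a raw reparse tag value.
func DecodeTagFlags(tag uint32) TagFlags {
	return TagFlags{
		Microsoft:     tag&tagBitMicrosoft != 0,
		NameSurrogate: tag&tagBitNameSurrogate != 0,
		Directory:     tag&tagBitDirectory != 0,
	}
}

func main() {
	// Tag value reported by fsutil for the OneDrive folder above.
	fmt.Printf("%+v\n", DecodeTagFlags(0x9000701a))
	// prints {Microsoft:true NameSurrogate:false Directory:true}
}
```

The name-surrogate bit is what separates symlinks and junctions (0xA000000C, 0xA0000003) from cloud placeholders, which do not have it set.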

Do you have any idea what may have caused this?

restic sees reparse points as standard files.

Do you have an idea how to solve the issue?

Restic should detect reparse point files and exclude them, as it probably already does for NTFS junctions.
Reparse points are identified by a file attribute, and their type by the reparse tag.
In my example above, searching for 0x9000701a led me to the winnt.h file in mingw, which has a list of all reparse point types:

In https://github.com/mirror/mingw-w64/blob/master/mingw-w64-tools/widl/include/winnt.h (line 2298) :

#define IO_REPARSE_TAG_MOUNT_POINT      __MSABI_LONG(0xA0000003)
#define IO_REPARSE_TAG_HSM              __MSABI_LONG(0xC0000004)
#define IO_REPARSE_TAG_DRIVE_EXTENDER   __MSABI_LONG(0x80000005)
#define IO_REPARSE_TAG_HSM2             __MSABI_LONG(0x80000006)
#define IO_REPARSE_TAG_SIS              __MSABI_LONG(0x80000007)
#define IO_REPARSE_TAG_WIM              __MSABI_LONG(0x80000008)
#define IO_REPARSE_TAG_CSV              __MSABI_LONG(0x80000009)
#define IO_REPARSE_TAG_DFS              __MSABI_LONG(0x8000000A)
#define IO_REPARSE_TAG_FILTER_MANAGER   __MSABI_LONG(0x8000000B)
#define IO_REPARSE_TAG_SYMLINK          __MSABI_LONG(0xA000000C)
#define IO_REPARSE_TAG_IIS_CACHE        __MSABI_LONG(0xA0000010)
#define IO_REPARSE_TAG_DFSR             __MSABI_LONG(0x80000012)
#define IO_REPARSE_TAG_DEDUP            __MSABI_LONG(0x80000013)
#define IO_REPARSE_TAG_NFS              __MSABI_LONG(0x80000014)
#define IO_REPARSE_TAG_FILE_PLACEHOLDER __MSABI_LONG(0x80000015)
#define IO_REPARSE_TAG_WOF              __MSABI_LONG(0x80000017)
#define IO_REPARSE_TAG_WCI              __MSABI_LONG(0x80000018)
#define IO_REPARSE_TAG_WCI_1            __MSABI_LONG(0x90001018)
#define IO_REPARSE_TAG_GLOBAL_REPARSE   __MSABI_LONG(0xA0000019)
#define IO_REPARSE_TAG_CLOUD            __MSABI_LONG(0x9000001A)
#define IO_REPARSE_TAG_CLOUD_1          __MSABI_LONG(0x9000101A)
#define IO_REPARSE_TAG_CLOUD_2          __MSABI_LONG(0x9000201A)
#define IO_REPARSE_TAG_CLOUD_3          __MSABI_LONG(0x9000301A)
#define IO_REPARSE_TAG_CLOUD_4          __MSABI_LONG(0x9000401A)
#define IO_REPARSE_TAG_CLOUD_5          __MSABI_LONG(0x9000501A)
#define IO_REPARSE_TAG_CLOUD_6          __MSABI_LONG(0x9000601A)
#define IO_REPARSE_TAG_CLOUD_7          __MSABI_LONG(0x9000701A)
#define IO_REPARSE_TAG_CLOUD_8          __MSABI_LONG(0x9000801A)
#define IO_REPARSE_TAG_CLOUD_9          __MSABI_LONG(0x9000901A)
#define IO_REPARSE_TAG_CLOUD_A          __MSABI_LONG(0x9000A01A)
#define IO_REPARSE_TAG_CLOUD_B          __MSABI_LONG(0x9000B01A)
#define IO_REPARSE_TAG_CLOUD_C          __MSABI_LONG(0x9000C01A)
#define IO_REPARSE_TAG_CLOUD_D          __MSABI_LONG(0x9000D01A)
#define IO_REPARSE_TAG_CLOUD_E          __MSABI_LONG(0x9000E01A)
#define IO_REPARSE_TAG_CLOUD_F          __MSABI_LONG(0x9000F01A)
#define IO_REPARSE_TAG_CLOUD_MASK       __MSABI_LONG(0x0000F000)
#define IO_REPARSE_TAG_APPEXECLINK      __MSABI_LONG(0x8000001B)
#define IO_REPARSE_TAG_GVFS             __MSABI_LONG(0x9000001C)
#define IO_REPARSE_TAG_STORAGE_SYNC     __MSABI_LONG(0x8000001E)
#define IO_REPARSE_TAG_WCI_TOMBSTONE    __MSABI_LONG(0xA000001F)
#define IO_REPARSE_TAG_UNHANDLED        __MSABI_LONG(0x80000020)
#define IO_REPARSE_TAG_ONEDRIVE         __MSABI_LONG(0x80000021)
#define IO_REPARSE_TAG_GVFS_TOMBSTONE   __MSABI_LONG(0xA0000022)

If restic already excludes NTFS junctions via an attribute filter, one could add the specific file attributes as filters.

In C, this would look something like this:

/* Mark the entry when the reparse point attribute is set. */
if (*pdwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT)
        sb->st_rdev = WIN32_MOUNT_POINT;
else
        sb->st_rdev = 0;

Sorry, I have no golang knowledge to help more.
Maybe this could be implemented as part of #3863, which already identifies file types?

Did restic help you today? Did it make you happy in any way?

I really enjoy restic and decided to invest some time to make it easier to use for end users.
I am writing a backup wrapper and GUI for restic, which adds some missing parts like pre- and post-backup hooks and a function that makes a new snapshot only if no recent snapshot exists.
This will hopefully go open source once I have ironed out the rough edges.
It works on Windows and Linux, and also has a config GUI and a restore GUI.
I'd love to have a solution to get rid of those errors.

@deajan
Author

deajan commented Jan 15, 2023

Additional info: I have a couple of files that cannot be backed up, with the following error:

error: open \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\AppData\Local\Microsoft\WindowsApps\Microsoft.DesktopAppInstaller_8wekyb3d8bbwe\python3.exe: The file cannot be accessed by the system.
error: open \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\AppData\Local\Microsoft\WindowsApps\Microsoft.DesktopAppInstaller_8wekyb3d8bbwe\winget.exe: The file cannot be accessed by the system.
error: open \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\User\AppData\Local\Microsoft\WindowsApps\Microsoft.MicrosoftEdge_8wekyb3d8bbwe\MicrosoftEdge.exe: The file cannot be accessed by the system.

Those are also reparse points (type 0x8000001B according to fsutil, which is IO_REPARSE_TAG_APPEXECLINK in the list above).
Of course they cannot be backed up, but they still generate "The file cannot be accessed by the system." errors. That message is so generic that I cannot filter it out of the restic logs to know whether the backup succeeded or not.

Those files should exist on every vanilla recent Windows 10.

I've also tried current restic 0.15.0 compiled with go1.19.5 on windows/amd64. Same results.

@deajan
Author

deajan commented Jan 17, 2023

I looked a bit into which Go function could detect reparse points.
It seems that os.FileInfo.Mode.IsRegular does the job, see golang/go#42184
Sorry, I'm not a go pro (pun intended), so I cannot help much more.

@MichaelEischer
Member

It seems that os.FileInfo.Mode.IsRegular does the job, see golang/go#42184

That method won't help as we cannot just exclude everything which isn't a regular file. Restic definitely should e.g. backup symlinks etc.

Why don't you just exclude the whole Nextcloud folder? After all it is pretty much random which files are stored locally and which are not. Always ignoring files which are not available locally also isn't a good idea, as then someone will be very surprised that certain files are not included in the backup (unless we maybe can detect this exact case reliably).

@MichaelEischer MichaelEischer added category: backup state: need feedback waiting for feedback, e.g. from the submitter platform: windows type: feature suggestion suggesting a new feature state: need direction need key decisions or input from core developers labels Jan 21, 2023
@deajan
Author

deajan commented Jan 22, 2023

That method won't help as we cannot just exclude everything which isn't a regular file. Restic definitely should e.g. backup symlinks etc.
[...]
[...] unless we maybe can detect this exact case reliably

Sorry if I was misleading here. I've put all the necessary file attributes in my first comment; they make it possible to identify whether a file is a junction, a symlink or a non-local file. I'm just not a Go programmer.

Why don't you just exclude the whole Nextcloud folder? After all it is pretty much random which files are stored locally and which are not. Always ignoring files which are not available locally also isn't a good idea, as then someone will be very surprised that certain files are not included in the backup (unless we maybe can detect this exact case reliably).

I'm trying to build a solution that works on most Windows computers, without the need to fine tune for every nextcloud/onedrive/dropbox/whatever cloud service.
As of today, I use a regex on restic output to filter cloud file errors and decide whether a backup is bad or not. This works but is not ideal in terms of bare performance.

Always ignoring files which are not available locally also isn't a good idea,

I agree, this could become an option like --ignore-cloud-files or so, so users may decide.

PS: My tool is pretty much ready for prime time (I did the internationalization today). It's basically restic + GUI + Prometheus support + broad exclusion lists + secure YAML config, all compiled into single executables for Windows and Linux.
I've been using this internally for a couple of months now, and I'll probably publish a git repo this week.
Where should I post about it?

@deajan
Author

deajan commented Aug 28, 2023

As far as my current fix goes, it's horrible to parse restic output since it will give localized error messages like:

error: read \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\Users\user\Schéma réseau.dat: L'opération de cloud n'est pas prise en charge sur un volume en lecture seule.
error: read \\?\c:\Users\user\desktop.png: Le fournisseur de fichier cloud n’est pas en cours d'exécution.

I understand that the localized part is not generated by restic but is a Windows 'feature', which cannot be switched off the way it can on Linux with LC_ALL=C or similar, but it is getting really painful to determine programmatically whether a restic backup succeeded or failed.
I think it gets worse if the error message is in a non-Latin alphabet: there will be no way to know whether the error is "bad" or not.

Is there any chance restic is going to be able to detect Windows reparse points?

@jredfox

jredfox commented Dec 17, 2023

Not all reparse points are shortcuts. Here is a list of reparse point tags produced on any modern Windows PC account, whether OneDrive is set to online-only or files are saved only to the PC with OneDrive and its folder still installed. For my program ExternalWIN, I decided that if an entry has a reparse point and is a directory, I check its reparse tag: if it's a OneDrive-related tag, it's safe to recurse into; otherwise I assume it's a link of some sort. In my recent tests IO_REPARSE_TAG_CLOUD_6 occurred the most, which is why it's listed first.

IO_REPARSE_TAG_CLOUD_6 = 0x9000601A
IO_REPARSE_TAG_CLOUD = 0x9000001A
IO_REPARSE_TAG_CLOUD_1 = 0x9000101A
IO_REPARSE_TAG_CLOUD_2 = 0x9000201A
IO_REPARSE_TAG_CLOUD_3 = 0x9000301A
IO_REPARSE_TAG_CLOUD_4 = 0x9000401A
IO_REPARSE_TAG_CLOUD_5 = 0x9000501A
IO_REPARSE_TAG_CLOUD_7 = 0x9000701A
IO_REPARSE_TAG_CLOUD_8 = 0x9000801A
IO_REPARSE_TAG_CLOUD_9 = 0x9000901A
IO_REPARSE_TAG_CLOUD_A = 0x9000A01A
IO_REPARSE_TAG_CLOUD_B = 0x9000B01A
IO_REPARSE_TAG_CLOUD_C = 0x9000C01A
IO_REPARSE_TAG_CLOUD_D = 0x9000D01A
IO_REPARSE_TAG_CLOUD_E = 0x9000E01A
IO_REPARSE_TAG_CLOUD_F = 0x9000F01A
IO_REPARSE_TAG_ONEDRIVE = 0x80000021
IO_REPARSE_TAG_CLOUD_MASK = 0x0000F000
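
The sixteen cloud tags differ only in the nibble covered by IO_REPARSE_TAG_CLOUD_MASK, so one masked comparison can cover all of them. A minimal Go sketch of that check and of the directory rule described above; the function names are hypothetical, and treating a plain directory without a reparse point as safe to recurse into is an assumption added for completeness.

```go
package main

import "fmt"

const (
	tagCloud     uint32 = 0x9000001A // IO_REPARSE_TAG_CLOUD
	tagCloudMask uint32 = 0x0000F000 // IO_REPARSE_TAG_CLOUD_MASK
	tagOneDrive  uint32 = 0x80000021 // IO_REPARSE_TAG_ONEDRIVE
)

// IsCloudTag reports whether tag is IO_REPARSE_TAG_CLOUD or one of its
// _1 to _F variants, which differ only in the masked nibble.
func IsCloudTag(tag uint32) bool {
	return tag&^tagCloudMask == tagCloud
}

// SafeToRecurse sketches the rule from the comment above: a directory
// carrying a OneDrive-related tag is entered, any other tagged directory
// is assumed to be a link. Untagged directories are entered (assumption).
func SafeToRecurse(isDir, hasReparsePoint bool, tag uint32) bool {
	if !isDir {
		return false
	}
	if !hasReparsePoint {
		return true
	}
	return IsCloudTag(tag) || tag == tagOneDrive
}

func main() {
	fmt.Println(IsCloudTag(0x9000601A))                // true  (IO_REPARSE_TAG_CLOUD_6)
	fmt.Println(IsCloudTag(0x8000001B))                // false (IO_REPARSE_TAG_APPEXECLINK)
	fmt.Println(SafeToRecurse(true, true, 0x9000701A)) // true  (IO_REPARSE_TAG_CLOUD_7)
	fmt.Println(SafeToRecurse(true, true, 0xA0000003)) // false (IO_REPARSE_TAG_MOUNT_POINT)
}
```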

@deajan
Author

deajan commented Jan 15, 2024

@jredfox Thanks, that's basically a filtered version of the list I offered above, which makes things easier.
I'm still not a Go guy, so I don't really get where Go identifies file types, but here's a discussion about detecting reparse points that could be interesting if adopted by golang: golang/go#61893 (comment)

Is there any chance restic will get an (opt-in) parameter to ignore cloud reparse points in the future?
This would make things much easier on the Windows side.


3 participants