Cannot find docs for the exclude file syntax #1005

Closed
LaszloHont opened this issue Jun 8, 2017 · 19 comments

Comments

@LaszloHont

I can't seem to find any docs on how the exclude file syntax is parsed.

i.e. does it support wildcards? regex? how does it differentiate between files and directories? are path prefixes needed? where from (cwd/or root?)?

Some examples:

.qiv-trash (directory that could be anywhere on the filesystem)
.DS_Store
lost+found/
._*
desktop.ini
Thumbs.db (file that could be anywhere on the filesystem)
.Trash-* (the asterisk could be any number, is it needed?)
.tmp$ (file ending in .tmp)
~$ (file ending in a tilde)
~/.cache/ (cache directory in user home dir, using tilde syntax)
/full/path/to/directory/.syncthing/index*
@fwilhe
Contributor

fwilhe commented Jun 8, 2017

Hi, have you seen

Patterns use filepath.Glob internally, see filepath.Match for syntax. Additionally ** excludes arbitrary subdirectories. Environment-variables in exclude-files are expanded with os.ExpandEnv.

in https://github.com/restic/restic/blob/master/doc/manual.rst?

I think this should answer your questions.
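
For readers who do not want to look up the Go documentation, here is a minimal Go sketch of what the two standard-library functions named in that quote do on their own. It is not restic's code: the helper name `demoMatch` and the sample names are made up for illustration, and the separator case assumes a Unix-like system where `/` is the path separator.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// demoMatch is a hypothetical helper: it reports whether name matches the
// glob pattern, treating a malformed pattern as "no match".
func demoMatch(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	// '*' matches any run of non-separator characters.
	fmt.Println(demoMatch("*.bak", "foo.bak"))
	// '?' matches exactly one non-separator character.
	fmt.Println(demoMatch("file?.txt", "file1.txt"))
	// '[...]' matches one character from the class.
	fmt.Println(demoMatch("[ab]*.log", "a-2017.log"))
	// '*' does not cross a path separator (assumes a Unix-like system).
	fmt.Println(demoMatch("*.bak", "dir/foo.bak"))

	// os.ExpandEnv replaces $VAR and ${VAR}; a tilde is left untouched.
	os.Setenv("HOME", "/home/user")
	fmt.Println(os.ExpandEnv("$HOME/.cache"))
	fmt.Println(os.ExpandEnv("~/.cache"))
}
```

Note that these are plain globs, not regular expressions, and that the functions alone say nothing about which string restic compares the pattern against.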

@LaszloHont
Author

I went to the documentation at https://restic.readthedocs.io/en/latest/manual.html and searched for exclude, where it unfortunately doesn't include any common examples (or a reference to the other documentation at manual.rst) :/

@LaszloHont
Author

I read the golang doc and I think the end-user (me!) isn't going to know what restic is comparing an exclude to internally - is it the full path (e.g. /home/me/blah) or a path from the repository root (/blah or blah), or relative to the cwd (me/blah when I am at /home)?

@fd0
Member

fd0 commented Jun 8, 2017

Thanks for raising this issue, I think you have a valid point. The manual should explain the exclude filters without referencing godoc.org, and more examples are necessary.

@fd0
Member

fd0 commented Jun 8, 2017

To answer a few of your questions already:

  • All patterns are tested against the full path of a file/dir to be saved
  • Relative paths/patterns will match anywhere below the path to be saved
  • At the moment there's no way to distinguish between a file and a directory, so --exclude foo will exclude any files and directories named foo. The same goes for --exclude foo/.

From your excludes file:

  • ._* will match all files and directories whose names start with a dot and an underscore
  • desktop.ini will match all files called desktop.ini exactly. So desktop.ini.bak is not excluded and is saved in the snapshot.
  • .Trash-* excludes files/dirs named .Trash-, .Trash-foobar, etc.
  • .tmp$ excludes all files/dirs literally named .tmp$, that is a dot, followed by tmp, followed by a dollar sign. No regexp expansion.
  • ~$ excludes all files/dirs literally named tilde dollar. For excluding all files/dirs ending in a tilde, use *~.
  • ~/.cache excludes the directory .cache in all dirs called tilde. To exclude the cache directory in your home directory only, use $HOME/.cache. The tilde is not expanded by restic; environment variables are, but only in a file read via --exclude-file. On the command line, the shell expands both.
  • /full/path/to/directory/.syncthing/index* excludes all things with names starting with index below /full/path/to/directory/.syncthing.
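
The single-name cases in the list above can be checked with Go's `filepath.Match`, which the manual names as the underlying syntax. This is only a sketch: `nameMatches` and the sample file names are hypothetical, and it tests one path component at a time rather than reproducing restic's full-path matching.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// nameMatches is a hypothetical helper that tests one path component
// (a file or directory name) against one glob pattern.
func nameMatches(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	cases := []struct{ pattern, name string }{
		{"._*", "._resource"},              // starts with dot and underscore
		{"desktop.ini", "desktop.ini"},     // exact name
		{"desktop.ini", "desktop.ini.bak"}, // longer name, no wildcard
		{".Trash-*", ".Trash-"},            // '*' may match nothing
		{".Trash-*", ".Trash-1000"},
		{".tmp$", "notes.tmp"}, // '$' is a literal character, not an anchor
		{".tmp$", ".tmp$"},
		{"~$", "notes.txt~"}, // literal tilde followed by literal dollar
		{"*~", "notes.txt~"}, // glob for "ends in a tilde"
	}
	for _, c := range cases {
		fmt.Printf("%q vs %q: %v\n", c.pattern, c.name, nameMatches(c.pattern, c.name))
	}
}
```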

@LaszloHont
Author

LaszloHont commented Jun 9, 2017

Thanks @fd0

So with "current directory", you don't mean the directory I was in when I launched the backup, but the directory whose files restic is currently examining (apart from excludes that begin with a slash). Got it.

The behaviour for files and directories is slightly unexpected. I would have expected --exclude foo/ to back up the directory but not its contents, whereas --exclude foo would back up neither. Not sure why; from rsync, I guess.

My examples missed an important one: spaces! I guess I need to escape those and shell metacharacters with a backslash.

I ended up copying lots of these ones: https://gist.github.com/jult/e2eaedad6b9e29d95977fea0ddffae7d

Are comments allowed in the excludes file? Edit: c796d84 looks like a hash is the comment character.
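
As an illustration of the behaviour described in this thread (hash as the comment character, environment variables expanded in exclude files), here is a minimal Go sketch of reading such a file. It is an assumption about the general shape, not restic's actual implementation; the helper `parseExcludeText` and its `lookup` parameter are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// parseExcludeText is a hypothetical helper. It returns the patterns found in
// text, one per line, skipping blank lines and lines starting with '#', and
// expanding $VAR / ${VAR} through the supplied lookup function.
func parseExcludeText(text string, lookup func(string) string) []string {
	var patterns []string
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank line or comment
		}
		patterns = append(patterns, os.Expand(line, lookup))
	}
	return patterns
}

func main() {
	text := "# editor backups\n*~\n\n# cache in the home directory only\n$HOME/.cache\n"
	// os.Getenv is used as the lookup here; the printed path depends on the
	// environment this runs in.
	for _, p := range parseExcludeText(text, os.Getenv) {
		fmt.Println(p)
	}
}
```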

@fd0
Member

fd0 commented Jun 9, 2017

Ah, I'm afraid that's still not completely correct. I'll describe how restic evaluates the exclude patterns. Let's suppose that restic is run by a user in his home directory (/home/user) like this:

$ restic backup --exclude='*.bak' --exclude='/home/user/secret' --exclude='extra' ~

Then restic will see the following command line arguments (after expansion by the shell):

["restic", "backup", "--exclude=*.bak", "--exclude=/home/user/secret", "--exclude=extra", "/home/user"]

Then, it starts traversing /home/user. The following list describes what happens when the named file/dir is seen. restic always tests the complete path against the patterns:

  • file /home/user/foo.bak: The pattern *.bak matches and the file is not saved. The pattern is not absolute so it matches everywhere for all files ending in .bak.
  • dir /home/user/secret: The absolute pattern /home/user/secret matches, so the dir is not saved and not traversed
  • dir /home/user/foo/home/user/secret: No pattern matches, so the dir is saved.
  • dir /home/user/work/extra: The pattern extra matches, the dir is not saved.

I hope that this is a bit clearer now, I'll add a section to the manual describing the process. The key take-away point is that the patterns are evaluated against the full path of the files during backup. So if you want to match a single directory, use the complete path, otherwise it may match several times somewhere.
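
The "not saved and not traversed" part of the walk-through can be sketched in Go with `filepath.WalkDir` and `fs.SkipDir`. This is a simplified illustration, not restic's archiver: the helpers `walkKept`, `buildTree` and `demo` are hypothetical, the tree is created in a temporary directory, and the exclude check is a stand-in (relative patterns are compared with the base name only, and the absolute pattern is a plain path comparison).

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// walkKept walks root and returns the paths (relative to root) that the
// excluded predicate does not reject. An excluded directory is skipped
// together with everything below it.
func walkKept(root string, excluded func(path string) bool) ([]string, error) {
	var kept []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if path == root {
			return nil
		}
		if excluded(path) {
			if d.IsDir() {
				return fs.SkipDir // do not descend into an excluded directory
			}
			return nil // excluded file: simply not recorded
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		kept = append(kept, filepath.ToSlash(rel))
		return nil
	})
	return kept, err
}

// buildTree creates a small temporary tree to walk.
func buildTree() (string, error) {
	root, err := os.MkdirTemp("", "exclude-demo")
	if err != nil {
		return "", err
	}
	for _, dir := range []string{"secret", "work/extra"} {
		if err := os.MkdirAll(filepath.Join(root, filepath.FromSlash(dir)), 0o755); err != nil {
			return root, err
		}
	}
	for _, f := range []string{"foo.bak", "keep.txt", "secret/key", "work/extra/data"} {
		if err := os.WriteFile(filepath.Join(root, filepath.FromSlash(f)), nil, 0o644); err != nil {
			return root, err
		}
	}
	return root, nil
}

// demo builds the tree, walks it with a simplified exclude check, and cleans up.
func demo() ([]string, error) {
	root, err := buildTree()
	if root != "" {
		defer os.RemoveAll(root)
	}
	if err != nil {
		return nil, err
	}
	secret := filepath.Join(root, "secret") // stands in for an absolute pattern
	excluded := func(path string) bool {
		if path == secret {
			return true
		}
		for _, p := range []string{"*.bak", "extra"} {
			if ok, _ := filepath.Match(p, filepath.Base(path)); ok {
				return true
			}
		}
		return false
	}
	return walkKept(root, excluded)
}

func main() {
	kept, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(kept)
}
```

Because the walk returns `fs.SkipDir` for an excluded directory, files such as `secret/key` and `work/extra/data` are never visited at all.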

Any further questions? :)

@LaszloHont
Author

I have further questions, and I very much appreciate the time you are taking to answer them. It's really one question about automagic anchoring of patterns. I think I can guess the answer (we rely on the absolute path being fairly unique and giving us the behaviour we want), but it's best to ask and be sure.

Would the absolute pattern /home/user/secret match /home/user/secret2? (What if you don't want it to?)
Would the absolute pattern /home/user/secret match /home/user/somemount/home/user/secret?

@fd0
Member

fd0 commented Jun 9, 2017

In both cases: No, the pattern won't match.

@LaszloHont
Author

LaszloHont commented Jun 9, 2017

Why is that? Edit: I'm pleased that it doesn't, but I don't see why that is :)

@fd0
Member

fd0 commented Jun 10, 2017

The matching code there is modeled after what a shell would do. Ask yourself: if the file /home/user/secret2 exists, what would ls /home/user/secret print (provided the file secret does not exist)?

In more formal terms: If the pattern starts with a / it is absolute and the pattern must match at the beginning of the string under test, so pattern /home/user/secret does not match /home/user/somemount[...]: The pattern is not a prefix of the string.

You can imagine for yourself that the pattern and the file path are both split into their respective components:

  • /home/user/secret is split into [ROOT, "home", "user", "secret"] and the file /home/user/somemount/home/user/secret is split into [ROOT, "home", "user", "somemount", "home", "user", "secret"]. The string ROOT is used in this example to mark the root directory. You can see that the pattern is not contained in the file name.
  • Let's look at the file /home/user/secret2, which is split into [ROOT, "home", "user", "secret2"]. Again you can see that the pattern is not contained in the file name.
  • For the file /home/user/secret/secret.txt, which is split into [ROOT, "home", "user", "secret", "secret.txt"] you can see that the pattern is indeed contained in the file name, right at the beginning: [ROOT, "home", "user", "secret", ...], therefore the pattern matches and the file is excluded.
  • Let's say we have a relative exclude pattern of secret/secret.txt, which is split into ["secret", "secret.txt"]. You can see that this pattern can be found in the list for the file /home/user/secret/secret.txt, starting at offset 3: [ROOT, "home", "user", "secret", "secret.txt"], so the pattern matches.

When you have wildcards (*, ? and so on) in a path component, they are also tested. So for your first example, a pattern of /home/user/secret* would match the path /home/user/secret2.
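
The component-wise comparison described above can be sketched in Go as follows. This is an illustration written from the explanation in this thread, not restic's filter code: the function names `components`, `matchAt` and `matches` are hypothetical, paths are assumed to be absolute and slash-separated, and `**` is not handled. Instead of a ROOT marker, the sketch anchors absolute patterns at offset 0, so offsets are counted without the ROOT entry.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// components splits a slash-separated path into its non-empty components.
func components(p string) []string {
	var parts []string
	for _, c := range strings.Split(p, "/") {
		if c != "" {
			parts = append(parts, c)
		}
	}
	return parts
}

// matchAt reports whether every pattern component matches the file
// component at the same position, starting at offset off.
func matchAt(pat, file []string, off int) bool {
	if len(pat) == 0 || off+len(pat) > len(file) {
		return false
	}
	for i, p := range pat {
		ok, err := filepath.Match(p, file[off+i])
		if err != nil || !ok {
			return false
		}
	}
	return true
}

// matches applies the rule from the thread: an absolute pattern must match
// at the beginning of the path, a relative pattern may match at any offset.
// A match on a leading part of the path also covers everything below it.
func matches(pattern, path string) bool {
	pat, file := components(pattern), components(path)
	if strings.HasPrefix(pattern, "/") {
		return matchAt(pat, file, 0)
	}
	for off := 0; off+len(pat) <= len(file); off++ {
		if matchAt(pat, file, off) {
			return true
		}
	}
	return false
}

func main() {
	cases := []struct{ pattern, path string }{
		{"/home/user/secret", "/home/user/secret2"},
		{"/home/user/secret", "/home/user/somemount/home/user/secret"},
		{"/home/user/secret", "/home/user/secret/secret.txt"},
		{"secret/secret.txt", "/home/user/secret/secret.txt"},
		{"/home/user/secret*", "/home/user/secret2"},
	}
	for _, c := range cases {
		fmt.Printf("%-20s %-40s %v\n", c.pattern, c.path, matches(c.pattern, c.path))
	}
}
```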

@fd0
Member

fd0 commented Jun 10, 2017

All these examples should be documented in the manual, I think.

@LaszloHont
Author

Gotcha. Thanks.

@pkpowell

pkpowell commented Aug 6, 2017

Are negative excludes possible, à la .gitignore?
Say I want to exclude all content in directories named .meteor except for the nested dir .meteor/local/db. Could I do this?

/etc/restic/excludes:

.meteor/
!.meteor/local/db

restic backup exclude-file=/etc/restic/excludes

@fd0
Member

fd0 commented Aug 6, 2017

No, that is not implemented yet.

@fd0
Member

fd0 commented Sep 29, 2017

Documenting include/exclude examples is tracked in #396, I'm closing this issue here.

@karthikpaidi

karthikpaidi commented Sep 11, 2019

@fd0 I am trying to exclude paths like jobs//jobs//builds/**/archive, in order to exclude the archive dir from all directories. Would that work? I have multiple paths like this that I need to exclude. If possible, can you suggest the best way to deal with this, as I cannot find an example of this kind in the documentation?

@OvaisTariq95

How can I exclude multiple files with one flag?
i.e.
--exclude=foo/bar/t.txt foor/bar2/1.txt

@MichaelEischer
Member

You have to specify the excludes as --exclude foo/bar/t.txt --exclude foor/bar2/1.txt, unless foo/bar/t.txt foor/bar2/1.txt is a single filename. Alternatively, use an exclude file as described in https://restic.readthedocs.io/en/stable/040_backup.html#excluding-files
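
To see why one flag cannot carry two space-separated patterns, here is a small Go sketch of a repeatable flag. It uses the standard library `flag` package as a stand-in and is not restic's actual command-line parsing; the names `stringList` and `parseExcludes` are hypothetical. The shell splits unquoted arguments on whitespace, so the second path arrives as a separate positional argument rather than as part of the flag's value.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// stringList collects every value passed to a repeated flag.
type stringList []string

func (s *stringList) String() string { return strings.Join(*s, ",") }

func (s *stringList) Set(v string) error {
	*s = append(*s, v)
	return nil
}

// parseExcludes returns the collected --exclude values and the remaining
// positional arguments.
func parseExcludes(args []string) ([]string, []string, error) {
	fs := flag.NewFlagSet("backup", flag.ContinueOnError)
	var excludes stringList
	fs.Var(&excludes, "exclude", "pattern to exclude (may be repeated)")
	if err := fs.Parse(args); err != nil {
		return nil, nil, err
	}
	return excludes, fs.Args(), nil
}

func main() {
	for _, args := range [][]string{
		// One flag per pattern: both values are collected.
		{"--exclude", "foo/bar/t.txt", "--exclude", "foor/bar2/1.txt", "/home/user"},
		// One flag, two words: the second word is a positional argument.
		{"--exclude=foo/bar/t.txt", "foor/bar2/1.txt"},
	} {
		excludes, rest, err := parseExcludes(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("excludes=%q positional=%q\n", excludes, rest)
	}
}
```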


7 participants