Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve ssh filter btrbk.sh #511

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

calestyo
Copy link
Contributor

The goal of this branch is to overhaul ssh_filter_btrbk.sh, especially far more hardening respectively tightening what is allowed to be executed, but also to add some mode for btrbk’s raw target mode.

In strings that don’t contain `'` nor do any expansions, use single quotes to
avoid any future unintended expansions or escapes.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Double quote any variable expansions that might ever contain field separators.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
ssh_filter_btrbk.sh is mainly intended for security purposes and should
therefore itself be written with that in mind.
It is written for bash, which greatly extends the POSIX Shell Command Language
and is incompatible with it in some niche cases.

For several reasons, it seems a good idea to convert the program to (mostly)
pure POSIX Shell Command Language:
• People may try to use the program with other shells (for example when bash is
  not available) and make errors.
More pure `sh` implementations like dash …
• … have far less code and also less dependencies, which possibly also reduces
  the chance for bugs or exploits,
• … are less dynamic in development (and have thus possibly a lower chance of
 incompatible changes) …
• … and may run faster.

This commit replaces any unnecessary “bashishms” with purely POSIX compatible
code, with the exception of the `local`-built-in, which is however supported by
most POSIX compatible shells (including dash, klibc-utils’s `sh` and BusyBox’
`sh`) in some way.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
This commit finishes the work from the previous one and converts
ssh_filter_btrbk.sh to (mostly) pure POSIX Shell Command Language.

Instead of bash’s `=~`-operator for its `[[ … ]]`-compound-command it uses
`grep`.
At the time of writing, bash has at least the `nocasematch`-shell-option which
would have a negatve security impact for this program. While it’s not enabled
per default single users could potentially change that, not realising the
consequences.
Thus, moving away from this may also provide some hardening.

Unlike bash’s `=~`-operator, which matches against the whole string at once,
`grep` matches the pattern against each line of input.
This would allow for attacks by including a newline in the SSH command like in:
    SSH_ORIGINAL_COMMAND="readlink /dev/stdout
    cat /etc/shadow"
but is prevented by the general exclusion of newlines in commit TODO.

`grep` may return an exit status of `0` when used with its `-q`-option, even
when an error occurred.
Since this program is intended specifically for security purposes this shall be
avoided, even if such case is unlikely, and therefore its standard output and
standard error are redirected to `/dev/null` instead.

Further, using just:
    local formatted_restrict_path_list="$(printf '%s' "$restrict_path_list" | sed 's/|/", "/g')"
rather than:
    local formatted_restrict_path_list=""; formatted_restrict_path_list="$(printf '%s' "$restrict_path_list" | sed 's/|/", "/g')"
prevent `set -e` to take effect if the pipeline within the command substitution
fails, as the returned exit status of the whole command is the result of
`local`, not that of the assignment.
This is however no security problem here, as `formatted_restrict_path_list` is
only used for informative pruposes.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
In spirit, POSIX considers `echo` rather obsolete (it was just kept because of
its widespread use).

It’s also not possible to use `echo` portably unless it’s `-n`-option (as the
first argument) and escape sequences are omitted.
While neither was the case here, it’s better style to just always use `printf`
in order to avoid any future confusion when both are used.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
POSIX designates a number of exit statuses around `127` for special use and GNU
a few further.
Further, using values >127 is rather uncommon for normal use-cases.

Use `1` when the SSH command was rejected and `2` when the program’s arguments
could not be parsed).

Since this might at least in principle be used by 3rd-party programs, added a
specific note to the changelog.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
OpenSSH’s environment variable `SSH_CLIENT` has been deprecated in upstream
commit f37e246f858cdd79be4f4e158b7b04778d1cb7e9 (2002-09-19) and replaced by
`SSH_CONNECTION`.

Both contain more than just the remote information, thus adapted the log message
to reflect that.

Since this might be used by 3rd-party programs (like fail2ban), added a specific
note to the changelog.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
• Set shell options in one command.
• Homogeneously use local variables for function positional parameters in all
  places.
• In redirections, omit `1` for standard output.
• Homogeneously use `if`-compount-commands instead of `[ … ] && …` in all
  places.
• Homogeneously use curly brackets with parameter expansion.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
• In principle the special `IFS`-variable could be set to some unexpected non-
  standard value.
  Unsetting it causes its default to be used.
• Locales and in particular their characters sets are quite complex in POSIX and
  may have many subtle implications.
  For example, the pattern matching notation (used in `case`-compound-commands
  or some forms of parameter expansion) are in principle only defined for
  character strings. While some shells handle it gracefully, the behaviour is
  undefined if, for example, the character set is UTF-8 and a variable contains
  bytes that do not form valid caracters in that.
  Actually, there are quite some more implications.

  Also, pathnames, in POSIX, are strings of bytes excluding 0x0.

  For these reasons, the locale is set to the `C`/`POSIX`-locale.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
@calestyo
Copy link
Contributor Author

@digint This branch is still far from being ready... and I guess I'll do some big overhaul of the command matching as well.

Three questions that popped up from my side are:

  • Is there a list of all possible exact commands, that btrbk uses (especially in the current version)[0]?
  • Would you mind if we drop the functionality of having more than one -p option? Or why is that even necessary in the first place? Wouldn't a user normally need just one pathname, and all his stuff goes below that? And even if there is a use-case, couldn't that be done by the user using more than one SSH key, and via that, selecting different -p directories?

    I ask, because right now, the path is blindly added into the pattern (even if it contains meta-characters). If a user would use a very stupid path like -p ')))|.*|(((' any command would be accepted.

    Allowing multiple paths, requires me to add some ERE quoting, which is actually much more tricky than one might think.
  • With respect to [0]. I'm not so much in favour of keeping the backwards compatible stuff in the code. Makes it just more complex. Wouldn't it be enough to refer users of old btrbk simply to using an old (compatible) ssh_filter_btrbk.sh?

@calestyo
Copy link
Contributor Author

Hey @digint.

Have you had time to think about the above questions? :-)

Copy link
Owner

@digint digint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing, the changes look good overall.

I have moved away from using ssh_filter_btrbk.sh myself (using btrfs-progs-btrbk), and while btrbk is growing it is not easy to support all new features and at the same time stay tiny and secure.

What I'm trying to say: This script has already grown too complex for a simple "allow"-script, and I'm not sure if I should still recommend it. Every line of code here adds possible exploits, and restricting the users capabilities by either btrfs-progs-btrbk or btrfs-progs-sudo is enough for most users (e.g. restrict backup server to read-only progs, or restrict client to write-only).

ssh_filter_btrbk.sh Show resolved Hide resolved
@@ -3,7 +3,7 @@
set -e
set -u

export PATH=/sbin:/bin:/usr/sbin:/usr/bin
export PATH='/usr/bin:/bin'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, some distros still install btrfs-progs to /{usr,}/sbin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that leads me to some more general question:

Right now ssh_filter_btrbk.sh tries to be compatible to all kinds of versions.

I think this makes things unnecessarily complex and lax (since we need to allow more patters/styles) that are actually used.

Would I get your blessing, if we changed the script, so that it effectively only accepts exactly those commands, from the current version (respectively git commit)?

In many scenarios, people will anyway use the same version of btrbk all over their systems.

And even if not, they could simply copy the specific one per case and use that.

Since that one can be selected on a per-SSH-key basis, full control would be possible, whilst greatly simplifying the script.

If you'd agree with that, we could make the PATH configurable via Makefile, or simply leave it the duty of the distribution maintainer, to override if still using old sbin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and also:

I guess most distros will switch to /usr/bin, simply because btrfs can be run as a user, too.

And we don't really need to care too much on ancient versions of some distros:

Those would anyway still ship the old ssh_filter_btrbk.sh, so people could simply use that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess most distros will switch to /usr/bin, simply because btrfs can be run as a user, too.

I hope so too, but these things usually take quite a while to propagate through all distros.

Probably the "correct" way would be not to restrict PATH at all, as this should be done in a layer below (i.e. sshd_config). Note that as some exotic distros like NixOS install executables outside /usr/bin.

We need to find a good balance between security and convenience here. As long as there are distros out there installing btrfs to /sbin/ (notably Gentoo), we need to keep both.

If you'd agree with that, we could make the PATH configurable via Makefile, or simply leave it the duty of the distribution maintainer, to override if still using old sbin.

This is asking for trouble. Are you proposing something like "try to find out where third-party software (btrfs, ...) was installed, and patch a variable according to it" ?

@@ -42,7 +42,7 @@ reject_and_die()
local reason="$1"
log_cmd 'auth.err' 'btrbk REJECT' "$reason"
printf 'ERROR: ssh_filter_btrbk.sh: ssh command rejected: %s: %s\n' "$reason" "$SSH_ORIGINAL_COMMAND" 1>&2
exit 255
exit 1
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentionally set to 255, so that btrbk can distinguish between ssh errors and exit codes from the command called by ssh:
https://github.com/digint/btrbk/blob/v0.32.5/btrbk#L994

see ssh(1):

EXIT STATUS
     ssh exits with the exit status of the remote command or with 255 if an
     error occurred.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm? I don't quite understand that. You say you want it to differ between ssh and ssh_filter_btrbk.sh errors, but with 255 that’s exactly what wouldn’t work, cause then the program returns that on an errror... as would ssh.

E.g. on an config file error:

$ ssh -o foobar example.org true
command-line line 0: no argument after keyword "foobar"
$ echo $?
255
$ 

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I want to differ between ssh errors and remote command errors (while regarding errors from ssh_filter_btrbk.sh as ssh errors).
I need to know that if e.g. btrfs receive fails, this is not due to some failure of ssh/ssh_filter_btrbk.sh, but is a failure of the btrfs receive command.

ssh_filter_btrbk.sh Show resolved Hide resolved
ssh_filter_btrbk.sh Show resolved Hide resolved

local pathname="$1"

printf '%s' "${pathname}" | sed -E 's%^///+%/%; s%(.)//+%\1/%g; s%/+$%%'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might break things. This function should normalize the paths exactly the same way as btrbk does: I believe in some versions btrbk printed // in some weird situations, this might match your implementation but also might not.
I don't think we should sanitize the paths too much, after all it's the user specifying it, I find it convenient to let the user specify whatever he wants here (especially regex).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to fix those places in btrbk instead? btrbk should never issue leading //, cause that's dangerous depending on the OS.

Especially the regexp thingy I find dangerous... it's not documented as that, and easily breaks things. The tool should rather not support all possible weird scenarios, but therefore restrict for 100% sure in all cases for any normal scenario.

@@ -23,6 +23,21 @@ file_match_sane='/[0-9a-zA-Z_@+./-]*' # matches file path (equal to ${file_match
file_match="/[^']*" # btrbk >= 0.32.0 quotes file arguments: match all but single quote
file_arg_match="('${file_match}'|${file_match_sane})" # support btrbk < 0.32.0

is_pathname_absolute()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment, I don't think we should contstrain too much here. This just makes the code more complex, and implies that // has some special meaning for btrbk, which it does not.
A simple check for leading slash might make some sense for crazy users trying to specify relative paths for whatever reason. Even then, the error is just shifted to here, without that check it will fail saying that the path does not match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not have in btrbk, but it may in principle have in the OS... so a OS could choose to handle // not like / but e.g. like and object identifier.

Then all the constraints we set up to make things secure, wouldn't work anymore. Thus I think it's better to simply forbid // at all. In practise no one should anyway ever set that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly btrbk does exactly this in some cases (which are not uncommon), see fix: 142fa6c

This means we should keep supporting leading //, at least before next major release.

Unless it really imposes a security risk, but: I don't know of any linux (!) OS where // is treated in a special way, but maybe you know better. OS other that linux will not install ssh_filter_btrbk.sh, as btrfs is only available on linux as far as I know.

@digint
Copy link
Owner

digint commented Dec 2, 2022

Is there a list of all possible exact commands, that btrbk uses (especially in the current version)[0]?

No. Closest would be the list in ssh_filter_btrbk(1).
Not all features are supported, like btrbk extents. I think it's completely ok to support only the "core" features here.

Would you mind if we drop the functionality of having more than one -p option? Or why is that even necessary in the first place? Wouldn't a user normally need just one pathname, and all his stuff goes below that? And even if there is a use-case, couldn't that be done by the user using more than one SSH key, and via that, selecting different -p directories?

Yes, it's very common to send to multiple paths, e.g. one host is only allowed to send to "/backup/my-host" and "/backup/my-alias", but not "/backup/my-other-secret-host".

I ask, because right now, the path is blindly added into the pattern (even if it contains meta-characters). If a user would use a very stupid path like -p ')))|.*|(((' any command would be accepted.

See my comments in the review, for me it's ok to allow any regex for the , it's the users static configuration after all.

Allowing multiple paths, requires me to add some ERE quoting, which is actually much more tricky than one might think.

Nooo, some ERE quoting would make it completely unreadable. So either leave it as it is (sleek, straight-forward), or find another solution to check the paths (e.g. loop through list, probably adding much complexity as well).

With respect to [0]. I'm not so much in favour of keeping the backwards compatible stuff in the code. Makes it just more complex. Wouldn't it be enough to refer users of old btrbk simply to using an old (compatible) ssh_filter_btrbk.sh?

There are valid use cases to run same script against multiple versions of btrbk, we have to support older versions as well: it's not uncommon to have several clients with different versions of btrbk connecting to same ssh endpoint.

I don't want to keep compatibility forever, but at least it should be compatible to "current" btrbk versions of major distros.

@calestyo
Copy link
Contributor Author

calestyo commented Dec 2, 2022

I'll reply to the rest later, need to catch a train.

What I'm trying to say: This script has already grown too complex for a simple "allow"-script, and I'm not sure if I should still recommend it. Every line of code here adds possible exploits, and restricting the users capabilities by either btrfs-progs-btrbk or btrfs-progs-sudo is enough for most users (e.g. restrict backup server to read-only progs, or restrict client to write-only).

My idea would be to completely revise the current matching/rejection handling.

It's IMO far too complex and thus error prone, to have the overall regexp assembled like this.

My idea is more that there should be one "line" that sets up the regexp for exactly one command... and this also in much stricter fashion, e.g. not allowing any option patterns (like it's done now via option_match) but really just exactly those options that are needed.

Which is also why it would help, to have a list of what may actually occur.

Maybe we could have a phone conference at some point, would perhaps be easier to discuss the more general things.

@digint
Copy link
Owner

digint commented Dec 3, 2022

My idea would be to completely revise the current matching/rejection handling.
It's IMO far too complex and thus error prone, to have the overall regexp assembled like this.

I agree, the regex part is horrible, and has grown along with btrbk. Note that there are quite some subtleties in it, btrbk can assemble pretty complex command structures.

My idea is more that there should be one "line" that sets up the regexp for exactly one command

Which we basically have already with allow_cmd

and this also in much stricter fashion, e.g. not allowing any option patterns (like it's done now > via option_match) but really just exactly those options that are needed.

This is a tradeoff between complexity / maintainability / compatibility and security. As long as the allowed command can not do harm on the filtered path (even with all options mis-used), it should be fine. For others, the options should be well defined, but then tends to get forgotten on a quick fixes.

Which is also why it would help, to have a list of what may actually occur.

Well, the list is in the code ;-)
The goal was, or should be that ssh_filter_btrbk.sh can be read as such a list.

Copy link
Owner

@digint digint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some minor findings

ssh_filter_btrbk.sh Outdated Show resolved Hide resolved
ssh_filter_btrbk.sh Outdated Show resolved Hide resolved
@digint
Copy link
Owner

digint commented Dec 3, 2022

I pushed improve-ssh_filter_btrbk.sh branch with the commits which I can merge to master for 0.32.6.
Please have a quick look at it, I will have to run some basic tests as well before merging.

The issue left then is how to improve the handling of path name restrictions, without too much incompatibilities.

@calestyo
Copy link
Contributor Author

calestyo commented Dec 3, 2022

I'm not sure how much I can do this weekend... and I haven't had time yet to look at all your comments... (will do later).

Also as said before, I do have a pretty concrete "vision" of how it should look in the end, which depends however a bit on some general questions,... depending on that, some of the already existing commits might even change again.

@calestyo
Copy link
Contributor Author

calestyo commented Dec 4, 2022

I've resolved a few obvious things above,... will answer on the remaining things later.

Previously, pathnames specified via the `--restrict-path`-option had only a
single trailing `/`, if any, stripped.

This commit adds (and utilises) a function which normalises pathnames as
described in its comments.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
This commit adds a function which checks whether a pathname is absolute and
rejects and values to the `--restrict-path`-option which are not.

The idea here is mostly a safeguard for users to prevent accidentally specified
non-absolute pathnames, which would be taken relative to the executing user’s
home-directory.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
@digint
Copy link
Owner

digint commented Dec 27, 2022

Sorry for the delay, I was super busy lately.

I think we can now:

  1. merge everything except the pathname normalization to master, which will go into the next bugfix release v0.32.6
  2. create a separate merge request with the rest, which will break compatibility, along with my fix which removes // (142fa6c). This will then go to v0.33.0.

If you don't yell "stop the presses", I will proceed with (1) and cherry-pick the commits, github should be smart enough to cope with the remaining commits on this MR.

@digint
Copy link
Owner

digint commented Mar 25, 2023

Merged to master (v0.32.6), except:

  • pathname normalization
  • exit code changes

The rest will get merged after more testing (did not get around to it yet...)

@digint digint force-pushed the master branch 5 times, most recently from d10103d to 594fc6d Compare August 5, 2023 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants