Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BF: vulnerability by command line parsing (considers valid windows escape sequences) #307

Closed
wants to merge 2 commits into from

Conversation

sebres
Copy link
Contributor

@sebres sebres commented Mar 22, 2023

This PR provides a fix for certain vulnerability by command line parsing (considers valid windows escape sequences), like backslashes followed by quote-char as well as quote triples).

Windows command line processor (as well as programs and standard libraries constructing command line from arguments) use special escape and quoting sequences, so...

the proper scenario to parse command line looks like that...
  • arguments are separated by spaces or tabs
  • quotes serve as optional argument delimiters
    "a b" --> a b
  • escaped quotes must be converted back to "
    \" --> "
  • consecutive backslashes preceding a quote (inclusive closing quote for a parameter with spaces) see their number halved with
    the remainder escaping the quote:
    2n backslashes + quote --> n backslashes + quote as an argument delimiter
    2n+1 backslashes + quote --> n backslashes + literal quote
  • backslashes that are not followed by a quote are copied literally:
    a\b --> a\b
    a\\b --> a\\b
  • in quoted strings, consecutive quotes see their number divided by three
    with the remainder modulo 3 deciding whether to close the string or not.
    Note that the opening quote must be counted in the consecutive quotes,
    that's the (1+) below:
    (1+) 3n quotes -> n quotes
    (1+) 3n+1 quotes -> n quotes plus closes the quoted string
    (1+) 3n+2 quotes -> n+1 quotes plus closes the quoted string
  • in unquoted strings, the first quote opens the quoted string and the
    remaining consecutive quotes follow the above rule.
this causes following behavior by parsing of command line...
cmd line arg1 arg2
abc" "def abc def
abc\" \"def abc" "def
"abc\" \"def" abc" "def
"abc"" ""def" abc" "def
abc"" ""def abc def
abc""" """def abc" "def
abc\""" \"""def abc" "def
abc\\""" \\"""def abc\" \"def
"abc"""def" ghi" abc"def ghi
"abc"""def" ghi
* missing close qoute at end
abc"def ghi
"abc""def" ghi abc"def ghi
"abc\"""def" ghi abc""def ghi
"abc\\"""def" ghi" abc\"def ghi
"abc\\\"""def" ghi abc\""def ghi
"abc\\\\"""def" ghi" abc\\"def ghi

I know Windows doesn't allow quote-char in file or folder names, but command line could be built by some automation with a foreign user input and if it becomes validated using nominal condition of Windows parser, but hereafter processed in 7zip, as a consequence it could cause:

  • misbehavior since the arguments can be parsed incorrectly (compared to default behavior of Windows parser) in the way that parts of quoted arguments may become unquoted and vice versa, as well as swim between different args;
  • an attacker may create artificial command line that pass validation rules of nominal condition of Windows, but vulnerable here (inject, data steal, etc), because the arguments would deviate between validation and execution phase;
  • (e. g. upon some future enhancements) special tokens like pipe |, ampersand & or redirecting tokens like > that normally included in quoted string (and validated as a string) could abrupt get different meaning and used for piping, redirecting etc, that beside the injection possibility opens still worse attacking vector that can even cause RCE or used to create persistent exploits.

Alternatively one could use windows API CommandLineToArgvW, but this would imply a modifying of SplitCommandLine interface as well as new dependency to shell32.dll.

With this fix SplitCommandLine works now very similar to CommandLineToArgvW (but doesn't require dependency to shell32 library).

… sequences, like backslashes followed by quote-char as well as quote triples) - SplitCommandLine works now very similar to CommandLineToArgvW (but doesn't require dependency to shell32)
…ed copying (don't gather remaining string), don't build args byte by byte
@mcmilk
Copy link
Owner

mcmilk commented Mar 22, 2023

Thank you, pull request rebased and integrated.

@mcmilk mcmilk closed this Mar 22, 2023
@sebres sebres deleted the vulni-cmdline-parse branch March 22, 2023 16:53
@sebres
Copy link
Contributor Author

sebres commented Mar 22, 2023

Thx for review and merge.
May I ask about the reason of rebase? (was something wrong in the commit comments?)

@mcmilk
Copy link
Owner

mcmilk commented Mar 23, 2023

The comments were okay, but the line lengths of them are 65 ...
I also add signed-off ... I should maybe put some skeleton things into .github

@sebres
Copy link
Contributor Author

sebres commented Mar 23, 2023

OK, understand.

By the way, by fixing of that here I thought have been finding similar vulni on ReactOS and wine, just...

It looks like in case of ReactOS and wine (for the function CommandLineToArgvW), one can say there is nothing and it works as expected, but in case of 7-zip, I'm unsure a bit.
Script langs like python, tcl and co follow the same mechanisms like it is currently implemented here, but other programs using CommandLineToArgvW or main would do that differently.

Here is small table illustrating the issue...
command line
(arguments after exe)
CommandLineToArgvW / mainpython, tcl, etc
CA1A2A3CA1A2A3
three"""quotes next2three"quotesnext 1three"quotes next
four"""" quotes" next 4%3=13four" quotesnext4%3=1 2four"quotes next 4%3=1
five"""""quotes next2five"quotesnext 1five""quotes next
seven""""""" quotes" next 7%3=13seven"" quotesnext7%3=1 3seven""" quotesnext7%3=1
twelve""""""""""""quotes next2twelve""""quotesnext 2twelve"""""quotesnext
thirteen""""""""""""" quotes" next 13%3=13thirteen"""" quotesnext13%3=1 3thirteen"""""" quotesnext13%3=1
"two""quotes next2two"quotesnext 1two"quotes next
"two"" next2two"next 1two" next
"three""" quotes" next 4%3=13three" quotesnext4%3=1 2three"quotes next 4%3=1
"four""""quotes next2four"quotesnext 1four""quotes next
"six"""""" quotes" next 7%3=13six"" quotesnext7%3=1 3six""" quotesnext7%3=1
"eleven"""""""""""quotes next2eleven""""quotesnext 2eleven"""""quotesnext |
"twelve"""""""""""" quotes" next 13%3=13twelve"""" quotesnext13%3=1 3twelve"""""" quotesnext13%3=1
"the crazy \\"""\\" quotes2the crazy \"\quotes 1the crazy \"\ quotes

The way how the parsing is implemented in this PR and merged showing the right part of table (like script langs doing it).
If it should be rather like a left part of table (like native windows API / std libs doing it) one'd need to apply this patch:

 CPP/Common/CommandLineParser.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CPP/Common/CommandLineParser.cpp b/CPP/Common/CommandLineParser.cpp
index d07f0f1..6bea75b 100644
--- a/CPP/Common/CommandLineParser.cpp
+++ b/CPP/Common/CommandLineParser.cpp
@@ -51,7 +51,7 @@ static const wchar_t * _SplitCommandLine(const wchar_t* s, UString &dest)
           if (++qcount == 3)
           {
             dest += L'"';
-            qcount = 1;
+            qcount = 0;
           }
         }
         f = s;

Just I'm still unsure which strategy is better/safer, probably must investigate in source code history of python/tcl (at least this two) or interview those community (or code originators).

@mcmilk
Copy link
Owner

mcmilk commented Mar 29, 2023

@sebres - should I release a new version with that fix?
What do you think about this?

@sebres
Copy link
Contributor Author

sebres commented Mar 29, 2023

I'm still unsure about the further amend (with qcount = 0):
Although it is very improbable that someone would use triple-quotes as escape sequence for whatever invocation (it is rather \" sequence, which would work correct now), it could be used anyway (for example to avoid insertion of unclosed quote-char by unpaired quotes in the argument). And this remains a certain inconsistence between default windows notation (if used for validation) and current implementation... Anyway it's important to have proper solution from scratch (first release implementing new parsing rules).
I had no time yet to inspect all the langs (source code history, tests, etc), why those parsing is implemented differently to windows, but at the moment it looks rather like this were historic inconsistence, not an intention. But I'd like to fulfill my research.
Can you wait few days with the release?

sebres added a commit to sebres/7-Zip-zstd that referenced this pull request Mar 30, 2023
compatibility to CommandLineToArgvW or stdlib's main argument parsing
@sebres
Copy link
Contributor Author

sebres commented Mar 30, 2023

OK, regarding different behavior of script-langs, it looks indeed like a historic mistake.
In tcl it was already fixed in newest versions. For python I'd provide a PR later (as well as check whether some other langs are also affected in trunk/mainline).
Thus the correct decision would be to amend the fix as described above (provided in #310).

mcmilk added a commit that referenced this pull request Mar 30, 2023
amend to #307, compatibility to CommandLineToArgvW or stdlib's main argument parsing
mcmilk pushed a commit that referenced this pull request Mar 30, 2023
compatibility to CommandLineToArgvW or stdlib's main argument parsing

Signed-off-by: Sergey G. Brester <serg.brester@sebres.de>
sebres added a commit to sebres/7-Zip-zstd that referenced this pull request Apr 6, 2023
…of backslash and quotes);

fixes mcmilkgh-312 for new command-line parser algorithm (mcmilkgh-307, mcmilkgh-310)
mcmilk pushed a commit that referenced this pull request Apr 14, 2023
- escaping of backslash and quotes
- fixes gh-312 for new cmdline parser algorithm (gh-307, gh-310)

Signed-off-by: Sergey G. Brester <info@sebres.de>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
mcmilk pushed a commit that referenced this pull request Apr 14, 2023
- escaping of backslash and quotes
- fixes gh-312 for new cmdline parser algorithm (gh-307, gh-310)

Signed-off-by: Sergey G. Brester <info@sebres.de>
Reviewed-by: Tino Reichardt <milky-7zip@mcmilk.de>
sebres added a commit to sebres/7-Zip-zstd that referenced this pull request Sep 7, 2023
…of backslash and quotes);

fixes mcmilkgh-312 for new command-line parser algorithm (mcmilkgh-307, mcmilkgh-310)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants