The specification of the call syntax of the native PowerShell command line prompt is not consistent. #15893

Closed
5 tasks done
ArnoCan opened this issue Aug 9, 2021 · 9 comments
Labels
Resolution-Duplicate The issue is a duplicate.

Comments

@ArnoCan

ArnoCan commented Aug 9, 2021

Prerequisites

Steps to reproduce

This issue is technically related to #15888, #15889, and #15892.

The specifications and definitions are:

  1. The pure evaluation from the PowerShell prompt behaves as stated in about_Quoting_Rules and uses the syntax of the domain PowerShell-7.1 - B. Grammar

  2. The call of a PowerShell script behaves the same way.

  3. The call of an executable such as another PowerShell instance leaves the PowerShell syntax domain and passes the DOS/cmd.exe syntax domain before entering the PowerShell syntax domain again.

    PS> pwsh.exe        powershell-script.ps1  <command-line-parameters>
    PS> powershell.exe  powershell-script.ps1  <command-line-parameters>
    

    This is for example the case for:

    PS> powershell.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "\""a"
    ['"a', ]
    PS> .\pwsh.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "\""a"
    ['"a', ]
    PS>
    

    where the DOS in-string escape character backslash is successfully applied.

    The backslash is not specified by escaped-character in the grammar PowerShell-7.1 - B. Grammar.

    Same with Python:

    PS> C:\Python371\python.exe -c "import sys;print(sys.argv[1:])" "\""a"
    ['"a']
    PS>
    

    The latter leads me to the assumption that the initial execution is actually performed in a DOS/cmd.exe-like environment, and is thus escaped and quoted in accordance with the DOS/cmd.exe command line syntax rules. But I called a PowerShell executable at a PowerShell prompt, so I do not expect an intermediary DOS exec call with its native command line syntax, which has significant differences. Some cases could not even be realized due to the partially contradictory/incompatible specifications.
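As a cross-check that does not involve cmd.exe at all, Python's subprocess.list2cmdline helper (which quotes arguments following the Microsoft C runtime rules) can show how the argument "a would have to be spelled on a Windows command line. This is only a sketch to illustrate where the backslash may come from; it makes no claim about what PowerShell does internally.

```python
import subprocess

# The argument we want the child process to receive: a double quote, then "a".
wanted = '"a'

# list2cmdline builds a single command line string using the escaping
# convention of the Microsoft C runtime (backslash before a literal quote).
encoded = subprocess.list2cmdline([wanted])
print(encoded)  # prints \"a
```

The backslash shows up here as well, although no cmd.exe is involved, which suggests that it belongs to the argv parsing convention of the receiving process rather than to a shell.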

This makes it difficult to scan/parse a raw call string statically, because in each case the actual call chain has to be determined.

The command line call syntax should not change during the call process. At least it should be protected and/or transformed appropriately and passed transparently to the target executable.

The user should have to apply one command line syntax only - PowerShell.
The number of the evaluations of the input command line should be determined and fixed - at best 1x for each call-level.

Expected behavior

PS> powershell.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "`"a"
['"a', ]
PS> .\pwsh.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "`"a"
['"a', ]
PS> .\pwsh.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "\""a"
['\a']

Actual behavior

PS> powershell.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "`"a"
['a', ]
PS> .\pwsh.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "`"a"
['a', ]
PS> .\pwsh.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "\""a"
['"a']
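To see why the two spellings behave differently, it can help to look at the value PowerShell's own parser produces for each literal before anything is handed to the native layer. The following is a minimal Python sketch of the double-quoted string rules from about_Quoting_Rules; the function name ps_double_quoted_value is hypothetical, and variables, subexpressions and special backtick sequences such as `n are deliberately not modelled.

```python
def ps_double_quoted_value(literal):
    """Return the string value of a PowerShell double-quoted literal.

    Simplified model (an assumption, not PowerShell's real parser):
    - a backtick makes the next character literal,
    - a doubled double quote ("") yields one literal double quote,
    - everything else is copied unchanged.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("expected a double-quoted literal")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "`" and i + 1 < len(body):
            out.append(body[i + 1])      # backtick escape
            i += 2
        elif ch == '"':
            if i + 1 < len(body) and body[i + 1] == '"':
                out.append('"')          # "" inside the literal
                i += 2
            else:
                raise ValueError("unbalanced double quote")
        else:
            out.append(ch)               # includes the backslash
            i += 1
    return "".join(out)


print(ps_double_quoted_value('"\\""a"'))  # prints \"a
print(ps_double_quoted_value('"`"a"'))    # prints "a
```

Under these assumptions the first literal still carries a backslash into the native layer, while the second carries only a bare double quote, so any difference in the results has to come from the layers after PowerShell's parser.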

Error details

No response

Environment data

Name                           Value
----                           -----
PSVersion                      7.1.3
PSEdition                      Core
GitCommitId                    7.1.3
OS                             Microsoft Windows 10.0.19042
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Visuals

No response

@ArnoCan ArnoCan added the Needs-Triage The issue is new and needs to be triaged by a work group. label Aug 9, 2021
@vexx32
Collaborator

vexx32 commented Aug 9, 2021

I called a PowerShell executable at a PowerShell prompt, so I do not expect an intermediary DOS exec call with its native command line syntax, which has significant differences. Some cases could not even be realized due to the partially contradictory/incompatible specifications.

Is this a PowerShell issue or a Windows one? I don't know of any way to invoke a process on Windows while bypassing the typical command line syntax. 🤷

Also, looking at the other issues you filed, the problem you're seeing looks just about the same between all of them, with only slight differences in how it presents, so I'm not sure what you're hoping to achieve other than confusing folks by filing them all separately. 😕

@ArnoCan
Author

ArnoCan commented Aug 9, 2021

The main point of confusion is this input and its result:

PS> .\pwsh.exe -noprofile -executionpolicy bypass -file  X:\print_argv_list.ps1  "\""a"
['"a']

The backslash is not defined as an escape character by the PowerShell syntax specification (see escaped-character); only the backtick is.

I don't know what the internal call structure looks like, so I cannot say where in the call chain the issue lies.

The only thing I know here is that the backslash is defined by DOS/cmd.exe as an in-quote escape character. Because the caret does not work within double quotes.

C:\Python371>python.exe -c "import sys;print(sys.argv[1:])" "^"a"
['^a']

C:\Python371>python.exe -c "import sys;print(sys.argv[1:])" "\"a"
['"a']

C:\Python371>

Thus I assume that a DOS-based evaluation and execution is performed.

I tried some calls with DOS-based escaping, for example, but failed because PowerShell requires balanced quotes, so I was not able to pass all my trial strings through to the DOS-based call. That is why I had to add an extra double quote even in the above call.

"\"a" => "\""a"

Therefore I did not manage to pass e.g. the following string due to the non-balanced double quote error of PowerShell:

"^"a"

The alternative:

'^"a'

passed, but did not work as expected.

This is why I assume that a simple update of the specification may not suffice for all scenarios.

@237dmitry

Therefore I did not manage to pass e.g. the following string due to the non-balanced double quote error of PowerShell

 PS > pwsh -f .\1.ps1 '^"""a'
['^"a']

 PS > gc .\1.ps1
"['$args']"

@jborean93
Collaborator

jborean93 commented Aug 9, 2021

There are multiple levels you need to take into account that aren't specifically PowerShell related.

  • PowerShell's parser to create a string
  • PowerShell's logic when calling a native executable
    • This is the logic that PowerShell uses to convert the arguments that it sees when invoking a native binary
    • e.g. my.exe $arg1 "arg2" ...
    • It essentially reads each value as positional arguments and generates a list of strings to then pass to the native API
    • On Windows there is no native API to invoke an executable with a list of arguments so PowerShell has logic to convert that list to a single string
    • There are numerous issues around this Arguments for external executables aren't correctly escaped #1995
    • 7.2 is implementing some features that try to make this easier to deal with, so some of this logic is on shifting ground
    • In a general sense on Windows, PowerShell will enclose an argument in double quotes but it does not escape the inner double quotes if present
    • my.exe foo "bar hello" '{"foo": "bar"}' becomes the following command line value my.exe foo "bar hello" "{"foo": "bar"}"
  • The platform's logic when setting argv of the new process
  • The new process' logic when reading argv
    • A process could use the argv passed into it, or it could have its own rules and just read the command line string directly
    • There's nothing much PowerShell can do about this as the behaviour is entirely up to the binary being called
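The "wrap in double quotes but don't escape inner double quotes" step can be sketched in a few lines of Python. This is only a model of the behaviour described above for Windows, not PowerShell's actual implementation; the function name legacy_windows_command_line is made up for illustration.

```python
def legacy_windows_command_line(exe, args):
    """Join arguments into one command line string.

    Model of the behaviour described in this comment (an assumption):
    an argument containing whitespace is wrapped in double quotes,
    and double quotes inside the argument are NOT escaped.
    """
    parts = [exe]
    for arg in args:
        if any(ch.isspace() for ch in arg):
            parts.append('"' + arg + '"')
        else:
            parts.append(arg)
    return " ".join(parts)


# The values PowerShell's parser produced for: my.exe foo "bar hello" '{"foo": "bar"}'
print(legacy_windows_command_line("my.exe", ["foo", "bar hello", '{"foo": "bar"}']))
# prints: my.exe foo "bar hello" "{"foo": "bar"}"
```

The last argument ends up with nested, unescaped double quotes, which is what the receiving process then has to make sense of.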

Using my example above say you were to do the following in PowerShell (on Windows at least)

$nativeArgs = @('argument1', "argument 2")
$bar = "value 123"

my.exe $nativeArgs foo "bar $bar" '{"foo": "$bar"}'

What happens is the following:

  • PowerShell parses the line $nativeArgs ... and creates an array of strings based on the standard PowerShell quoting rules
  • PowerShell parses the line my.exe ... and sees it is calling a native binary and it has positional arguments
  • These positional arguments are read just like positional arguments to a cmdlet and are flattened down to a single array
    • $nativeArgs already contains an array of strings so no extra parsing there - becomes the args argument1, argument 2
    • foo is a bare positional argument and is treated as a string - becomes the arg foo
    • "bar $bar" is a double quoted value so is treated as a string and embedded vars are "templated" in - becomes the argument bar value 123
    • '{"foo": "$bar"}' is a string value inside single quotes so no variable templating applies - becomes the argument {"foo": "$bar"}
  • PowerShell takes the list of arguments and wraps them in double quotes if they contain whitespace (space/newline/etc)
    • argument1 has no space so is kept as argument1
    • argument 2 has a space so it's changed to "argument 2"
    • foo has no space so is kept as foo
    • bar value 123 has a space so it's changed to "bar value 123"
    • {"foo": "$bar"} has a space so it's changed to "{"foo": "$bar"}"
  • The list of arguments is joined together with spaces and passed to the C API
    • The command becomes "my.exe" argument1 "argument 2" foo "bar value 123" "{"foo": "$bar"}"
  • Now the native binary starts and goes to read the command line
  • In the first case (the process uses the argv passed into it), everything works fine until it gets to "{"foo": "$bar"}"
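    • The inner double quotes are consumed as delimiters, so the argument ultimately arrives as something like {foo: $bar} (and the space that is now unquoted can even split it in two), which is obviously not correct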
    • To fix this in PowerShell you would have to escape the inner double quotes manually and do my.exe $nativeArgs foo "bar $bar" '{\"foo\": \"$bar\"}'
    • Escaping for this uses \" as the escaping happens in this layer and not PowerShell
    • This is the kind of thing that 7.2 aims to fix
    • The key thing to note here is the escaping done here is not escaping for PowerShell but rather escaping for this layer
    • Essentially the escaping logic for PowerShell doesn't apply
  • In the second case (the process reads the raw command line with its own rules), there's no golden rule - it's entirely up to the underlying application

Ultimately there is no one size fits all and it's made complex by the many different layers that are involved


@ArnoCan
Author

ArnoCan commented Aug 10, 2021

@237dmitry
I tried to pass the string literal:

^"a

to the native layer with the expected literal result - from the native layer:

"a

Which may have a different call context than the call of a native PowerShell script - see also @jborean93.

Anyhow, this was just a simple trial to understand the instances of the call chain.

@ArnoCan
Author

ArnoCan commented Aug 10, 2021

@jborean93
Thanks for the comprehensive description. This is exactly the challenge.

My focus on the syntax specification in external code like Python is matched in particular by:

The new process' logic when reading argv

A process could use the argv passed into it, or it could have its own rules and just read the command line string directly

This is what I am scanning/lexing/tokenizing online and offline - argv and/or any complete or partial command line string. Therefore I require in particular the complete normative information about quoting and escaping.

Spoken in terms of this issue - the missing parts of the syntax specification:

  • The resulting syntax of native calls requires PowerShell-7.1 - B. Grammar plus the syntax specification of the native platform subprocess call environment. The latter seems basically not available.

  • The syntax of the native platform subprocess layer should at least be referenced by the PowerShell specification. See
    PowerShell-7.1 - C. References

@ArnoCan
Author

ArnoCan commented Aug 10, 2021

@vexx32
The issues are all related to command line processing and the call of external subprocesses, in particular Python-based ones.
But they actually target different issues:

The tasks themselves are a bit confusing, therefore I separated them in order to clearly distinguish and define the concrete issues case by case.

@rjmholt
Collaborator

rjmholt commented Aug 12, 2021

This issue has been treated pretty extensively in #1995.

There's also discussion in #13068.

Essentially, from #1995 (comment), the Windows process creation API requires the command line as a single string rather than an array, and then parses that with CommandLineToArgvW. This all means that PowerShell is forced to either make a breaking change in how it parses the command line or propagate bad transformations through the .NET and Windows APIs.

Given the extensive discussion of this in pre-existing issues, I'll mark this and the other issues as duplicates of #1995.

@rjmholt rjmholt added Resolution-Duplicate The issue is a duplicate. and removed Needs-Triage The issue is new and needs to be triaged by a work group. labels Aug 12, 2021
@ghost

ghost commented Aug 13, 2021

This issue has been marked as duplicate and has not had any activity for 1 day. It has been closed for housekeeping purposes.

@ghost ghost closed this as completed Aug 13, 2021