Use ArgumentList when invoking native executables #14747
Comments
I suggest making the new behavior the default and using the PowerShell variable to opt out of it instead. Especially with previews, this will allow people to discover whether they need to adapt their code, or to give feedback that would allow this new feature to be tweaked. |
Should this not go as an experimental feature as opposed to using an environmental variable to control it? |
How often would that break existing code? |
I think the right approach with respect to enabling the new behavior is:
|
Update: See #15143 for the most current version of this proposal; however, the post below additionally provides some background information on parameter-passing on Windows. As for the proposal:
Unfortunately, things aren't quite so simple on Windows, due to launched processes receiving a command line that they themselves must parse. This very unfortunate design forces processes to take on part of a job that should be exclusively a shell's responsibility: parsing this command line into an array of verbatim arguments. Aside from placing undue burden on non-shell processes, it opens the door to individual processes interpreting their command line however they want: While there is a widely adhered-to convention for how to encode multiple arguments in such a process command line - namely the one used by Microsoft's C / C++ compiler and the CLR (which I'll call the C convention here) - no single encoding is guaranteed to work with all programs:
It's impossible for PowerShell to fully solve these problems, but it makes sense to make accommodations for these exceptions, so as to make the vast majority of calls just work. For the remaining edge cases, there is:
In concrete terms, this means for the new behavior:
After PowerShell's own parsing, once the array of verbatim arguments - stripped of
Again, these are reasonable accommodations to make, which:
I invite everyone to scrutinize these accommodations to see if they're complete, overzealous, ... This is a chance to finally cure all native quoting / argument-passing headaches - even if only by opt-in. (To experiment with the proposed behaviors up front (based on my personal implementation that sits on top of the current behavior), you can use |
It should be noted that adding further preference variables that alter runtime behavior is not without risk. With each one added, the amount of ceremony required to guard against inherited preferences increases considerably. I'm not weighing in on whether it's the right call in this scenario; I just want to make sure that's considered. |
Yes, that is a concern with all preference variables, and the lack of lexical scoping is surely a candidate for #6745 (a similar concern arose recently with respect to allowing opt-in to using
The - not exactly obvious - workaround is to use the
# Predefined preference variable:
#  - defaults to 'Legacy'
#  - is NOT defined with option 'AllScope'
$global:PSNativeArgumentPassing = 'Legacy'
& {
  # Override the global preference variable *for this scope only*.
  $private:PSNativeArgumentPassing = 'Standard'
  "In child scope: $PSNativeArgumentPassing"
  # Thanks to $private, descendant scopes again see the global variable.
  & {
    "In grandchild scope: $PSNativeArgumentPassing"
  }
}
I hope there's no question that:
In light of that, @SeeminglyScience: can you think of a different current-scope-only opt-in mechanism? |
It's a little less of an issue since most of the existing preference variables are only going to change behavior that is already leaking to the user. Like error preference, usually when that changes runtime behavior it's when you're emitting an error to the user. I do say "usually" there because it can for sure end up forcing a different code path, but doesn't change the meaning of your code. Also either way, the more that are added the harder it is to guard against.
Current scope isn't necessarily what you want either; too many things create a new scope where most users would not expect it. Parser-based lexical scoping does solve that problem, but creates a new one in the form of a new dialect, which is very expensive in the long term. I don't know of a way to make a change like this without one of these extra chunks of complexity or breaking changes. |
Thanks, @SeeminglyScience - sounds like a preference variable is our only - imperfect - realistic short-term option. (A quick aside: In practice, in terms of guiding users, this means:
|
A good experiment might be to try a whole lot of different build scripts that use various toolsets and see how often they break with the new setting. That will be a prime example of accidental inheritance with often complex executable arguments. Personally I'd still lean toward the stop parsing sigil being the way to go, |
Unfortunately,
I presume that's because you're a developer rather than a sysadmin or DevOps person. As the following can attest:
the issue at hand is a real, long-standing pain point. And the issue is an even bigger one on Unix, where - unlike on Windows - many capable native utilities exist, and where, for performance reasons and due to the lack of binary pipeline support, resorting to native utilities (including the native shell) is sometimes a must. It all comes back to this striking example (run on a Unix-like platform):
PS> /bin/echo '{ "foo" : 1 }'
{ foo : 1 } # !! double quotes were effectively stripped
This is such blatantly broken behavior that you can't help but wonder why this hasn't been fixed - even if only on an opt-in basis - in the 14+ years of PowerShell's existence.
|
I mean the new one being proposed in a different issue somewhere.
I'm a sysadmin.
I included that last part as a disclaimer on my opinion, not as a dismissal of the need. Put more casually and a bit exaggerated, it would be "I think it should be X but what do I know, I don't run into this problem". |
I assume you mean the aforementioned #13068 ("native operator") - its purpose is different, requires you to apply a different shell's syntax and, without using a (here-)string as enclosure, is subject to the same conceptual headaches as That said, a new call operator - to be used explicitly, in lieu of Finding the right sigil combination (I don't think a single character is an option), may be a challenge (
Kudos on the extraordinary depth of your programming knowledge (just to be very clear: I mean it).
Understood. I just wanted to complement that with a transpersonal perspective, to leave no doubt that many others do struggle with this. |
I made a mistake (since corrected, and it is correct in the The reason that
Again, what we should strive for is accommodations that:
|
There is another accommodation we need to make (the summary above now links here):
Example:
# A command line to pass to cmd.exe for execution.
# If executed correctly, the following should print verbatim:
#   Ready to move on [Y,N]?Y
$cmdLine = ' "C:\WINDOWS\system32\choice.exe" /d Y /t 0 /m "Ready to move on" '
# !! Despite the lack of behind-the-scenes escaping of the embedded double quotes,
# !! this currently works *as intended*, in both editions:
cmd /c $cmdLine
What PowerShell currently translates the list of arguments to behind the scenes and assigns to
cmd /c " "C:\WINDOWS\system32\choice.exe" /d Y /t 0 /m "Ready to move on" "
That is, the verbatim content of the string that PowerShell saw was blindly enclosed in overall
As it turns out, that's exactly what
As a courtesy, we could additionally do the following:
However, that breaks - through
# Breaks from both cmd.exe and PowerShell, because both the first and a subsequent argument are double-quoted.
C:[PS]> cmd /c "C:\Program Files\PowerShell\7\pwsh" -noprofile -c " 'hi there' "
'C:\Program' is not recognized as an internal or external command,
operable program or batch file.
This could be avoided if PowerShell - in the event that multiple arguments follow
That is, PowerShell could automatically transform the above into the following verbatim string assigned to
# OK - transformed to single-argument command line, which works robustly - outputs 'hi there'
cmd /c " "C:\Program Files\PowerShell\7\pwsh" -noprofile -c " 'hi there' " "
In fact, this is what the
# Without `ie`, this breaks.
PS> ie cmd /c 'C:\Program Files\PowerShell\7\pwsh' -noprofile -c " 'hi there' "
hi there
|
Ahh that explains why I don't run into this issue much. The few times that I need to invoke an executable with enough argument complication, I tend to build the whole command line as a single string and pass it to |
Addressed as part of |
This issue has been marked as fixed and has not had any activity for 1 day. It has been closed for housekeeping purposes. |
Handling Parameter Binding in Native Executables
PowerShell generally provides a useful experience when working with native executables, but there are a number of issues:
A command line such as:
msiexec /i testdb.msi INSTALLLEVEL=3 /l* msi.log COMPANYNAME="Acme ""Widgets"" and ""Gizmos."""
is received as:
msiexec /i testdb.msi INSTALLLEVEL=3 /l* msi.log COMPANYNAME="Acme Widgets and Gizmos."
which strips the embedded double quotes.
Current behavior does not allow empty strings to be passed, so the following is not possible:
useradd -g 501 -u 1001 -p '' sam
emacsclient -u ''
These behaviors should be supported as there are many scenarios where an empty string is required as a parameter value.
Handling Quotes
Windows and Non-Windows have divergent behavior with regard to quotes.
While both Windows and Unix shells recognize "one two three" as a single string, Windows does not recognize single quotes (') as a string designator.
Unix shells (and PowerShell) have 2 types of strings: expandable strings such as "$a", where $a is expanded to the value of the variable $a, and literal strings such as '$a', where the literal string $a is passed.
In the case of Windows, a string such as 'foo bar' is 2 tokens ('foo and bar'), whereas Unix will see a single string, foo bar.
Null vs Empty Strings
Most shells don't have the same definition of null that PowerShell does.
In bash, for example, commands are invoked via strings, so naturally, an empty string can be easily notated with '' or "".
Similarly, in CMD.EXE an empty string may be passed with "".
However, the following should not result in empty strings being added:
$null
$null
$null
Non-null elements of a collection will be bound:
@($null, 1, $null) will bind a single value (1) as a parameter value
@($null, 1, $null, 2, $null) will bind 2 values (1 and 2) as parameter values
@($null, 1, $null, '', 2) will bind 3 values (1, '', 2) as parameter values
Examples:
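The binding rule above (skip $null, keep empty strings) can be sketched as follows. This is only an illustration of the rule, not the actual parameter binder; the function name Get-BoundNativeArgument is hypothetical and does not exist in PowerShell.

```powershell
# Hypothetical helper (not part of PowerShell): returns the values that would be
# bound as native arguments under the rule described above.
function Get-BoundNativeArgument {
    param([object[]] $InputValue)

    foreach ($value in $InputValue) {
        # $null elements are skipped entirely ...
        if ($null -eq $value) { continue }
        # ... but empty strings are real values and are kept.
        [string] $value
    }
}

# Wrap the call in @() so that a single result is still counted as a collection.
@(Get-BoundNativeArgument @($null, 1, $null)).Count            # 1
@(Get-BoundNativeArgument @($null, 1, $null, 2, $null)).Count  # 2
@(Get-BoundNativeArgument @($null, 1, $null, '', 2)).Count     # 3
```

The explicit `$null -eq $value` comparison is deliberate: a plain truthiness test such as `if ($value)` would also discard the empty string, which is exactly the behavior this proposal wants to avoid.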
Additional Examples
Globbing Considerations
Globbing will need to be done where appropriate. This behavior is platform-dependent: on Windows systems, globbing is not performed, but it is provided on Linux and Mac systems. This is because most utilities on Windows do their own globbing, whereas on Linux and Mac globbing is done by the shell. The current globbing behavior already works this way and needs to remain unchanged to ensure that existing scenarios continue to work on each platform. When a glob fails, the string provided shall be sent to the application without alteration.
The following example shows the difference based on platform:
Unsupported or Requiring Alteration
Some elements may not be used as they represent PowerShell tokens.
; is not allowed unquoted. PowerShell will parse this as a statement separator. For example:
msiexec /p msipatch1.msp;msipatch2.msp /n {00000001-0002-0000-0000-624474736554} /qb
contains ;, which PowerShell will turn into 2 commands (; is a statement separator), so this string must be quoted.
To execute this command, quote the problematic strings or use --% after the executable.
after the executable.msiexec /p 'msipatch1.msp;msipatch2.msp' /n '{00000001-0002-0000-0000-624474736554}' /qb
or
msiexec --% /p msipatch1.msp;msipatch2.msp /n {00000001-0002-0000-0000-624474736554} /qb
NB: The behavior for supporting ScriptBlocks as strings could be supported via an allow-list for known executables. This may be needed because GUIDs (with braces) are used by a number of both Windows and non-Windows utilities. However, managing the list of utilities may be burdensome.
Improved Tracing
The parameter binding tracing code for native executables is not currently implemented, which makes debugging issues when executing native applications very difficult. Tracing for both the current parameter binding and the new behavior shall be provided. In the case of the old style, the path to the executable and the string which makes up the Arguments property of the StartInfo object shall be provided. For the new behavior, since the arguments are a list, each element of the list shall be presented.
The following transcript shows how the tracing shall appear.
Additional considerations for tracing
It may be desirable to add additional tracing which provides information on the parameters as they were provided.
The tracing above is created at the point where the StartInfo object is populated, and it may be useful to see the parameter before it is altered by globbing, etc.
Current Implementation
When PowerShell starts a new native process it takes all the arguments provided and attempts to stitch together the various parts into a single string (which is assigned to the Arguments property of the StartInfo object). This is done with some problematic behavior: empty strings ('') are explicitly stripped, and embedded quotes and spaces are "lost" and require additional escaping.
New Behavior
I think we can do better and reduce the effort and internal complexity when calling native applications. .NET has added a new property to the StartInfo object called ArgumentList, which allows you to provide the arguments to the command as a collection of strings, alleviating the need to stitch the arguments into a single string. We can take advantage of this new API to reduce the complexity of our code. However, we should maintain backward compatibility if we can, so rather than producing new, breaking behavior via an experimental feature, I suggest that we provide a new runtime behavior based on a PowerShell variable. This allows users to change the behavior without restarting the PowerShell process and can be used when desired.
By not changing the default behavior we can provide users an easy way to opt in to the new behavior. Telemetry can be added if desired to capture the count of how many times the current way is used in comparison with this new implementation.
NB: This proposal will actually increase the internal complexity of our code because we'll have 2 ways of calling native applications. Hopefully, this would be temporary and we can use the new APIs exclusively in the future and deprecate the current code.
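To make the API difference concrete, here is a minimal sketch that bypasses PowerShell's own native-command invocation and starts a process directly through System.Diagnostics.ProcessStartInfo, adding each argument to ArgumentList. The helper name Invoke-NativeVerbatim is hypothetical and is not how the proposal itself is implemented. The sketch assumes PowerShell 7 or later, since ArgumentList is only available on .NET Core 2.1 and newer.

```powershell
# Hypothetical helper (not part of PowerShell): starts an executable with
# arguments supplied one by one through ProcessStartInfo.ArgumentList.
function Invoke-NativeVerbatim {
    param(
        [Parameter(Mandatory)] [string] $FilePath,
        [string[]] $ArgumentList = @()
    )

    $psi = [System.Diagnostics.ProcessStartInfo]::new($FilePath)
    $psi.UseShellExecute = $false
    $psi.RedirectStandardOutput = $true

    foreach ($arg in $ArgumentList) {
        # Each element is passed as exactly one argument; .NET takes care of any
        # platform-specific encoding, so no manual quoting or escaping is needed.
        $psi.ArgumentList.Add($arg)
    }

    $process = [System.Diagnostics.Process]::Start($psi)
    try {
        $output = $process.StandardOutput.ReadToEnd()
        $process.WaitForExit()
        $output
    }
    finally {
        $process.Dispose()
    }
}
```

On a Unix-like platform, a call such as `Invoke-NativeVerbatim -FilePath /bin/echo -ArgumentList '{ "foo" : 1 }'` should return the JSON text with its embedded double quotes intact, because the string never passes through the single-string Arguments property.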
Tools used
The following utility is used to echo all passed parameters given in the examples above. This does not rely on the CLR runtime, but may be compiled for all platforms.
We can add this to the tools for build if needed (either by binaries checked in for each platform or build)
Related Links:
Pull Requests
ArgumentList
#14692 (the PR to implement the behaviors listed here)
Issues