Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Pre-RFC] Argument Escape Migration (Start-Process, &, and the normal cmd starting; v7.1 / v7.2) #13089

Closed
Artoria2e5 opened this issue Jul 3, 2020 · 18 comments
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-No Activity Issue has had no activity for 6 months or more WG-Engine core PowerShell engine, interpreter, and runtime

Comments

@Artoria2e5
Copy link

Artoria2e5 commented Jul 3, 2020

Summary of the new feature/enhancement

This proposal seeks to come up with a plan for getting #1995 (argv-passing) right without having to call out to the system shell. A two-step migration plan is proposed:

  • For powershell v7.1, we seek to make a proper -ArgumentList available (in the sense of escaping like System.Diagnostics.Process does) by adding a switch to Start-Process that defaults to false.
  • For powershell v7.2, we make the correct behavior the default by:
    • making the new Start-Process switch default to $true
    • making escaping the behavior for & and normal external command calls
    • bumping whatever protocol version is affected

The goal is to make sure that PowerShell pass the argv as-is to most programs on Windows and all programs on Unix. In other words, for any given string object $x, the result for the two commands are expected to be the same:

Write-Output $x
/bin/printf "%s\n" $x

Win32 programs that use custom cmdline parsing will still be usable with the --% marker as well as our new switch. Unix programs always use the MSVCRT parsing rule under-the-hood in CoreFX, so they won't be negatively affected at all. For Unix, --% can be seen as a way to... poke at the internals.

After all, pwsh is supposed to be a cross-platform shell, so Windows-specific idiosyncrasies should not be part of the basic interface.

Proposed technical implementation details

As some unspecified utility class, we duplicate the PasteArguments class from CoreFX runtime. This is needed because some our logging code paths use startupInfo.Arguments already. It makes more sense to log the stuff escaped like how we invoke it than to log it wrong or using some other custom escape method.

For 7.1 and 7.2, we add a bool? EscapeArgs (name to be discussed) member to the StartProcessCommand class. This flag is to be checked in the existing if (ArgumentList != null) branch. When we decide that we shall escape the args according to the fallback rules in the first section, we do it ourselves using the new utility class.

For 7.2 we change NativeCommandParameterBinder::AppendOneNativeArgument to use the new class. (The existing code tries to do escaping, but it somehow fails. NeedQuotes is miserably wrong.)

I don't have time to check Invoke-Expression and Invoke-Command yet. I guess they are handled by the latter change?

Examples

The change to quoting behavior mainly affects strings with embedded whitespace and double quotes. Consider the snippet below:

$x = 'a" b" c d'
node -e "console.log(process.argv.splice(1).join('|'))" -- "$x"

In current powershell, the string printed would be a|b c d, a result that can only be understood by one with an idea about how MSVCRT parses arguments and how we decide what to wrap quotes around for. In the proposed version, the output would be the a" b" c d, exactly as fed into the command (as Write-Output would show).

The linked PR, #13099, also changes how empty parameters are handled. They were previously omitted on the command-line, but now they are included correctly as empty native arguments. As a result, unassigned variables used will shift the parameter position by one. To avoid this, initalize it to an empty list.

# add $cpflag = @()
if ($shouldReflink) {
  $cpflag = '--reflink=auto'
}
cp -a $cpflag $source $dest

#13099 also enables globbing on ALL bare words. As a result, constructs like a` b* is now subject to globbing. If globbing is undesired, you should be quoting the string. It is expected that few people will rely on the old behavior, as quoted strings are easier to use anyways. A separate proposal (#13098) will address the ergonomics of needing to manually backtick-escaping everything.

A lot of old code is expected to rely on the old example 2 behavior, since that's how POSIX sh works with unquoted expansions. More complex frameworks may do its own quoting, coming into conflict with example 1. An experimental feature switch will be added that covers all three.

@Artoria2e5 Artoria2e5 added the Issue-Enhancement the issue is more of a feature request than a bug label Jul 3, 2020
@iSazonov

This comment has been minimized.

@iSazonov iSazonov added the WG-Engine core PowerShell engine, interpreter, and runtime label Jul 3, 2020
@Artoria2e5
Copy link
Author

Artoria2e5 commented Jul 3, 2020

Since this will be a public interface change, some protocol version bump will needed. I dont really get how the whole versioning thing works, but if we can keep the 7 while bumping the protocol version that will be great too.

Ah shit, there's a dedicated repo for RFCs and I didnt see that. Ooooops. Call this a pre-RFC then.

@Artoria2e5 Artoria2e5 changed the title [RFC] Argument Escape Migration (Start-Process, &, and the normal cmd starting; v7.1 / 8) [RFC] Argument Escape Migration (Start-Process, &, and the normal cmd starting; v7.1 / v7.2) Jul 3, 2020
@Artoria2e5 Artoria2e5 changed the title [RFC] Argument Escape Migration (Start-Process, &, and the normal cmd starting; v7.1 / v7.2) [Pre-RFC] Argument Escape Migration (Start-Process, &, and the normal cmd starting; v7.1 / v7.2) Jul 3, 2020
Artoria2e5 added a commit to Artoria2e5/PowerShell that referenced this issue Jul 4, 2020
* do what PowerShell#13089 says. consolidate and fix cmdline escaping.
* an empty string check. native commands can accept empty arguments!
* attempt wildcard by spaces. now we do it by bareword.
@daxian-dbw
Copy link
Member

Ah shit, there's a dedicated repo for RFCs and I didnt see that. Ooooops. Call this a pre-RFC then.

@Artoria2e5 No you are doing the right thing. Open an issue for discussion is the first step before moving to draft an RFC.
I suggest you to include more examples to demonstrate your proposal.

@TSlivede
Copy link

TSlivede commented Jul 5, 2020

duplicate the PasteArguments class from CoreFX

Just duplicating would probably not be enough. For some edgecases you could probably read my old RFC and the corresponding discussion starting at PowerShell/PowerShell-RFC#90 (comment)

@TSlivede
Copy link

TSlivede commented Jul 5, 2020

Specifically regarding the behavior of Start-Process there is also #5576 posted by @mklement0

And while I do like the Idea of using Start-Process instead of a new operator (suggested here: #13068 (comment)) it would require to make some other features available to Start-Process: AFAIK Start-Process can currently not redirect the out streams of the executable to the next command in the pipeline. And there are probably more differences between Start-Process and the call operator &...

@Artoria2e5
Copy link
Author

Artoria2e5 commented Jul 5, 2020

Wait, there's a three-year-old RFC for this?

Just duplicating would probably not be enough. For some edgecases you could probably read my old RFC and the corresponding discussion starting at PowerShell/PowerShell-RFC#90 (comment)

I want to keep stuff... minimal. My plan is that rogue programs will eventually only be served by --%, for which people will write modules around. It should get documented after all. There is the problem of being unable to pass a literal %, but people should be able to make up their own env var for it. --= is nice and shouldn't be too hard to add.

I cannot think of a good interface for reusable escaping. Doing cmd would take either a very complicated version of the normal rules that keeps tracks of cmd's view of quoting or a non-standard one with the undocumeted "". And top it with the usual /c /k /s "verbatim cmdline here" stuff.

I kind of like the idea of preserving quoting structure, but then a bareword with trailing backslashes will always throw it off. Issue 13098 can however use some inspection if I can figure out what the AST looks like with 'a 'b vs a' b'...

@mklement0
Copy link
Contributor

mklement0 commented Jul 5, 2020

@TSlivede

And while I do like the Idea of using Start-Process instead of a new operator

This is not an either-or scenario: both direct/&-based invocation and Start-Process must be fixed - they serve distinct purposes.

If I understand @Artoria2e5's proposal correctly, the aim with respect to direct/&-based invocation is to fix them in a breaking manner.

However, my understanding is that there is still a fundamental unwillingness to ever take such a big breaking change.
Correct, @SteveL-MSFT?

With this in mind:

Fixing the original features is undoubtedly preferable in an ideal world, but with the commitment to backward compatibility that isn't an option.

@Artoria2e5
Copy link
Author

Artoria2e5 commented Jul 6, 2020

The RFC has an idea about a feature switch variable for controlling the behavior. It should be a good option for maintaining backward compatibility too if we make it scopable.

@mklement0
Copy link
Contributor

@Artoria2e5

The RFC proposes a preference variable as an opt-out mechanism - to opt out of the new, correct behavior, in order to get the old, broken behavior.

A new operator, as awkward as that is, avoids these problems.

Artoria2e5 added a commit to Artoria2e5/PowerShell that referenced this issue Aug 19, 2020
* do what PowerShell#13089 says. consolidate and fix cmdline escaping.
* an empty string check. native commands can accept empty arguments!
* attempt wildcard by spaces. now we do it by bareword.
@jeff-hykin
Copy link

jeff-hykin commented May 9, 2021

For anyone looking for an essentially bulletproof workaround to this issue (that adjusts itself depending on the powershell version.

Comparison

#
# system echo (linux)
#

# broken version
& '/bin/echo' 'hello "world"' 'I am free "finally don''t need to worry about escaping"'       
# >>> hello world I am free finally don't need to worry about escaping

# fixed version
execute '/bin/echo' 'hello "world"' 'I am free "finally don''t need to worry about escaping"' 
# >>> hello "world" I am free "finally don't need to worry about escaping"


#
# powershell's echo
#

# default behavior
& 'echo' 'hello "world"' 'I am free "finally don''t need to worry about escaping"' 
# >>> hello "world"
# >>> I am free "finally don't need to worry about escaping"

# backwards compatible behavior (doesn't double-escape)
execute 'echo' 'hello "world"' 'I am free "finally don''t need to worry about escaping"' 
# >>> hello "world"
# >>> I am free "finally don't need to worry about escaping"

Code

# example: execute 'echo' 'hello "world"' 'I am free "finally don''t need to worry about escaping"' 
# fixes: https://github.com/PowerShell/PowerShell/issues/3049
function execute {
    # 
    # argument processing
    # 
    $all_args = $PsBoundParameters.Values + $args
    
    # 
    # escape all of them
    # 
    $escaped_arg_string = "&"
    $partly_escaped_arg = "&"
    $is_first_arg = $true
    ForEach ($item in $args) {
        # single quotes escape everything... everything except 
        $escaped_item = $item
        $escaped_item = ($escaped_item -replace "'", "''")
        $escaped_twice_item = ($escaped_item -replace '"', '""')
        
        # the command name/path is handled different
        if ($is_first_arg) {
            $is_first_arg = $false
            # powershell doens't escape the double quotes of the command name
            # so we dont double escape them this one time
            $escaped_twice_item = $escaped_item
        }
        $escaped_arg_string = $escaped_arg_string + " '" + $escaped_twice_item + "'"
        # post-7.1
        $partly_escaped_arg = $partly_escaped_arg + " '" + $escaped_item + "'"
    }
    
    # all this because sometimes its an array, sometimes its not
    $results = (Get-Command -CommandType "All" -Name $args[0] -ErrorAction SilentlyContinue).Source
    if ($results -is [array]) {
        $results = $results[0]
    }
    $main_item_source = $results
    
    # all this because sometimes its an array, sometimes its not
    $results = (Get-Command -CommandType "Application" -Name $args[0] -ErrorAction SilentlyContinue).Source
    if ($results -is [array]) {
        $results = $results[0]
    }
    $main_application_source = $results
    
    # if on older version, note: https://github.com/PowerShell/PowerShell/issues/13089
    # and if it is calling an Application (not a script/function)
    if (($PSVersionTable.PSVersion) -lt ([version]"7.2.0") -and ($main_item_source -eq $main_application_source)) {
        # use the fix
        return (Invoke-Expression $escaped_arg_string)
    # if on newer version or not calling an Application
    } else {
        return (Invoke-Expression $partly_escaped_arg)
    }
}

This should always escape the arguments correctly, even when taking into account the planned 7.2 change here.

@mklement0
Copy link
Contributor

mklement0 commented May 9, 2021

@jeff-hykin, if you're looking for something prepackaged that works robustly and also incorporates the important accommodations on Windows proposed in #15143 that the new-in-PowerShell Core 7.2.0-preview.5 experimental PSNativeCommandArgumentPassing feature does not and seemingly will never include (though it does work robustly on Unix), consider the ie function from the Native module (authored by me).

@jeff-hykin
Copy link

@mklement0 Thanks for the info, that looks like a much better long term solution.

@DemiMarie
Copy link

DemiMarie commented May 12, 2021

Can we at least include @mklement0’s msiexec mitigation? If I understand it correctly, it is 100% compatible and requires no special-casing whatsoever still needs special-casing for certain commands 😞.

@mklement0
Copy link
Contributor

@DemiMarie, I suggest posting your comment in issue #15143, as that is where the official decision not to include such accommodations was announced.

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

2 similar comments
Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

@microsoft-github-policy-service microsoft-github-policy-service bot added Resolution-No Activity Issue has had no activity for 6 months or more labels Nov 16, 2023
Copy link
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-No Activity Issue has had no activity for 6 months or more WG-Engine core PowerShell engine, interpreter, and runtime
Projects
None yet
Development

No branches or pull requests

7 participants