Piping Text To An External Program Appends A Trailing Newline #5974

Open
ThePieMonster opened this issue Jan 22, 2018 · 28 comments
Labels
Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif KeepOpen The bot will ignore these and not auto-close WG-Engine core PowerShell engine, interpreter, and runtime WG-NeedsReview Needs a review by the labeled Working Group

@ThePieMonster

Steps to reproduce

Run the below command in Linux, Command Prompt, and PowerShell and compare the output.

echo -n "string" | openssl dgst -sha256 -hmac "authcode"

I also attempted to change the encoding PowerShell was using to UTF-8 but that did not change the value of the returned hash.

[Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8

Expected behavior

Linux/CMD Response: (stdin)= 54ef1d2effbc663eb6dc84a49cc1600b30e79f2e1ff737b99cd96589842d50e9
PowerShell Response:(stdin)= 54ef1d2effbc663eb6dc84a49cc1600b30e79f2e1ff737b99cd96589842d50e9

Actual behavior

Linux/CMD Response: (stdin)= 54ef1d2effbc663eb6dc84a49cc1600b30e79f2e1ff737b99cd96589842d50e9
PowerShell Response:(stdin)= 08daf0944f91c2d904ef9f231c4e767067c9b795197c4fe46631aa78c7e9d0c4

Environment data

> $PSVersionTable
Name                           Value                                                                                       
----                           -----                                                                                       
PSVersion                      5.1.16299.98                                                                                
PSEdition                      Desktop                                                                                     
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}                                                                     
BuildVersion                   10.0.16299.98                                                                               
CLRVersion                     4.0.30319.42000                                                                             
WSManStackVersion              3.0                                                                                         
PSRemotingProtocolVersion      2.3                                                                                         
SerializationVersion           1.1.0.1                                                                                     
@markekraus
Contributor

markekraus commented Jan 22, 2018

One point of clarity: in pwsh, echo is an alias for Write-Output, and -n on Write-Output is short for -NoEnumerate, so it is not the same as the Linux echo binary. However, switching the command to /bin/echo -n "string" | openssl dgst -sha256 -hmac "authcode" does not resolve the issue you are seeing. It appears that a newline is always appended.

powershell (with -n):

/bin/echo -n "string" | openssl dgst -sha256 -hmac "authcode"

result: (stdin)= e0110cef9a8ba1b3ebdf6655bd096ea7e2cbd8790fce71e8767dba374c22461e

bash (without -n):

/bin/echo "string" | openssl dgst -sha256 -hmac "authcode"

result: (stdin)= e0110cef9a8ba1b3ebdf6655bd096ea7e2cbd8790fce71e8767dba374c22461e
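
The comparison above suggests that the only difference between the two environments is a trailing newline in the data that reaches openssl. Here is a minimal Python sketch of that effect, using the key and message from the report. Python's standard hmac module stands in for openssl dgst -sha256 -hmac, and the helper name hmac_hex is hypothetical. The resulting digests are intentionally not listed here.

```python
import hashlib
import hmac


def hmac_hex(message: bytes, key: bytes = b"authcode") -> str:
    """Return the HMAC-SHA256 of message as a lowercase hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# The same text as three different byte sequences on stdin.
candidates = {
    "no trailing newline": b"string",
    "trailing LF": b"string\n",
    "trailing CRLF": b"string\r\n",
}

for label, payload in candidates.items():
    print(f"{label:>20}: {hmac_hex(payload)}")
```

Each variant yields a different digest, so a single appended newline is enough to explain a mismatch.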

@iSazonov iSazonov added WG-Engine core PowerShell engine, interpreter, and runtime Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif labels Jan 22, 2018
@iSazonov
Collaborator

Is the problem in cmd.exe and Bash?

@ThePieMonster
Author

@markekraus Yes, I have tried echo and Write-Output with and without -n / -NoEnumerate, but the hashes returned are always the same regardless of which is used.
@iSazonov The issue is with PowerShell since all other applications (cmd.exe, Bash, Linux Distros) return the same hash except for PShell.

@mklement0
Contributor

mklement0 commented Jan 22, 2018

You can get the (hopefully) full story and workarounds in this SO answer of mine, but the short of it is:

As @markekraus has already stated, the PowerShell pipeline invariably appends a (platform-appropriate) trailing newline to strings when they're sent to external programs, which you can verify as follows (on a Unix platform):

PS> printf %s h | od -t x1  # try to send "h" without a trailing newline through the pipeline
0000000    68  0a                                                        
0000002

The 0a is the trailing LF that the pipeline added - you won't see it when you run the very same command from Bash.

This is problematic, as is the general inability to pipe raw byte streams (see #1908, which addresses external-program-to-external-program piping, though sometimes you may even want to send raw bytes from a PowerShell command).

@mklement0
Contributor

@ThePieMonster: If you agree with my analysis, can you please change the title of this issue to something like "Piping text to an external program appends a trailing newline"?

@ThePieMonster ThePieMonster changed the title PowerShell Hash Different Than Linux Hash Piping Text To An External Program Appends A Trailing Newline Jan 23, 2018
@ThePieMonster
Author

ThePieMonster commented Jan 23, 2018

@mklement0 I updated the title as you suggested.

@mklement0
Contributor

mklement0 commented Jan 23, 2018

@ThePieMonster: Thanks for updating the title.

Unfortunately, echo -n does not work with cmd, because echo is an internal command that supports no options at all, so the -n is simply output too (try cmd /c "echo -n hi").

Thus, you must use the following on Windows (my linked SO answer states limitations of this approach):

cmd /c "<NUL set /p =`"string`"| ..."     # NO space before | 

If you happen to have the GnuWin32 tools installed on Windows, you can use
the more robust cmd /c "printf `"%s`" `"string`" | ..." instead.

On Unix:

sh -c "printf %s `"string`" | ..."

@ThePieMonster
Author

@mklement0 Thanks for the suggestion. The command below does the trick for me: PShell no longer adds the trailing newline, so the resulting hash matches the one from Linux. (Being careful not to have a space before |, as suggested.)

cmd /c "<NUL set /p =`"string`"| openssl dgst -sha256 -hmac authcode"

@rkeithhill
Collaborator

Maybe an Out-Native cmdlet would be handy??

@wclear

wclear commented Oct 31, 2019

Sidenote: In case anyone landed here looking to get rid of the trailing newline from piped input, I was able to trim the newline from the output of a one-line command and pass it to the Set-Clipboard command, for example, with the following:

echo "abc def" | Set-Clipboard -Value {$_.Trim()}

@mklement0
Contributor

@wclear:

This issue is about the inability to send a string as-is TO an external program, without having PowerShell append a trailing newline.

In the opposite direction, FROM an external program, as in your example, you usually have the opposite problem: because PowerShell automatically splits output from external programs into an array of lines, you can't tell the difference between a given external program sending you just "foo" or sending "foo`n" (trailing newline).

(In other words: strings come "pre-trimmed" with respect to trailing newlines, and the .Trim() call in your command shouldn't be necessary - unless you need to trim trailing spaces and tabs.)

So we'd need both Out-Native and "In-Native" commands (in the latter case, a solution probably requires intervention at the engine level) to address these scenarios.

The raw native-exe-to-native-exe piping or native-exe-to-file-redirection (#1908) isn't implemented yet either.

I don't know what the right solution is, but, as of PowerShell Core 7.0.0-preview.5, PowerShell and external (native) executables are separate worlds that can only communicate with one another if they "speak text" and always assume that trailing newlines are incidental to the data.

Neither receiving nor sending raw data (bytes) is supported, nor is redirecting an external program's output as-is to a file.

@wclear

wclear commented Oct 31, 2019

This issue is about the inability to send a string as-is TO an external program, without having PowerShell append a trailing newline.

Thank you for that clarification @mikeehsu, I see that now. I've updated my comment to be just a sidenote.

@SteveL-MSFT SteveL-MSFT added this to To do in Shell via automation Oct 31, 2019
@SteveL-MSFT SteveL-MSFT added this to the vNext-Consider milestone Oct 31, 2019
@vsalvino

vsalvino commented Feb 11, 2020

This becomes an especially difficult problem when piping text to programs cross-platform. For example, running PowerShell (core) on Windows and trying to pipe output to a program running on a Linux server. It simply seems to not be possible at the moment.

Write-Output "uptime" | ssh remote@server "bash -s"

"uptime" | ssh remote@server "bash -s"

"uptime" | Out-String -NoNewline | ssh remote@server "bash -s"

Regardless of the input, the pipe here is adding a CRLF which breaks the remote command.

@SteveL-MSFT
Member

The addition of the newline appears to be an issue on both the input and output side. On Windows:

PS> "hello there" | wsl awk --% '{print $2,$1}' | out-string | format-hex

   Label: String (System.String) <0F0E5C9B>

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 74 68 65 72 65 0D 0A 20 68 65 6C 6C 6F 0D 0A    there�� hello��

If you try the equivalent in cmd.exe:

C:\>echo hello there | wsl awk '{print $2,$1}'
there hello

What I think happens is that awk returns two strings to stdout and the PowerShell NativeCommandProcessor gets those strings and does a WriteLine() call which inserts the newlines you can see above in hex as 0D0A (CRLF).

@mklement0
Contributor

mklement0 commented Sep 3, 2020

@SteveL-MSFT:

True, on the output side, due to parsing output into an array of lines, PowerShell also does not allow you to distinguish between:

# On Unix: NO trailing newline
$out = printf 'hi'

and

# On Unix: Trailing newline
$out = printf 'hi\n'

$out receives verbatim hi - without a trailing newline - in both cases.
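
Why the two cases are indistinguishable can be sketched in Python: once output is split into lines and the line terminators are dropped, both byte streams yield the same list. Here str.splitlines() is only a stand-in for PowerShell's line-by-line parsing, not its implementation, and capture_as_lines is a hypothetical name.

```python
def capture_as_lines(stdout_bytes: bytes) -> list[str]:
    """Model of line-by-line capture: decode, split, drop terminators."""
    return stdout_bytes.decode("utf-8").splitlines()


without_newline = capture_as_lines(b"hi")
with_newline = capture_as_lines(b"hi\n")

# Both captures are the same one-element list.
print(without_newline, with_newline, without_newline == with_newline)
```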

Allowing user code to make this distinction would be a nontrivial proposition, given that invocations of external (native) executables don't support any ad-hoc behavioral options, such as via common parameters.

Tacking Out-String onto such calls is not a solution, because that outputs a single string that invariably ends in a trailing newline whether the original output had one or not (which is problematic in itself - see #14444)

In your wsl example, a separate, wsl-specific problem seems to occur, given that the following works just fine:

# Windows
PS> cmd /c echo there hello | Out-String | Format-Hex

   Label: String (System.String) <2C119606>

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 74 68 65 72 65 20 68 65 6C 6C 6F 0D 0A          there hello��

Similarly, on a Unix-like platform, awk works as expected:

# Unix
PS> "hello there" | awk '{print $2,$1}' | Out-String | Format-Hex

   Label: String (System.String) <029F4149>

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 74 68 65 72 65 20 68 65 6C 6C 6F 0A             there hello�

As an aside: That your wsl command line requires --% to pass the awk script through as intended is a testament to wsl's broken argument handling - see #12975 (comment).

That said, given that a single executable - awk - is being invoked here, with no need to involve a shell, the problem can be avoided by using the -e / --execute wsl option: "hello there" | wsl -e awk '{print $2,$1}'

@petervandivier

I'm not sure if this adds anything to the conversation, but as a matter of reference:

$foo = "foo"
$foon = "foo`n"

@(
    '"foo" | xxd'
    '("foo" | Format-Hex).HexBytes'
    '(Format-Hex -InputObject "foo").HexBytes'
    '$foo | xxd'
    '($foo | Format-Hex).HexBytes'
    '"foo`n" | xxd'
    '$foon | xxd'
    '($foon | Format-Hex).HexBytes'
) | % {
    [PSCustomObject]@{
        Command = $_
        Output = iex $_
    }
}

Below results are PS 7.0.3 on Darwin 19.6.0.

Command                                  Output
-------                                  ------
"foo" | xxd                              00000000: 666f 6f0a                                foo.
("foo" | Format-Hex).HexBytes            66 6F 6F
(Format-Hex -InputObject "foo").HexBytes 66 6F 6F
$foo | xxd                               00000000: 666f 6f0a                                foo.
($foo | Format-Hex).HexBytes             66 6F 6F
"foo`n" | xxd                            00000000: 666f 6f0a 0a                             foo..
$foon | xxd                              00000000: 666f 6f0a 0a                             foo..
($foon | Format-Hex).HexBytes            66 6F 6F 0A

This tripped me up a bit even after I thought I knew what was going on. Notably a string ending in a newline still gets a newline appended. I had previously assumed the trailing newline got appended only when missing.
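
The table is consistent with an "always append" rule rather than an "append only if missing" rule. A small Python sketch of the two candidate rules can be checked against the xxd column above. LF is assumed because the results are from macOS, and both function names are hypothetical.

```python
def append_always(text: str, newline: str = "\n") -> bytes:
    """Rule consistent with the table: a newline is added unconditionally."""
    return (text + newline).encode("utf-8")


def append_if_missing(text: str, newline: str = "\n") -> bytes:
    """Rule the commenter had assumed: add a newline only when absent."""
    return (text if text.endswith(newline) else text + newline).encode("utf-8")


for text in ("foo", "foo\n"):
    print(repr(text), append_always(text).hex(), append_if_missing(text).hex())
```

Only append_always yields 666f6f0a0a for "foo" followed by a newline, which is what xxd reported.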

@mklement0
Contributor

@petervandivier, good point: a platform-appropriate newline sequence - CRLF on Windows, LF on Unix - is invariably added when a string is sent via the pipeline to a native executable.

@SteveL-MSFT, your wsl problem - which is unrelated to the issue at hand - boils down to this: the CRLF newlines that PowerShell sends on Windows can cause problems when they're sent to Unix utilities such as awk via wsl, because such utilities recognize only LF newlines, and treat CRs as data - I've created a new issue for it: see #13579.
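
A rough Python model of that chain reproduces the bytes in the earlier Format-Hex output. It assumes awk ends records at LF only and splits fields on spaces, and that PowerShell splits captured output on CRLF, LF, or CR and terminates each line with CRLF again. The function names are hypothetical, and this is not how either program is implemented.

```python
import re


def awk_swap_fields(stdin_text: str) -> str:
    """Approximate awk '{print $2,$1}': records end at LF, fields split on spaces."""
    out = []
    for record in stdin_text.split("\n"):
        if record == "":
            continue  # skip the empty piece after the final LF
        fields = record.split(" ")
        out.append(f"{fields[1]} {fields[0]}\n")
    return "".join(out)


def powershell_relay(stdout_text: str, newline: str = "\r\n") -> str:
    """Approximate line-by-line capture: split on CRLF/LF/CR, re-terminate each line."""
    lines = re.split(r"\r\n|\n|\r", stdout_text)
    if lines and lines[-1] == "":
        lines.pop()  # a final terminator does not start a new line
    return "".join(line + newline for line in lines)


sent = "hello there" + "\r\n"        # the string plus the CRLF added on Windows
awk_output = awk_swap_fields(sent)   # the CR stays attached to "there"
print(powershell_relay(awk_output).encode("ascii").hex(" "))
```

The stray CR ends up inside the second field, so the relayed output contains a line break between "there" and " hello".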

@SteveL-MSFT SteveL-MSFT modified the milestones: 7.2-Consider, 7.3-Consider Dec 7, 2020
@dazinator

dazinator commented Jan 20, 2022

As a specific use case (though perhaps one is not needed): this makes creating Docker secrets from stdin impossible from PowerShell / pscore.

 Write-Output "$secretString" | docker secret create "my-secret" -

The secret is created with an additional new line appended making it invalid.

@spinitron

spinitron commented Oct 31, 2022

Is there a workaround so that something like this can encrypt without adding trailing newline characters to the secret text? (I didn't see one in the above discussion.)

'pa55w0rd'.Trim() | gpg  -e ...

@mklement0
Contributor

@spinitron, call via the platform-native shell:

# Unix
sh -c 'printf %s ''pa55w0rd'' | gpg  -e ...'

# Windows (workaround needed to print a string without a trailing newline)
cmd /c '<NUL set /p ="pa55w0rd" | gpg -e ...'

@mklement0 mklement0 mentioned this issue Jan 27, 2023
@aetonsi

aetonsi commented Apr 24, 2023

Hi again!
I incidentally just discovered that an encoding problem also affects subexpressions ($(...)), but I am not sure if it's the same as the one in this issue.
Can someone please tell me if that's the case or not, considering my little analysis below?
If it's not, I'll open an issue. If it is, it should be taken into consideration in the PR, I think.
Thanks!


I was trying to save wsl.exe's output to a file or capture it in a variable, but it always gets messed up.
It seems that its output, which is UTF-16LE encoded, is not correctly detected by pwsh:

PS> wsl --list
Windows Subsystem for Linux Distributions:
ubuntuOLD (Default)
Ubuntu
OracleLinux_8_5
Debian
kali-linux
PS> $(wsl --list)
Windows Subsystem for Linux Distributions:

ubuntuOLD (Default)

Ubuntu

OracleLinux_8_5

Debian

kali-linux


So I went to the command prompt to run a couple of tests:

C:\> pwsh -noprofile -c "wsl --list" >_\raw.txt

C:\> pwsh -noprofile -c "$(wsl --list)" >_\quoted.txt

And the following is the result.
I highlighted what should be the line break between the closing parentheses of Default), and the first following letter U of Ubuntu:

[Screenshots of the two files opened in HxD]

It looks as if $(...) simply doubles the line breaks, but it doesn't. It generates a seemingly nonsensical sequence of bytes that does not correspond to a double newline in any flavour (LFLF, CRCR, CRLFCRLF).

I tried to understand what it actually generates by converting to UTF-8, but I had no luck:

expression: $(wsl --list)
  utf16LE bytes: 0d0a 000d 0a00
  utf8 bytes: E0A88D E0B480 0A
  codepoints: U+0A0D U+0D00 U+000A
  php command line for utf16le->utf8 conversion: mb_convert_encoding(join('',array_map(fn($b)=>chr($b), [0x0d, 0x0a, 0x00, 0x0d, 0x0a, 0x00])), 'utf8', 'utf-16le')

expression: wsl --list
  utf16LE bytes: 0d00 0a00
  utf8 bytes: 0d 0a
  codepoints: U+000D U+000A
  php command line for utf16le->utf8 conversion: mb_convert_encoding(join('',array_map(fn($b)=>chr($b), [0x0d, 0x00, 0x0a, 0x00])), 'utf8', 'utf-16le')

Please note:

  • commands have been run in cmd.exe to avoid additional messing with the bytes when redirecting the output to the files I then analyzed
  • codepoints seem to be "reversed" due to the original text's endianness (UTF-16LE). Just to be sure, I double-checked the codepoints via php. 'U+'.str_pad(dechex(mb_ord(chr(0x0d).chr(0x0a), 'utf-16le')),4,'0',STR_PAD_LEFT) shows that the utf-16le sequence 0d0a is, correctly, the U+0A0D codepoint, as you can see here

@mklement0
Contributor

mklement0 commented Apr 24, 2023

I think this is a separate issue, related to character encoding only:

  • The root cause is that wsl doesn't respect the console's code page and outputs UTF-16-encoded text (except if you call a command inside a VM).

    • To avoid the problem, run set WSL_UTF8=1 / $env:WSL_UTF8=1 before the call and - to make PowerShell decode the output correctly in memory - (temporarily) also [Console]::OutputEncoding = [Text.Utf8Encoding]::new()
  • The reason that $(...) / @(...) / (...) and use in a PowerShell pipeline (e.g. pwsh -noprofile -c "wsl --list | Write-Output" >_raw.txt) makes a difference is that line-by-line processing then kicks in, which means that PowerShell parses stdout output into lines - as .NET strings - and relays them individually, with the original newlines removed, and - on output to a file or to an outside caller - re-encodes them and joins them again with platform-native newlines.

    • Because PowerShell recognizes any of the following as newlines - CRLF, LF, CR - it mistakes a single, mis-decoded UTF-16 CRLF newline for two newlines - a CR, followed by a NUL, followed by a LF (and a NUL), resulting in extraneous lines, which on output are each terminated with a CRLF on Windows (where CR and LF occupy one byte each). In effect, individual 0xD bytes and 0xA bytes in the original output turn into 0xD, 0xA byte sequences each.
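
The byte sequence reported above can be reproduced with a short Python sketch of this explanation. It assumes the UTF-16LE output is decoded with a single-byte code page, split on CRLF/LF/CR, and re-joined with CRLF. Latin-1 is used as a stand-in for whatever the console code page was, and relay_misdecoded is a hypothetical name.

```python
import re


def relay_misdecoded(raw: bytes) -> bytes:
    """Decode UTF-16LE bytes as a single-byte encoding, split into lines, re-join with CRLF."""
    text = raw.decode("latin-1")             # every byte becomes one character
    lines = re.split(r"\r\n|\n|\r", text)    # CR and LF are now separated by NULs
    return "\r\n".join(lines).encode("latin-1")


original = ")\r\nU".encode("utf-16-le")      # 29 00 0d 00 0a 00 55 00
print(original.hex(" "))
print(relay_misdecoded(original).hex(" "))
```

Between the 29 00 of ")" and the 55 00 of "U", the relayed bytes are 0d 0a 00 0d 0a 00, which matches the sequence observed for $(wsl --list).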

In other words:

  • With PowerShell as the intermediary - due to (...), ... and currently also in intra-session > redirections - PowerShell's usual (currently invariable) text processing applies (decoding into .NET strings, line-by-line processing, re-encoding on output to native programs / files), and the extra newlines stem only from the character-encoding mismatch.

  • Based on your observation, it appears that when an external caller calls the PowerShell CLI with a command that contains a native-program call that does not involve line-by-line processing (such as your pwsh -noprofile -c "wsl --list" call from cmd.exe), the raw byte output from that native program is passed through - which currently makes it the only scenario where PowerShell passes a raw byte stream through.

  • If I understand correctly, Preserve stdout byte stream for native commands #17857 will bring raw-byte support to PowerShell's redirections for native-program output (as well as piping between native programs), so that wsl --list >_raw.txt from inside PowerShell should then pass the raw byte output through to the file.

  • However, the issue at hand - a PowerShell-originated .NET string invariably getting terminated with a platform-native newline when sent to a native program - will require the following technique to avoid the trailing newline; it is a bit arcane and cumbersome, but at least it will be possible (pragmatically speaking, you may still want to resort to a sh -c workaround on Unix):

# Once PR #17857 is merged.
[Console]::OutputEncoding.GetBytes("string") | openssl dgst -sha256 -hmac "authcode"

# For better performance use a nested array.
(, [Console]::OutputEncoding.GetBytes("string")) | openssl dgst -sha256 -hmac "authcode"

Note: If you use $OutputEncoding in lieu of [Console]::OutputEncoding, you'll get UTF-8 by default - irrespective of the console's actual code page; see

@microsoft-github-policy-service microsoft-github-policy-service bot added Resolution-No Activity Issue has had no activity for 6 months or more labels Nov 16, 2023
@petervandivier

I feel pretty strongly that this issue should remain open or get another resolution besides auto-close.

This is a huge blocker for using pwsh natively on *nix workstations and is, quite frankly, a bit embarrassing to have to handwave away when trying to evangelize the language to new users.

If something has changed regarding this behavior, it needs to get cross-linked here.

At a bare minimum, this behavior should be formally documented and this issue can get closed as By Design. The absence of new user symptoms should not indicate that this is no longer a problem.

@microsoft-github-policy-service microsoft-github-policy-service bot removed this from the 7.2-Consider milestone Nov 24, 2023
Shell automation moved this from To do to Done Nov 24, 2023
@Hashbrown777

Closed "as completed". Is this all just a metric game? At the very least make it not-planned if you're unable to do a custom reason such as "Inactive" (not to mention peter's comment literally making it active again days before the auto-close..)

@vsalvino

This is in no way done or completed. It is still an issue as of the latest version of PowerShell. It's a pretty big issue too, primarily on Windows.

Name                           Value
----                           -----
PSVersion                      7.3.9
PSEdition                      Core
GitCommitId                    7.3.9
OS                             Microsoft Windows 10.0.22631
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

@ThePieMonster
Author

@PowerShellTeam @SteveL-MSFT Please re-open issue.

@microsoft-github-policy-service microsoft-github-policy-service bot removed the Resolution-No Activity Issue has had no activity for 6 months or more label Nov 24, 2023
@aetonsi

aetonsi commented Nov 28, 2023

I received 5-6 emails about issues being "closed"; this is just one of many.

@yg-i

yg-i commented Apr 28, 2024

I'm learning cryptography and just want to verify a simple hash from the textbook

$ echo -n "hello" | openssl dgst -sha256
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

I've tried the following commands in PowerShell 7.4.2. None of them produces the correct hash (terminating in '24'). Is there seriously no (non-convoluted) way to accomplish this task other than drop into wsl/git-bash?

PowerShell 7.4.2
> echo -n "hello" | openssl dgst -sha256
SHA2-256(stdin)= cd2eca3535741f27a8ae40c31b0c41d4057a7a7b912b33b9aed86485d1c84676

> echo "hello" | openssl dgst -sha256
SHA2-256(stdin)= cd2eca3535741f27a8ae40c31b0c41d4057a7a7b912b33b9aed86485d1c84676

> Write-Output -NoNewline "hello" | openssl dgst -sha256
SHA2-256(stdin)= 3831858cc2cdb9f8ad43c8c62fb7972d87636c87d9947cd4c88399a237f3c72a

> Write-Output "hello" | openssl dgst -sha256
SHA2-256(stdin)= cd2eca3535741f27a8ae40c31b0c41d4057a7a7b912b33b9aed86485d1c84676

> "hello" | openssl dgst -sha256
SHA2-256(stdin)= cd2eca3535741f27a8ae40c31b0c41d4057a7a7b912b33b9aed86485d1c84676

Edit: I see it's possible to use raw byte handling now:

> [Text.Encoding]::UTF8.GetBytes('hello') | openssl dgst -sha256
SHA2-256(stdin)= 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
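
For comparison, the same idea in Python: handing a child process an explicit byte string leaves no room for an implicit newline. In this sketch a second Python interpreter stands in for openssl dgst -sha256 so that it runs without openssl installed. That substitution and the name sha256_via_pipe are assumptions made for the example.

```python
import subprocess
import sys

# Child program: hash whatever arrives on stdin, byte for byte.
CHILD = "import sys, hashlib; print(hashlib.sha256(sys.stdin.buffer.read()).hexdigest())"


def sha256_via_pipe(payload: bytes) -> str:
    """Send payload to the child's stdin unchanged and return the digest it prints."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=payload,
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


print(sha256_via_pipe(b"hello"))    # five bytes, no newline
print(sha256_via_pipe(b"hello\n"))  # six bytes, a different digest
```

The first call prints the textbook digest ending in 24; the second differs because the child receives one extra byte.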

@SteveL-MSFT SteveL-MSFT added KeepOpen The bot will ignore these and not auto-close WG-NeedsReview Needs a review by the labeled Working Group labels Apr 29, 2024