Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't read pipe to function #206

Open
tyilo opened this issue Jul 5, 2012 · 22 comments
Open

Can't read pipe to function #206

tyilo opened this issue Jul 5, 2012 · 22 comments
Labels
bug Something that's not working as intended
Milestone

Comments

@tyilo
Copy link
Contributor

tyilo commented Jul 5, 2012

Test:

function testfun
    set input (cat)
    echo $input
end

echo testing123 | testfun

This should output "testing123", but produces nothing.

It works perfectly in bash:

function testfun
{
    input="$(cat)"
    echo $input
}

echo testing123 | testfun
@maxfl
Copy link
Contributor

maxfl commented Jul 5, 2012

You can use 'read' function as workaround.

function t
    while read -l line
        echo $line
    end
end

@maxfl
Copy link
Contributor

maxfl commented Jul 6, 2012

I just understood, that in fish it doesn't work, because the stdin is piped to 'set' instead of 'cat'.

@waterhouse
Copy link
Contributor

This issue goes deeper than just "set" working a certain way. Piping into a function at all is screwed up. And terminal I/O sent into a function does work, but is still a bit weird--it appears to buffer the input and deliver it all at once. Observe what happens with this function:

~> function meh
       cat
   end
~> # First, the way it's supposed to work.
~> # As input, we press the keys: a RET b RET control-D
~> cat
a
a
b
b
~> cat | cat
a
a
b
b
~> # Now...
~> meh
a
a
b
b
~> # So far so good, but...
~> cat | meh
a
b
^D
... um...
^D
control-D repeatedly does not work
try control-C
Job 1, “cat | meh” has stopped
~> fg
Send job 1, “cat | meh” to foreground
cat: stdin: Interrupted system call
~> jobs
jobs: There are no jobs
~> # Dear lord.
~> # For completeness...
~> meh | cat
a
b
aD
b
~> 

Also, cat | meh | cat behaves the same way, as does cat | begin; cat; end.
I can tell you further that the "cat" that complains about an interrupted system call in cat | meh is the first "cat". That is:

~> cp /bin/cat mycat
~> ./mycat | meh
Job 1, “./mycat | meh” has stopped  #after control-C
~> fg
Send job 1, “./mycat | meh” to foreground
mycat: stdin: Interrupted system call

So there's that. Obviously this is something to do with how fish calls functions and how it constructs pipes into them. Does anyone happen to know about this?

@waterhouse
Copy link
Contributor

Ok, I am finding that running
pbpaste | begin; cat; end
repeatedly in a fresh fish shell, with the clipboard being "23\n", will sometimes just print 23 back out, and will sometimes cause the shell to lock up, at which point control-C can do nothing. I assume this must be a race condition of some sort. Oh boy.

@waterhouse
Copy link
Contributor

Meanwhile, it looks like the signal SIGTTIN is sent to the "mycat" in ./mycat | begin; cat; end:

     21    SIGTTIN      stop process         background read attempted from
                                             control terminal

Then, according to the GNU libc manual: "A process cannot read from the user's terminal while it is running as a background job. When any process in a background job tries to read from the terminal, all of the processes in the job are sent a SIGTTIN signal."

So, looks like the "mycat" either gets started in the background, or is started and then put in the background, when it gets piped into a fish function-kind-of-thing. Perhaps this knowledge will help.

@smokku
Copy link

smokku commented Apr 18, 2016

This backgrounds both sides of a pipe apparently... But giving fg command pulls the process from background allowing it to work as it supposed to.

~ $ alias pjson='python -m json.tool | pygmentize -l json'
~ $ curl -u smoku -X GET -H "Content-Type: application/json" 'https://jira.......' | pjson
Job 4, 'curl -u smoku -X GET…' has stopped
~ $ fg
Enter host password for user 'smoku': ********
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   593    0   593    0     0   1372      0 --:--:-- --:--:-- --:--:--  1375
~ $ fg
{
    "expand": "renderedFields,names,schema,transitions,operations,editmeta,changelog",
    "id": "29874"
}

A bit annoying that I needed to create pjson wrapper script in $PATH instead of simple alias... :(

@krader1961 krader1961 added the bug Something that's not working as intended label Apr 18, 2016
@simotek
Copy link

simotek commented Jun 3, 2016

For my reference this is also openSUSE bug https://bugzilla.opensuse.org/show_bug.cgi?id=963548

@milieu
Copy link

milieu commented Oct 20, 2016

Yay! I think I found my workaround! Thanks to gustafj's comment in issue #110 explaining fish piping syntax, I have come up with this:

function line --argument-names n
    cat 1>| tail -n +$n | head -1
end

@msoucy
Copy link

msoucy commented Nov 26, 2016

This issue, coupled with the default grep function for grep, results in quite a few problems - If this issue isn't going to be fixed any time soon, the default grep alias should probably be removed (or replaced with an abbreviation, maybe?) to at least minimize the occurances.

Using cat as @milieu suggests doesn't seem to fix the issue for me, on Fish 2.3.1 (which I just realized is slightly behind, but it's the version packaged for Fedora 25)

@layus

This comment has been minimized.

@faho

This comment has been minimized.

mqudsi added a commit to mqudsi/fish-shell that referenced this issue Aug 21, 2017
While the idiomatic fix to fish' myriad of job control issues would be
to parse all functions prior to beginning the job pipeline so that
everything in the command line can be executed in the context of a
single job, that would require a huge effort to rewrite the core job
flow in fish and does not make sense at this time.

Instead, this patch fixes fish-shell#3952 and fish-shell#206 (but notably not fish-shell#4238) by
having jobs that are part of a single command pipeline, including those
that are functions executing external commands, use the same process
group. This prevents a (parent|child) from crashing with SIGTTIN or
hanging at SIGTTOU because it has a different process group than the
process currently in control of the terminal.

Additionally, since this fix involves removing the code that forces fish
to run in its own process group (which IMHO never made sense, as job
control is the job of the shell, not the process being launched), it
also fixes fish-shell#3805 and works around BashOnWindows#1653.
@fish-shell fish-shell deleted a comment from c02y Aug 21, 2017
@fish-shell fish-shell deleted a comment from milieu Aug 21, 2017
mqudsi added a commit to mqudsi/fish-shell that referenced this issue Sep 8, 2017
While the idiomatic fix to fish' myriad of job control issues would be
to parse all functions prior to beginning the job pipeline so that
everything in the command line can be executed in the context of a
single job, that would require a huge effort to rewrite the core job
flow in fish and does not make sense at this time.

Instead, this patch fixes fish-shell#3952 and fish-shell#206 (but notably not fish-shell#4238) by
having jobs that are part of a single command pipeline, including those
that are functions executing external commands, use the same process
group. This prevents a (parent|child) from crashing with SIGTTIN or
hanging at SIGTTOU because it has a different process group than the
process currently in control of the terminal.

Additionally, since this fix involves removing the code that forces fish
to run in its own process group (which IMHO never made sense, as job
control is the job of the shell, not the process being launched), it
also fixes fish-shell#3805 and works around BashOnWindows#1653.
@mqudsi
Copy link
Contributor

mqudsi commented Oct 9, 2018

I think some of the original behavior in this regard has changed, the below is with regards to fish master/3.0

There are two fundamental issues fish gets wrong here, the first is buffering function/block output (I'm pretty sure a modern shell would not buffer anything anywhere) and the second is being unable to correctly chain input/output across a block. There is a lot of ambiguity (or at least room for acceptable differences of opinion) in what the correct behavior should look like in some corner cases, but I don't think anyone will defend what fish does currently as being optimal.

In general, you have external commands (and builtins which are effectively treated the same way, by and large) which are easy: one input, two outputs, one of which can be chained to subsequent command, the other of which must be redirected to a file or the tty. But blocks and functions are tricky since you're basically mapping an input (as there can only be one) to a sequence of (what eventually expand to) external commands or builtins.

That said, I disagree that the current behavior is wrong. (cat) should not read data that was piped into the command it is executed within.:

mqudsi@ZBook /m/c/U/m/Documents> type testfun
testfun is a function with definition
function testfun
    set input (cat)
    printf "You said '%s'\n" $input
end
mqudsi@ZBook ~/r/fish-shell> echo testing123 | testfun
hello
^D
You said 'hello'

You are piping input into the block, regardless of whether set consumes the input, consumes part of the input, or ignores the input entirely, cat is correct to connect to /dev/tty for input, which then correctly gets passed to the shell for substitution into the commandline. In fact, there are/were (many) bugs filed against this repo complaining about cases where "subshells" were not reading from the terminal when executed with some levels of indirection. IMHO, it is bash that is broken here, especially since bash supports real subshells and offers asynchronicity here.

The only broken behavior I would say stems from cases where external commands are launched in a function/block and do not fully consume the input:

mqudsi@ZBook /m/c/U/m/r/fish-shell> printf 'foo\nbar\n' | begin
                                        head -n1 | read -l line1
                                        head -n2 | read -l line2
                                        echo line1: $line1
                                        echo line2: $line2
                                    end
line1: foo
line2:

TBH I'm very surprised but this works correctly:

mqudsi@ZBook /m/c/U/m/r/fish-shell> printf 'foo\nbar\n' | begin
                                        /bin/echo 'hi from echo'
                                        cat | read -z from_cat
                                        printf 'from_cat: "%s"' $from_cat
                                    end
hi from echo
from_cat: "foo
bar
"

And this is also correct:

mqudsi@ZBook /m/c/U/m/r/fish-shell> printf 'foo\nbar\n' | begin
                                        cat | read -zl from_cat1
                                        cat | read -zl from_cat2
                                        printf 'from_cat1: "%s"\n' $from_cat1
                                        printf 'from_cat2: "%s"\n' $from_cat2
                                    end
from_cat1: "foo
bar
"
from_cat2: ""

Especially when taking into account the plan of someday introducing real subshells in fish with asynchronous execution, I would say that fish's behavior in regards to the original case reported here is correct. In fact, I'm inclined to close this issue entirely, unless anyone objects and can make a convincing argument here.

@mqudsi
Copy link
Contributor

mqudsi commented Oct 9, 2018

While the original bug report is imho invalid, the issues raised by @waterhouse are spot-on and great catches. But the good news is that #5219 appears to fix them, including the cat | meh case reported.

mqudsi@ZBook ~/r/fish-shell> cat | meh
a
a
b
b
^D
mqudsi@ZBook ~/r/fish-shell>

@faho
Copy link
Member

faho commented Oct 10, 2018

That said, I disagree that the current behavior is wrong. (cat) should not read data that was piped into the command it is executed within.:

I very much disagree on that!

cat is correct to connect to /dev/tty for input

That's a question of the mental model. I would say that cat connects to "the current stdin" for input. If the function or block isn't redirected, that's the tty. If it is redirected, that's that! So connecting to /dev/tty here would be incorrect.

complaining about cases where "subshells" were not reading from the terminal when executed with some levels of indirection

Note that those were all about "global" command substitutions. E.g. running echo (fzf) on the commandline. In that case, there is no stdin.

So what I would say would work sort of like this:

echo | echo (cat) # from tty

begin
   echo | echo (cat) # from file
end < file

There is a related issue (#1035) that asks about stderr in this case, and that that isn't redirected. Which was quite an issue with the old math function, because that happened to feature a command substitution inside of it, and so you couldn't redirect that.

This is the stdin part of it. If a function does a bare (cat), is it really useful to always have that read from the tty? Or couldn't you just use </dev/tty in that case?

@mqudsi
Copy link
Contributor

mqudsi commented Oct 10, 2018

Interesting thoughts.

I guess it boils down to whether the parentheses denote simple substitution (i.e. "pretend the contents of the parentheses were on the line above, run them to completion, store the result in a variable, and substitute the variable here") or if they're (currently broken) subshells. I thought the consensus was that fish is missing proper subshell support, but the intention has always been to fix that "at some point."

If it's the former, then yes, I agree, the current behavior is broken because if you move the contents of the parentheses to a different line, it should certainly read from the input being redirected to the block.

But subshells are a much more powerful concept tha that, and they let you do things that aren't possible with command substitution and to create much more responsive and capable scripts. While it's technically possible to connect whatever is being input into the block to the stdin of a command executed in a subshell, I think that would be incompatible with the mental model there.

@faho
Copy link
Member

faho commented Oct 11, 2018

whether the parentheses denote simple substitution (i.e. "pretend the contents of the parentheses were on the line above, run them to completion, store the result in a variable, and substitute the variable here") or if they're (currently broken) subshells.

I don't think these terms are defined clearly enough to be of too much use here.

For me, it comes down to what's more natural, more typical and more useful.

Reading from the terminal is certainly useful, and sometimes you want to read from the terminal even though you have another stdin (e.g. fzf does basically exclusively this).

But I think that reading from stdin is far more typical, especially considering that non-interactive uses won't read from tty at all. And since reading from tty is still possible (via that </dev/tty redirection), it seems okay to leave that as the secondary option.

@mqudsi
Copy link
Contributor

mqudsi commented Oct 13, 2018

The fact that there's no opposite of </dev/tty in the model I'm suggesting is making me reconsider my position.

@tflori

This comment has been minimized.

@faho

This comment has been minimized.

@tflori

This comment has been minimized.

@faho

This comment has been minimized.

@vergenzt
Copy link

vergenzt commented Dec 1, 2022

I just got bit by this while trying to implement a fish function that operates on variadic input args similar to the string builtins: use $argv if present, otherwise read arguments from stdin.

My first implementation that I thought would obviously work was:

function my-function
  argparse ...
  if not count $argv >/dev/null
    set argv (cat)
  end

  # ...
end

However, on fish version 3.5.1, that tries to read from the terminal (even when I run printf %s\n foo bar baz | my-function! I was very surprised by this) instead of the outer function's stdin.

I then wasted about 30 minutes trying to research why my command substitution wasn't inheriting stdin and whether there's a way to force it to. (Seems the answer is no!)

So my two alternatives appear to be:

  if not count $argv >/dev/null
    read --null --list --delimiter=\n argv
  end

or

  if not count $argv >/dev/null
    while read --line next_argv
      set -a argv $next_argv
    end
  end

The read --null --list --delimiter=\n option looks so ugly to me -- and also is incompatible with the input containing nulls. (Not likely to come up, but seems like such a random constraint to have to explain in a man page.)

The while read --line ...; set -a ...; end option is fine I guess, but it also just seems like such a hassle compared to set argv (cat).

Oh well! 🤷‍♂️ At least there's a workaround for my case. (I think I'm gonna go with the latter by the way.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something that's not working as intended
Projects
None yet
Development

No branches or pull requests