Stream delimiters in fromstream #3037

Open
ab-pm opened this issue Feb 8, 2024 · 0 comments

When using --stream or tostream, I was surprised to find some elements in the stream that have only a path but no value. After some experimentation, I found that these are emitted whenever an object or array ends, with the path being the same as the path of the previous leaf.

$ echo '{"a":13, "b":[{"x":true}, {}, 42]}' | jq --stream -c '.'
[["a"],13]
[["b",0,"x"],true]
[["b",0,"x"]]
[["b",1],{}]
[["b",2],42]
[["b",2]]
[["b"]]

Same with jq -nc 'inputs | tostream'.
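
To make the two kinds of events easier to tell apart, here is a small sketch that splits the stream by event length. It only uses `tostream`, `select` and `length`; the shell variable `doc` is just a name picked for this example.

```shell
doc='{"a":13, "b":[{"x":true}, {}, 42]}'

# Events of length 2 are [<path>, <leaf value>]
echo "$doc" | jq -c 'tostream | select(length == 2)'
# [["a"],13]
# [["b",0,"x"],true]
# [["b",1],{}]
# [["b",2],42]

# Events of length 1 are the [<path>] entries emitted when an object or array ends
echo "$doc" | jq -c 'tostream | select(length == 1)'
# [["b",0,"x"]]
# [["b",2]]
# [["b"]]
```

Note that the empty object at `.b[1]` shows up as a leaf value and gets no end event of its own.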

Issue 1

First, a documentation request. I found that this behaviour is documented:

Streaming forms include […] [<path>] (to indicate the end of an array or object)

but there is no explanation of why these events are necessary, how they should be treated, or what the significance of the path is in these "object/array terminators". Why not use or allow any other value?

(I can guess that fromstream needs them to be able to generate multiple results, but this is not obvious - and doesn't explain why this particular format is needed)

Issue 2

Second, I would like to report a bug / feature request: fromstream fails silently when these terminator values are missing or invalid.
I found by experimentation that a valid terminator is any array that has a non-empty array as its single element. It doesn't matter what the values inside the inner array are, it just mustn't be empty. [] and [[]] are ignored.

Reproduction

$ jq -nc 'fromstream([["a"],13], [["b",0,"x"], true], [["b",0,"x"]], [["b",0]], [["b"]])'
{"a":13,"b":[{"x":true}]} # as expected

$ jq -nc 'fromstream([["a"],13], [["b",0,"x"], true], [["b"]])'
{"a":13,"b":[{"x":true}]} # works even without terminators for the inner object and array

$ jq -nc 'fromstream([["a"],13], [["b",0,"x"], true], [[ {}, null, false ]])'
{"a":13,"b":[{"x":true}]} # really works with arbitrary values in the path

$ jq -nc 'fromstream([["a"],13], [["b",0,"x"], true])'
$ jq -nc 'fromstream([["a"],13], [["b",0,"x"], true], [])'
$ jq -nc 'fromstream([["a"],13], [["b",0,"x"], true], [[]])'
# no output at all!

Expected behaviour

I would expect this to either just generate the output once the input generator ends, or to throw an error about the missing terminator.

Background

I was trying to merge multiple objects that are provided separately. I later found out that this can be achieved more easily with

$ echo '{"a":13} {"b":42}' | jq -c --slurp 'reduce .[] as $obj ({}; . * $obj)'
{"a":13,"b":42}
# or
$ echo '{"a":13} {"b":42}' | jq -cn 'reduce inputs as $obj ({}; . * $obj)'
{"a":13,"b":42}

but my first attempt was to use streams

$ echo '{"a":13} {"b":42}' | jq -c --stream 'fromstream(.)'
# no output at all?! bug?
$ echo '{"a":13} {"b":42}' | jq -c --stream --slurp 'fromstream(.[])'
{"a":13}
{"b":42}
# not what I wanted, but ok
$ echo '{"a":13} {"b":42}' | jq -cn --stream 'fromstream(inputs)'
{"a":13}
{"b":42}
# same thing

before I realised that I had to remove (filter out) the terminators between the objects and then append my own at the end:

$ echo '{"a":13} {"b":42}' | jq -n --stream 'fromstream((inputs | select(has(1))), [[0]])' -c
{"a":13,"b":42}