Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming rows as comma-separated JSON that form and array #106

Open
jeromegn opened this issue Dec 2, 2023 · 2 comments
Open

Streaming rows as comma-separated JSON that form and array #106

jeromegn opened this issue Dec 2, 2023 · 2 comments

Comments

@jeromegn
Copy link
Member

jeromegn commented Dec 2, 2023

Not all languages / frameworks / libraries have an easy time dealing with NDJSON out of the box, even if they have streaming capabilities.

It might be interesting to explore and alternate format: streaming as a normal JSON array.

How? First we send the [ character, then we send each row as an independent JSON object (well, an array? could be configurable to set the column names properly), send a comma in between, send a ] at the end.

This way we don't have to hold onto all the rows in-memory but can still stream a normal JSON array.

@lillianberryfly
Copy link

I just submitted a basic implementation in #110.

This works:

lillian@mbp corrosion % curl -s -H content-type:application/json 'localhost:8080/v1/queries?ndjson=false' -d '"select * from todos"' | jq
[
  {
    "columns": [
      "id",
      "title",
      "completed_at"
    ]
  },
  {
    "row": [
      1,
      [
        1,
        "get groceries",
        1701627653
      ]
    ]
  },
  {
    "row": [
      2,
      [
        2,
        "write some code",
        null
      ]
    ]
  },
  {
    "eoq": {
      "time": 9.59E-7
    }
  }
]

NDJSON still works as well (at least it seems like it works? the test is failing!):

lillian@mbp corrosion % curl -s -H content-type:application/json 'localhost:8080/v1/queries?ndjson=true' -d '"select * from todos"'
{"columns":["id","title","completed_at"]}
{"row":[1,[1,"get groceries",1701627653]]}
{"row":[2,[2,"write some code",null]]}
{"eoq":{"time":2.583e-6}}

(it took me a minute to figure out that it still returns NDJSON, jq apparently knows how to parse it)

Unfortunately my current implementation requires the client to specify whether it wants NDJSON or not; there's no default at the moment.

Also... I'm not sure I like getting data like this in the first place. For a client that doesn't support (or want) NDJSON, it would be better to parse a JSON object like this:

{
  "columns": [
    "id",
    "title",
    "completed_at"
  ],
  "rows": [
    [
      1,
      [
        1,
        "get groceries",
        1701627653
      ]
    ],
    [
      2,
      [
        2,
        "write some code",
        null
      ]
    ]
  ],
  "eoq": {
    "time": 9.59E-7
  }
}

Maybe we could have NDJSON, regular comma-separated JSON, and this third method as options, while defaulting to NDJSON? Let me know!

@jeromegn
Copy link
Member Author

jeromegn commented Dec 7, 2023

I think this might be more practical if there was a way to specify more settings about the expected format, e.g. "just send me JSON objects in an array" where the column names would be used as fields in the JSON object, no need for a columns object at the start.

eoq is not be necessary, but it doesn't hurt anything. Possibly something configurable too.

The reason for NDJSON is so clients don't need to buffer large quantities of data. It's not always possible or required to do that so the alternative comma-separated format sounds nice.

I think the default format for a client that just wants a big valid JSON array should be:

[
  {"col1": "value", "col2": 12345, "col3": null},
  {"col1": "value", "col2": 12345, "col3": null},
  {"col1": "value", "col2": 12345, "col3": null}
]

because they'll probably have to re-arrange the data themselves if we're sending it in a different format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants