Consensus across result types #44

cburgmer · 2020-06-07T20:15:36Z

Since #37 we now support finding a consensus independent on the different types of results the implementations give.
We handle

Array vs scalar results,
Empty responses (empty list or null) vs. not found errors.

Those rules now introduce a "mini-consensus" in itself, as they may argue over the type of the query:

Scalar with result
Non scalar with result
Scalar without result
Non scalar without result

Currently, at the time of writing, the following example seems to have implementations argue for either, scalar or non-scalar:
https://cburgmer.github.io/json-path-comparison/#array_slice_on_object

It has a clear consensus on no matches found. However:

3 implementations insist that it's a scalar query:

Kotlin_com.nfeld.jsonpathlite (returns null)
Java_com.jayway.jsonpath (returns NOT_FOUND, which it reserves for scalar queries only)
Objective-C_SMJJSONPat (returns NOT_FOUND, which it reserves for scalar queries only)

3 implementations insist that it's a non-scalar query, as all support scalar responses yet return an empty array:

Golang_github.com-PaesslerAG-jsonpath
Golang_github.com-bhmj-jsonslice
Elixir_warpath

This is not a theoretical issue, but one that will decide which of the implementations will fail the consensus, as the results may switch from [] to None or not found error or vice versa.

Now, this might surface an actual problem in the implementations: What would the user expect, that the query always is understood to be of the same type (scalar vs non-scalar) regardless of the document it is executed against, or not?

The problem however that I currently see is that the simple majority decides who "wins". The consensus rule of simple majority + 2 is not applied here.

One possible solution could be to ignore the differences and relax the requirement for implementations to be consistent in their response format.

Or we just accept that for the smaller set of implementations which tend to handle scalar responses differently to the majority, we require a less comfortable consensus.

The text was updated successfully, but these errors were encountered:

cburgmer · 2020-06-07T20:23:36Z

For the given example I raised an issue to follow up codeniko/JsonPathKt#5.

glyn · 2020-06-08T09:41:17Z

Now, this might surface an actual problem in the implementations: What would the user expect, that the query always is understood to be of the same type (scalar vs non-scalar) regardless of the document it is executed against, or not?

If an implementation documents its approach, either explicitly or in tests, then presumably that's what the users of that implementation would expect.

Do you think there are reasons why users would have more general (i.e. implementation independent) expectations around scalar vs non-scalar results?

The problem however that I currently see is that the simple majority decides who "wins". The consensus rule of simple majority + 2 is not applied here.

Not sure I follow. How do "winners" appear in the comparison matrix if there isn't a consensus?

One possible solution could be to ignore the differences and relax the requirement for implementations to be consistent in their response format.

I'm attracted to that approach as it avoids getting bogged down with the questions raised in this issue. I suppose it depends what the goal of the comparison project is.

Or we just accept that for the smaller set of implementations which tend to handle scalar responses differently to the majority, we require a less comfortable consensus.

Do we really want to get into multiple consensuses? That another question about the goals of this project.

cburgmer · 2020-06-08T19:12:23Z

Now, this might surface an actual problem in the implementations: What would the user expect, that the query always is understood to be of the same type (scalar vs non-scalar) regardless of the document it is executed against, or not?

If an implementation documents its approach, either explicitly or in tests, then presumably that's what the users of that implementation would expect.

That's a good point.

We are not trying to shed any more light into different ways to implement responses, but rather work as best as possible with what the implementations offer us.

The problem however that I currently see is that the simple majority decides who "wins". The consensus rule of simple majority + 2 is not applied here.

Not sure I follow. How do "winners" appear in the comparison matrix if there isn't a consensus?

I thought a bit more on where the problem described here comes from, and why it did not exist beforehand. The big change last week was in fact three slightly related ones:

Support certain errors as a consensus
Support different response types, adapted to the specific implementations (e.g. [], null, NotFoundError)
Automatically deduce whether a query is of scalar or non-scalar nature (Java_com.jayway.jsonpath calls it definite vs. indefinite)

The piece of information changed in 3. has always been important, so we don't accept e.g. null from a scalar implementation if the query allows for multiple matches and the implementation otherwise consistently returns [] in such cases.

Now the latter change 3. introduced the issue here.
Initially every query had its type hardcoded, decided by me. A file SCALAR_RESULT placed next to the query would tell the comparison to handle results of scalar implementations specifically.
Now this is not defined in the query anymore, but is automatically selected by just counting where the scalar implementations fall with their response.

For "array slice on object" the absence of this file used to indicate that a non-scalar response was expected. And thus Java_com.jayway.jsonpath used to fail the consensus. Now that this hard requirement was lifted, it's 3 against 3 whether it's scalar or not.

So, coming back to your point, that this is not a problem of consensus, but a problem of integrity of the implementation towards its documentation. I'm reading the documentation of Java_com.jayway.jsonpath and it does not call out array slice in https://github.com/json-path/JsonPath/#what-is-returned-when, although it does list the other notations.
To me it seems that their concept of definite (and indefinite) is purely tied to the selector. I will therefor raise a bug and seek to clarify with upstream.

For us, this leaves me with the question: As it is necessary for the comparison to handle different return types - we need to know whether a query is scalar or not - should we rather hard code the knowledge again, or is it fairer to look at the majority of scalar implementations to decide amongst them what it is?

I would possibly wait for a response for both the Java and Kotlin implementations (the ObjectiveC one says it will implement everything on par with Java_com.jayway.jsonpath anyway), and see whether that adds any more context. Otherwise I might favour to ignore this issue. The automatic aspect makes maintaining queries easier, and will seldom be wrong.

cburgmer mentioned this issue Jun 7, 2020

Support heterogenous responses #37

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consensus across result types #44

Consensus across result types #44

cburgmer commented Jun 7, 2020

cburgmer commented Jun 7, 2020

glyn commented Jun 8, 2020

cburgmer commented Jun 8, 2020

Consensus across result types #44

Consensus across result types #44

Comments

cburgmer commented Jun 7, 2020

cburgmer commented Jun 7, 2020

glyn commented Jun 8, 2020

cburgmer commented Jun 8, 2020