Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Path and RESP3 #1090

Open
Didas-git opened this issue Sep 16, 2023 · 15 comments · May be fixed by #1130
Open

JSON Path and RESP3 #1090

Didas-git opened this issue Sep 16, 2023 · 15 comments · May be fixed by #1130

Comments

@Didas-git
Copy link

Didas-git commented Sep 16, 2023

In RedisJSON 2.6 RESP3 support was added and with this came a frightening change which was the deprecation notice for "legacy paths".

This brings some problems to the table that were not covered in the changelogs that i think are crucial for users to know because they will break a lot of code when switching over to RESP3.

The problem

The main problem here is that when the deprecated paths get removed there will be no way to retrieve the information without it being wrapped in an array, and this gets worse if your root is an array, this also introduces an issue where commands such as JSON.GET default to $ instead of . and this is what will break a lot of code.

Let me exemplify what this translates to in code.

Base:

JSON.SET ObjectExample "$" '{ "a": 1, "b": 2 }'
JSON.SET ArrayExample "$" '[3, { "c": 4 }]'

RESP2:

JSON.GET ObjectExample
{
  "a": 1,
  "b": 2
}
JSON.GET ArrayExample
[
  3,
  {
    "c": 4
  }
]

RESP3:

JSON.GET ObjectExample
[
  {
    "a": 1,
    "b": 2
  }
]
JSON.GET ArrayExample
[
  [
    3,
    {
      "c": 4
    }
  ]
]

With that said, that is not the only problem getting rid of "legacy paths" will cause.
Another big problem is in the behavior of JSON.MGET

Using .

JSON.MGET ObjectExample ArrayExample "."
[
  {
    "a": 1,
    "b": 2
  },
  [
    3,
    {
      "c": 4
    }
  ]
]

Using $

JSON.MGET ObjectExample ArrayExample "$"
[
  [
    {
      "a": 1,
      "b": 2
    }
  ],
  [
    [
      3,
      {
        "c": 4
      }
    ]
  ]
]

There are also other commands that will get affected by this but i think this examples are enough to get my point.

While the addition of RESP3 is welcome the changes that came along are not that desired, while removing unnecessary features is good what we are seeing here is the removal of a core feature that will just leave users hanging with no choices.

If this doesn't get addressed i can see a lot of issues being opened in the future regarding it because its not what users want, users want to be able to get their data without any additional work, returning an array that will always be discarded is a waste of memory, bandwidth and performance.

The Solution(s)

With that said, I'm not saying that we should reverse de deprecation warning but instead that at least an alternative is given.

I was searching for a while on the JSON Path "spec" and it seems like there is not real standard that everyone follows and each one has their own twist or difference so i don't think we should sit here and start arguing that x spec says y and instead we should focus on a way to provide the feature that is being removed in another way because user satisfaction should be the priority over a "spec" that changes depending on what source you look at.

To find a solution we have to keep somethings in mind:

  1. Do not change any of the current behavior
  2. Do not break compatibility with pre-resp3 version
  3. Do not overcomplicate something that used to be simple

The first idea that comes to mind would be allowing $* since this syntax resembles XPath just like the majority of JSONPath does. I think this is the simplest and fastest solution however there might be some complications that I'm not aware that could make this not viable.

One option that came to mind while writing this was to allow "top level" @ but it might interfere with RediSearch.

Talking about RediSearch we also have the option of adding something like RediSearch's DIALECT option but to instead control how $ will behave when alone, however this will most likely go against one of the points mentioned above.

Conclusion

I hope this gets addressed and i would highly appreciate that anyone who reads this gives their own honest opinion, i see this as something that both the community and the devs should agree on before changing so we don't run into the same problem again.

@LiorKogan
Copy link
Member

LiorKogan commented Sep 16, 2023

@Didas-git, thank you.

The "extra array" makes sense once you consider that a non-legacy path can have multiple matches. Consider the following:

redis> JSON.SET a $ '{"a":1, "b":2}'
OK

Now, this would be the result under both RESP2 and RESP3:

redis> JSON.GET a $.*
"[1,2]"

Now, in your example, under RESP2:

redis> JSON.GET ObjectExample
"{\"a\":1,\"b\":2}"

but this is equivalent to JSON.GET ObjectExample . and the result will have no "extra array" under both RESP2 and RESP3.

While under RESP3:

redis> JSON.GET ObjectExample
"[{\"a\":1,\"b\":2}]"

but this is equivalent to JSON.GET ObjectExample $ and the result will have an "extra array" under both RESP2 and RESP3.


So this is not about RESP2 vs., RESP3, but about our decision to change the default path under RESP3 to $. This is something we wanted to introduce a long time ago, but the introduction of RESP3 seems to be an appropriate time. You can stay with RESP2 to avoid a breaking change until you make the required fixes in your code.

There are multiple reasons to prefer 'non-legacy' paths. Just to mention one of them: the current format is JSON.GET is JSON.GET key [INDENT indent] [NEWLINE newline] [SPACE space] [path [path ...]]. Now, suppose that you want to refer to a legacy path named INDENT. This requires some tricky code we want to avoid when parsing commands.

Note: to clarify the terminology: a non-legacy path is any path that is equals to $, starts with $., or starts with a $[.

@Didas-git
Copy link
Author

Didas-git commented Sep 16, 2023

No, the issue is about having no way to reproduce the old behavior.
Sure the defaults changed and were used as example but as i mentioned in the issue the problem is not removing legacy operators or changing the defaults, the problem is that before you had a way to return the entire object without the need for an extra array.
$.* has not even been mentioned because it has always existed.
But $ is the default now which doesnt behave the same as the old . and when the old methods get removed there will be no way to get the same results.
JSON.MGET is one of the commands that gets most affected by this, coding using JSON.MGET with path $ will have to lose a lot of time "unwrapping" every object from each array.
And i dont think that saying "just stay in RESP2" is the right move either because moving over RESP3 should be recommended but i dont think people will like to be forced to change their entire codebases just because there is no alternative to what used to be default behavior.

You mentioned how . can have multiple matches but i haven't seen anything like that and i do not understand how that would be possible.

@LiorKogan
Copy link
Member

I mentioned that non-legacy paths can have multiple matches, not ..

@Didas-git
Copy link
Author

Didas-git commented Sep 16, 2023

I saw you added more to the message so this is a reply to that.
I agreed with the solution of removing legacy paths because they do have issues but, the main problem and the main issue that i was hopping this issue would make clear is that there is no way to reproduce the old behavior, there should at least be an alternative that would give us the same output as . because even some of the official libraries rely on that behavior.

@Didas-git
Copy link
Author

I mentioned that non-legacy paths can have multiple matches, not ..

I appologise for missunderstanding that part, however, that still doesnt make much sense to me, how can they have multiple matches? A path should only represent 1 value since they are unique.

@LiorKogan
Copy link
Member

How can they have multiple matches?

I added a simple example above:

redis> JSON.SET a $ '{"a":1, "b":2}'
OK
redis> JSON.GET a $.*
"[1,2]"

As you can see - $.* has two matches.

Here is another example (recursive descent):

redis> JSON.SET b $ [[[1,2,3],[4,5,6]],[7,8,9]]
OK
redis> JSON.GET b $..[0]
"[[[1,2,3],[4,5,6]],[1,2,3],1,4,7]"

@LiorKogan
Copy link
Member

LiorKogan commented Sep 16, 2023

there should at least be an alternative that would give us the same output as .

This is something we will consider: obsolete all legacy path expressions, except ..

@Didas-git
Copy link
Author

How can they have multiple matches?

I added a simple example above:

redis> JSON.SET a $ '{"a":1, "b":2}'
OK
redis> JSON.GET a $.*
"[1,2]"

As you can see - $.* has two matches.

Here is another example (recursive descent):

redis> JSON.SET b $ [[[1,2,3],[4,5,6]],[7,8,9]]
OK
redis> JSON.GET b $..[0]
"[[[1,2,3],[4,5,6]],[1,2,3],1,4,7]"

I dont think we mean the same when we say matches, $.* just says return the value for every key in the root, it is technically 1 match because the output is the same given the same object, having more than 1 match would mean different results given the same object. At least thats how i see matching being used in this scenario.

@LiorKogan
Copy link
Member

Maybe this is just a terminology issue. Anyway, we need to traverse the JSON tree, find matches, and return them as an array. This is why in the general case non-legacy paths should have that 'extra array' in the reply.

@Didas-git
Copy link
Author

there should at least be an alternative that would give us the same output as .

This is something we will consider: obsolete all legacy path expressions, except ..

I will wait for a follow up on that then.
I do understand if . gets removed because of the aforementioned problems but if it does a proper replacement should be given and properly documented.

And on the note of documentation, i think that the default behavior changes for JSON.GET (and other affected commands, i didnt check all of them) between RESP2 and RESP3 should be documented, i wasn't able to find anything on it and i found about this due to someone asking for help and experimenting arround.

@Didas-git
Copy link
Author

Maybe this is just a terminology issue. Anyway, we need to traverse the JSON tree, find matches, and return them as an array. This is why in the general case non-legacy paths should have that 'extra array' in the reply.

It makes sense but specially when fetching the root (fetching the entire object) it just brings an undesired "abstraction" and extra work to the user.

Perhaps it could be checked if the path is just $ and return the entire object but that would break code that is using $ so thats also not desirable so maybe keeping . only might be a good solution.

@LiorKogan
Copy link
Member

Note the following change in the 2.6 GA release notes:

Instead of:

  • Legacy paths (paths that don't start with either $. or $[ or equal to $) are now regarded as deprecated

We now have:

  • Legacy paths (paths that don't start with either $. or $[ or equal to $) except those starting with . are now regarded as deprecated

@Didas-git
Copy link
Author

Note the following change in the 2.6 GA release notes:

Instead of:

  • Legacy paths (paths that don't start with either $. or $[ or equal to $) are now regarded as deprecated

We now have:

  • Legacy paths (paths that don't start with either $. or $[ or equal to $) except those starting with . are now regarded as deprecated

What about the default behavior changes in some of the commands? Will that be reverted back or added to the documentation somehwere?

The change of the main/default prefix was not documented anywhere and since . is not deprecated anymore this is one of the only breaking changes that is not documented which will lead to some issues when people try to move over RESP3, having it documented or having a simple "Moving to RESP3" added somewhere in the docs could come a long way to help ease in the transition since people would at least be aware of it.

@LiorKogan
Copy link
Member

We are not going to revert it.

And yes, we still need to document RESP3 (which is true for everything in Redis, not just JSON).

@Didas-git
Copy link
Author

We are not going to revert it.

Thats what i guessed...
Could we at least get some sort of warning or note that is really explicit with this change when the RESP3 docs come out?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants