New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
json_extract and json_extract_scalar allow invalid JsonPaths (without $ at the start) #22589
Comments
Looks like the reason for Presto to allow such paths is fallback logic to Jayway:
com.jayway.jsonpath.JsonPath.compile(path) happily translates 'foo' path into '$[foo]' path. |
Jayway has logic to check if path starts with '$' or '@' and if not to prepend the path with '$.' before compiling. com/jayway/jsonpath/json-path/2.6.0/json-path-2.6.0.jar!/com/jayway/jsonpath/internal/path/PathCompiler.class
|
Also, JSON Path seems to allow either dot-notation (
|
This logic seems problematic as it produces some unexpected results. ".[0]" path is converted to "..[0]" which performs deep scan and returns a list of first elements for ALL arrays in the JSON document.
I think it is fine to change Velox to treat leading '$' as optional, but I don't think Velox should prepend '$.' to the path when leading '$' is missing. CC: @sviscaino |
…alar] Presto functions Summary: json_extract_scalar and json_extract Presto function allow paths like 'foo' and 'foo.bar'. These paths are non-standard. They are enabled via use of Jayway engine, which handles paths without leading '$' by prepending the path with '$.'. Path 'foo' becomes '$.foo'. However, this logic has some unexpected side effects. Path '.foo' becomes '$..foo', which uses deep scan operator '..' and matches 'foo' keys anywhere in the document, rather then just at the top level. This change allows Velox to support paths like 'foo' and 'foo.bar[2].bar', but not '.foo' or '[10]'. These paths are handled as if they contained leading '$.' characters. See prestodb/presto#22589 Part of facebookincubator#7049 Differential Revision: D56465662
…alar] Presto functions (facebookincubator#9584) Summary: json_extract_scalar and json_extract Presto function allow paths like 'foo' and 'foo.bar'. These paths are non-standard. They are enabled via use of Jayway engine, which handles paths without leading '$' by prepending the path with '$.'. Path 'foo' becomes '$.foo'. However, this logic has some unexpected side effects. Path '.foo' becomes '$..foo', which uses deep scan operator '..' and matches 'foo' keys anywhere in the document, rather then just at the top level. This change allows Velox to support paths like 'foo' and 'foo.bar[2].bar', but not '.foo' or '[10]'. These paths are handled as if they contained leading '$.' characters. See prestodb/presto#22589 Part of facebookincubator#7049 Differential Revision: D56465662
…alar] Presto functions (facebookincubator#9584) Summary: json_extract_scalar and json_extract Presto function allow paths like 'foo' and 'foo.bar'. These paths are non-standard. They are enabled via use of Jayway engine, which handles paths without leading '$' by prepending the path with '$.'. Path 'foo' becomes '$.foo'. However, this logic has some unexpected side effects. Path '.foo' becomes '$..foo', which uses deep scan operator '..' and matches 'foo' keys anywhere in the document, rather then just at the top level. This change allows Velox to support paths like 'foo' and 'foo.bar[2].bar', but not '.foo' or '[10]'. These paths are handled as if they contained leading '$.' characters. See prestodb/presto#22589 Part of facebookincubator#7049 Differential Revision: D56465662
…Presto Summary: json_extract_scalar and json_extract Presto function allow paths like '$.[0]. [1].[2].foo' that include "redundant" dots before opening brackets. These paths are non-standard. They are enabled via use of Jayway engine. This change allows Velox to support these paths by ignoring dots that appear before opening brackets. See prestodb/presto#22589 Part of facebookincubator#7049 Differential Revision: D56467690
…alar] Presto functions (facebookincubator#9584) Summary: json_extract_scalar and json_extract Presto function allow paths like 'foo' and 'foo.bar'. These paths are non-standard. They are enabled via use of Jayway engine, which handles paths without leading '$' by prepending the path with '$.'. Path 'foo' becomes '$.foo'. However, this logic has some unexpected side effects. Path '.foo' becomes '$..foo', which uses deep scan operator '..' and matches 'foo' keys anywhere in the document, rather then just at the top level. This change allows Velox to support paths like 'foo' and 'foo.bar[2].bar', but not '.foo' or '[10]'. These paths are handled as if they contained leading '$.' characters. See prestodb/presto#22589 Part of facebookincubator#7049 Differential Revision: D56465662
…Presto (facebookincubator#9585) Summary: json_extract_scalar and json_extract Presto function allow paths like '$.[0]. [1].[2].foo' that include "redundant" dots before opening brackets. These paths are non-standard. They are enabled via use of Jayway engine. This change allows Velox to support these paths by ignoring dots that appear before opening brackets. See prestodb/presto#22589 Part of facebookincubator#7049 Reviewed By: xiaoxmeng Differential Revision: D56467690
…alar] Presto functions (facebookincubator#9584) Summary: json_extract_scalar and json_extract Presto function allow paths like 'foo' and 'foo.bar'. These paths are non-standard. They are enabled via use of Jayway engine, which handles paths without leading '$' by prepending the path with '$.'. Path 'foo' becomes '$.foo'. However, this logic has some unexpected side effects. Path '.foo' becomes '$..foo', which uses deep scan operator '..' and matches 'foo' keys anywhere in the document, rather then just at the top level. This change allows Velox to support paths like 'foo' and 'foo.bar[2].bar', but not '.foo' or '[10]'. These paths are handled as if they contained leading '$.' characters. See prestodb/presto#22589 Part of facebookincubator#7049 Differential Revision: D56465662
…Presto (facebookincubator#9585) Summary: json_extract_scalar and json_extract Presto function allow paths like '$.[0]. [1].[2].foo' that include "redundant" dots before opening brackets. These paths are non-standard. They are enabled via use of Jayway engine. This change allows Velox to support these paths by ignoring dots that appear before opening brackets. See prestodb/presto#22589 Part of facebookincubator#7049 Reviewed By: xiaoxmeng Differential Revision: D56467690
…alar] Presto functions (#9584) Summary: Pull Request resolved: #9584 json_extract_scalar and json_extract Presto function allow paths like 'foo' and 'foo.bar'. These paths are non-standard. They are enabled via use of Jayway engine, which handles paths without leading '$' by prepending the path with '$.'. Path 'foo' becomes '$.foo'. However, this logic has some unexpected side effects. Path '.foo' becomes '$..foo', which uses deep scan operator '..' and matches 'foo' keys anywhere in the document, rather then just at the top level. This change allows Velox to support paths like 'foo' and 'foo.bar[2].bar', but not '.foo' or '[10]'. These paths are handled as if they contained leading '$.' characters. See prestodb/presto#22589 Part of #7049 Reviewed By: xiaoxmeng Differential Revision: D56465662 fbshipit-source-id: c32bfe6c1b2587b9f996605a169abb934fd3d95c
…Presto (#9585) Summary: Pull Request resolved: #9585 json_extract_scalar and json_extract Presto function allow paths like '$.[0]. [1].[2].foo' that include "redundant" dots before opening brackets. These paths are non-standard. They are enabled via use of Jayway engine. This change allows Velox to support these paths by ignoring dots that appear before opening brackets. See prestodb/presto#22589 Part of #7049 Reviewed By: xiaoxmeng Differential Revision: D56467690 fbshipit-source-id: 42866815ff186510679a0ab877e2796d324309c7
Json Path seems to require all paths to start with '$': https://goessner.net/articles/JsonPath/
However, json_extract and json_extract_scalar functions treat leading $ as optional.
Looks like this "bug" was introduced in dbe2ed9#diff-c1770855d6e7c3f3e05245346d9f1ee514df7310d141019e51631124706109e6
Before this change there was an explicit check for leading $:
After, this change, leading $ became optional:
Velox currently requires JSON Paths to start with '$'.
CC: @tdcmeehan @Yuhta @kagamiori @kgpai @amitkdutta
The text was updated successfully, but these errors were encountered: