Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting IN in new Engine [BUG] #71

Closed
FreCap opened this issue May 19, 2021 · 5 comments
Closed

Supporting IN in new Engine [BUG] #71

FreCap opened this issue May 19, 2021 · 5 comments
Labels
bug Something isn't working

Comments

@FreCap
Copy link
Contributor

FreCap commented May 19, 2021

[2021-05-19T01:03:13,199][INFO ][c.a.o.s.l.p.RestSQLQueryAction] [elasticsearch_wqrgrid0660_0] Request is falling back to old SQL engine due to: Failed to parse query due to offending symbol [in] at: '
    SELECT ID FROM crygojxttsi6fvsthw2eaq 
    WHERE (Type in' <--- HERE... More details: Expecting tokens in {'AND', 'IS', 'LIKE', 'NOT', 'OR', 'REGEXP', '*', '/', '%', '+', '-', '=', '>', '<', '!', '.', ')'}```

It looks like the IN is not yet supported in the new engine. 

Some queries that have both `IN` and a comparison between two fields will fail because the old engine doesn't seem to be able to handle the comparison of the two fields, and the new one doesn't handle IN.

I would like to work on a patch, do you know of any specific blocker or macro steps that I should work on? 
A 3-4 liners practical guidance would be great :) 

Thank you!
@FreCap FreCap added the bug Something isn't working label May 19, 2021
@chloe-zh
Copy link
Contributor

Thanks for your contributions! You can check out this pr opendistro-for-elasticsearch/sql#1067 this is to support the BETWEEN...AND... in new engine, should be very similar to IN.

@FreCap
Copy link
Contributor Author

FreCap commented May 20, 2021

@chloe-zh I'm facing some challenges in passing an array of parameters.
I was trying to get inspiration from
image
But it seems not implemented yet.

Do you think we need to create a DataType.ARRAY? (That's what I'm trying to do since otherwise the function with N parameters won't match any impl.
image

Any other advice?

@Yury-Fridlyand
Copy link
Collaborator

Yury-Fridlyand commented Nov 19, 2021

I found few more cases when IN clause doesn't work or causes errors:

  1. IN in WHERE with subquery causes error:
curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query": "SELECT A, B FROM (SELECT `key` as A, DAYOFWEEK(date1) AS B from calcs) tmp WHERE B IN (1, 7)"}'
{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "The following method is not supported in Schema: DAYOFWEEK",
    "type": "UnsupportedOperationException"
  },
  "status": 500
}

while WHERE without IN and with subquery executed successfully:

curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query": "SELECT A, B FROM (SELECT `key` as A, DAYOFWEEK(date1) AS B from calcs) tmp WHERE B = 1"}'
{
  "schema": [
    {
      "name": "A",
      "type": "keyword"
    },
    {
      "name": "B",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      "key03",
      1
    ],
    [
      "key10",
      1
    ]
  ],
  "total": 2,
  "size": 2,
  "status": 200
}
  1. IN in CASE WHEN clause in SELECT statement fails:
curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query" : "SELECT (CASE WHEN (int0 IN (1, 4)) THEN NULL ELSE int0 END) AS res FROM calcs"}'
{
  "error": {
    "reason": "Invalid SQL query",
    "details": "Cannot invoke \"Object.getClass()\" because \"expr\" is null",
    "type": "NullPointerException"
  },
  "status": 400
}
  1. IN in WHERE statement with TIMESTAMP fails:
curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query" : "SELECT `key`, date0 FROM calcs WHERE (date0 IN (TIMESTAMP(\"1972-07-04 00:00:00\"), TIMESTAMP(\"1975-11-12 00:00:00\"), TIMESTAMP(\"2004-06-19 00:00:00\")))"}'
{
  "error": {
    "reason": "Error occurred in OpenSearch engine: all shards failed",
    "details": "Shard[0]: [calcs/Qpr1MizFQ_CM9wRfEKOCGw] QueryShardException[failed to create query: failed to parse date field [TIMESTAMP('1972-07-04 00:00:00')] with format [yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||date_time_no_millis||epoch_millis]: [failed to parse date field [TIMESTAMP('1972-07-04 00:00:00')] with format [yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||date_time_no_millis||epoch_millis]]]; nested: OpenSearchParseException[failed to parse date field [TIMESTAMP('1972-07-04 00:00:00')] with format [yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||date_time_no_millis||epoch_millis]: [failed to parse date field [TIMESTAMP('1972-07-04 00:00:00')] with format [yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||date_time_no_millis||epoch_millis]]]; nested: IllegalArgumentException[failed to parse date field [TIMESTAMP('1972-07-04 00:00:00')] with format [yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||date_time_no_millis||epoch_millis]]; nested: DateTimeParseException[Failed to parse with all enclosed parsers];\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine.",
    "type": "SearchPhaseExecutionException"
  },
  "status": 400
}
  1. IN in complex WHERE statement fails even with simple types:
curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query" : "SELECT `key`, date0 FROM calcs WHERE int0 IN (1,4) OR str0 IN (\"FURNITURE\") OR date0 = TIMESTAMP(\"1972-07-04 00:00:00\")"}'
{
  "error": {
    "reason": "Error occurred in OpenSearch engine: all shards failed",
    "details": "Shard[0]: ScriptException[runtime error]; nested: DateTimeParseException[Text '1972-07-04 00:00:00' could not be parsed at index 10];\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine.",
    "type": "SearchPhaseExecutionException"
  },
  "status": 400
}

But this query is OK:

curl -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query" : "SELECT `key`, date0 FROM calcs WHERE int0 IN (1,4) OR str0 IN (\"FURNITURE\")"}'
{
  "schema": [
    {
      "name": "key",
      "type": "keyword"
    },
    {
      "name": "date0",
      "type": "date"
    }
  ],
  "total": 5,
  "datarows": [
    [
      "key00",
      "2004-04-15 00:00:00.000"
    ],
    [
      "key01",
      "1972-07-04 00:00:00.000"
    ],
    [
      "key10",
      null
    ],
    [
      "key13",
      null
    ],
    [
      "key15",
      null
    ]
  ],
  "size": 5,
  "status": 200
}

Tested on version 1.1.0

@ghost
Copy link

ghost commented Nov 19, 2021

This issue affects TDVT tests:

  • exprtests/standard\setup.logical.txt
  • exprtests/standard\setup.logical.case.null.txt

@penghuo
Copy link
Collaborator

penghuo commented Feb 21, 2022

Fixed by #420

@penghuo penghuo closed this as completed Feb 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants