The hint set sometimes do not match the actual plan #3

mtakahar · 2023-05-13T02:50:35Z

In some cases, the optimizer does not (or cannot) generate the plan choice specified by a hint. The framework should detect and reject/discard the plans that don't match the hint set.

Unfollowed/follow-able hints typically show up as the disable_cost (10000000000.00) added, but I am not sure whether that's a reliable indication. Since it would be cumbersome and tricky to examine each plan by the script, I think it's okay to look for the very high costs until we find the cases that require other detection mechanism.

Examples:

IndexScan is not considered without an eligible index access predicate or ORDER BY on the range key prefix. The optimizer considers IndexOnlyScan when applicable without the access key or result ordering benefits, however.

{
    "tag": "",
    "query": "select * from t1000",
    "query_hash": "ae2881962926131e01929c5c660b20d8",
    "tables": null,
    "optimizer_tips": null,
    "explain_hints": "IndexScan(t1000)",
    "execution_plan": {"full_str": "YB Seq Scan on t1000  (cost=10000000000.00..10000000100.00 rows=1000 width=4108)"},
    "execution_time_ms": 6.5408203125,
    "result_cardinality": 1000,
    "result_hash": "f6909aba31c4078a1ed6217ffa61f9e2",
    "parameters": [],
    "optimizations": null,
    "execution_plan_heatmap": null
},

The case below is tricky because the optimizer internally converts the IN-subquery to an inner join and assigns "ANY_subquery" alias when it satisfies certain set of rules.
Because of internally created inline view with a special alias, it doesn't recognize t10000w in the Leading hint.

{"tag": "",
 "query": "select * from t10000w where c2 in (select c1 from t100 where exists (select 0 from t1000 where c1 = t100.c2))",
 "query_hash": "1e34dd4a9714a40051ddf779d4620768",
 "tables": null,
 "optimizer_tips": null,
 "explain_hints": "Leading (( ( t10000w t100 ) t1000 ))  HashJoin(t10000w t100) HashJoin(t10000w t100 t1000) SeqScan(t10000w) SeqScan(t100) SeqScan(t1000) ",
 "execution_plan": {"full_str": "Nested Loop Semi Join  (cost=20000000000.00..20000017610.50 rows=400 width=16396)\n  Join Filter: (t10000w.c2 = t100.c1)\n  ->  YB Seq Scan on t10000w  (cost=0.00..1000.00 rows=10000 width=16396)\n  ->  Materialize  (cost=10000000000.00..10000001610.75 rows=100 width=4)\n        ->  Nested Loop  (cost=10000000000.00..10000001610.25 rows=100 width=4)\n              Join Filter: (t100.c2 = t1000.c1)\n              ->  YB Seq Scan on t1000  (cost=0.00..100.00 rows=1000 width=4)\n              ->  Materialize  (cost=0.00..10.50 rows=100 width=8)\n                    ->  YB Seq Scan on t100  (cost=0.00..10.00 rows=100 width=8)"},
 "execution_time_ms": 180.15498046875,
 "result_cardinality": 396,
 "result_hash": "55dc6239660e67c5602c2e2db1dfe960",
 "parameters": [],
 "optimizations": null,
 "execution_plan_heatmap": null
},

It follows the hints by specifying the ANY_subquery alias in place of t10000w. (I'll create a separate issue for this as an enhancement request.):

explain /*+ Leading (( ( t10000w ANY_subquery ) t1000 )) SeqScan(t10000w) SeqScan(t100) SeqScan(t1000) */select * from t10000w where c2 in (select c1 from t100 where exists (select 0 from t1000 where c1 = t100.c2));
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Hash Semi Join  (cost=118.50..1149.20 rows=400 width=16396)
   Hash Cond: (t10000w.c2 = t100.c1)
   ->  YB Seq Scan on t10000w  (cost=0.00..1000.00 rows=10000 width=16396)
   ->  Hash  (cost=117.25..117.25 rows=100 width=4)
         ->  Hash Join  (cost=11.25..117.25 rows=100 width=4)
               Hash Cond: (t1000.c1 = t100.c2)
               ->  YB Seq Scan on t1000  (cost=0.00..100.00 rows=1000 width=4)
               ->  Hash  (cost=10.00..10.00 rows=100 width=8)
                     ->  YB Seq Scan on t100  (cost=0.00..10.00 rows=100 width=8)
(9 rows)

Test case

Please see the model tarball attached to the issue #2

The text was updated successfully, but these errors were encountered:

mtakahar assigned qvad May 15, 2023

mtakahar added the bug Something isn't working label May 15, 2023

mtakahar changed the title ~~Detect and reject the plans that did not (or was not able to) follow the hints~~ The hint set sometimes do not match the actual plan May 15, 2023

mtakahar mentioned this issue May 15, 2023

Produce Leading and join method hints with ANY_subquery alias for IN-subquery #8

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The hint set sometimes do not match the actual plan #3

The hint set sometimes do not match the actual plan #3

mtakahar commented May 13, 2023 •

edited

The hint set sometimes do not match the actual plan #3

The hint set sometimes do not match the actual plan #3

Comments

mtakahar commented May 13, 2023 • edited

Test case

mtakahar commented May 13, 2023 •

edited