I searched the issues and found no similar issues.
Description
In a multi-table join, the result of each intermediate join is used as the input to the subsequent joins.
Statistics are collected periodically. If data is updated between two collections, the statistics are stale until the next collection runs. A multi-table join query executed in that window may get a poor plan from the optimizer, because the statistics it relies on no longer reflect the data.
For example:
select * from example_tbl t1 join example_tbl02 t2 on t1.city=t2.city and t1.city="成都" join example_tbl03 t3 on t1.city=t3.city;
This is the plan:
In this case, when statistics were last collected, no rows contained '成都' (Chengdu). Rows containing '成都' were inserted before the next collection, so the stale statistics lead to a poor execution plan:
example_tbl02 is broadcast to the other nodes for the join, but example_tbl02 actually has a very large number of rows containing '成都', which may cause an OOM during broadcast distribution.
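The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the optimizer's actual cost model: the names `ColumnStats`, `estimate_rows`, `choose_distribution` and the threshold `BROADCAST_ROW_LIMIT` are all hypothetical, and the row counts are made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: broadcast a join input only when it is estimated
# to have at most this many rows. Real optimizers use richer cost models.
BROADCAST_ROW_LIMIT = 10_000


@dataclass
class ColumnStats:
    """Per-value row counts as they were at the last statistics collection."""
    value_counts: dict = field(default_factory=dict)


def estimate_rows(stats: ColumnStats, value: str) -> int:
    # A value that was absent at collection time is estimated as (almost)
    # no rows; clamp to 1 so the estimate is never zero.
    return max(1, stats.value_counts.get(value, 0))


def choose_distribution(estimated_rows: int) -> str:
    # Small inputs are copied to every node; large ones are shuffled by key.
    return "BROADCAST" if estimated_rows <= BROADCAST_ROW_LIMIT else "SHUFFLE"


# Statistics collected before any '成都' rows existed in example_tbl02.
stale_stats = ColumnStats(value_counts={"other": 1_000_000})

# Rows inserted after the last collection (illustrative number).
actual_rows = 5_000_000

stale_estimate = estimate_rows(stale_stats, "成都")
stale_choice = choose_distribution(stale_estimate)   # decided from stale stats
fresh_choice = choose_distribution(actual_rows)      # decided from real size

print(f"estimated rows: {stale_estimate}, plan: {stale_choice}")
print(f"actual rows:    {actual_rows}, plan: {fresh_choice}")
```

With stale statistics the model picks a broadcast for an input that is in fact millions of rows, which is the situation that can exhaust memory on the receiving nodes.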
Solution
654ead9