Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocate Material Above Nodes #1437

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

zhangyuqin1998
Copy link

@zhangyuqin1998 zhangyuqin1998 commented Jun 16, 2023

Task Description

In the volcano model, the pipeline's working mode makes it difficult for us to analyze the actual execution performance of a certain operator. To clarify the data flow of the operator on the timeline for the purpose of analyzing its actual execution performance, material operators can be inserted before and after the operator to divide the data flow.
close #1454

Solution Description

Using hint to insert material operator, using /*+ blocking('all') */ to insert material operator above all operators, using /*+ blocking('dfo') */ to insert material operator above exchange in operators, using /*+ blocking(4, 5, 6) */ to insert material insert material operators before specific operators.

To achieve this function, we added a TraverseOp named ALLOC_OP in ObLogPlan::plan_traverse_loop . While passing the logical plan tree, we will do some prepare job and decide whether to insert a material operator.

Besides, we can not insert material ops in some situations, such as right child of nested loop join and sub plan filter, and it is difficult to insert material ops for remote plan. A whitelist mechanism based on pattern matching is established to recognize these situations

Passed Regressions

N/A

Upgrade Compatibility

N/A

Other Information

before inserting material
image
image

after inserting material
image
image

we can see that the pipeline is divided after inserting material operators

Release Note

N/A

Copy link
Contributor

@raywill raywill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

请分别按照 review 修改。特别注意:

  1. 一些成员,不是那么明显的,添加适当注释
  2. 错误码覆盖问题
  3. push_back 等函数调用的错误码返回值检查问题。如果确认不需要检查,那么就应该返回void,或者在调用处这么写: (void) func();

@@ -165,7 +165,7 @@ enum ObMatchAgainstMode {

#define IS_OUTER_OR_CONNECT_BY_JOIN(join_type) (IS_OUTER_JOIN(join_type) || CONNECT_BY_JOIN == join_type)

#define IS_DUMMY_PHY_OPERATOR(op_type) ((op_type == PHY_MONITORING_DUMP))
#define IS_DUMMY_PHY_OPERATOR(op_type) ((op_type == PHY_MONITORING_DUMP || op_type == PHY_MATERIAL))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个地方会不会有点危险? PHY_MATERIAL 并不总是 DUMMY 算子,有些时候是实际有用的常规算子。

int ObOperator::get_real_child(ObOperator *&child, const int32_t child_idx) 请确认下这里是否安全。

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个地方会不会有点危险? PHY_MATERIAL 并不总是 DUMMY 算子,有些时候是实际有用的常规算子。

int ObOperator::get_real_child(ObOperator *&child, const int32_t child_idx) 请确认下这里是否安全。

确实是这样,这里我先改回去了

@@ -11096,7 +11095,8 @@ int ObLogPlan::generate_plan()
} else if (OB_FAIL(plan_traverse_loop(RUNTIME_FILTER,
ALLOC_GI,
PX_PIPE_BLOCKING,
ALLOC_MONITORING_DUMP,
OPERATOR_NUMBERING,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

做两次 OPERATOR_NUMBERING?为什么?必要性是?原来为什么不需要?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

做两次 OPERATOR_NUMBERING?为什么?必要性是?原来为什么不需要?

因为白名单和枚举的插入级别需要根据原始op id判断是否插入算子,所以需要先OPERATOR_NUMBERING分配原始的op id。插入物化算子后,再做一次OPERATOR_NUMBERING生成新的op id。
我后面在代码里加下注释解释这个事情~

} else if (NULL != node->str_value_){
ObString alloc_level;
alloc_level.assign_ptr(node->str_value_, static_cast<int32_t>(node->str_len_));
if (0 == alloc_level.case_compare("all") || 0 == alloc_level.case_compare("ALL")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么不用 case 不敏感的 compare 函数?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么不用 case 不敏感的 compare 函数?

这里是我搞错了,case_compare本身就是大小写不敏感,已经修改~

}
}
if (OB_FAIL(alloc_op_hints.push_back(alloc_op_hint))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

新手写 OB 代码最常犯的一个错误:错误码覆盖。假如上面的分支抛出了一个错误(L13599),这里 OB_FAIL 宏会覆盖这个错误。

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

新手写 OB 代码最常犯的一个错误:错误码覆盖。假如上面的分支抛出了一个错误(L13599),这里 OB_FAIL 宏会覆盖这个错误。

好滴,我把其他地方也检查一遍~

src/sql/resolver/dml/ob_hint.cpp Outdated Show resolved Hide resolved
src/sql/optimizer/ob_logical_operator.cpp Outdated Show resolved Hide resolved
src/sql/optimizer/ob_logical_operator.cpp Outdated Show resolved Hide resolved
src/sql/optimizer/ob_logical_operator.cpp Outdated Show resolved Hide resolved
src/sql/optimizer/ob_logical_operator.cpp Outdated Show resolved Hide resolved
// disable all nodes behind right child
OZ(get_child(second_child)->recursively_disable_alloc_op_above(ctx));
}
if (get_plan()->get_optimizer_context().get_exec_ctx()->get_sql_ctx()->is_remote_sql_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个和上面是 else if 关系还是,还是分段判断关系?这个 if 里面,有可能覆盖上面的错误码。

错误码覆盖,是非常大的坑。可能会导致一些线上非常难排查的问题。

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个和上面是 else if 关系还是,还是分段判断关系?这个 if 里面,有可能覆盖上面的错误码。

错误码覆盖,是非常大的坑。可能会导致一些线上非常难排查的问题。

是分段关系,我处理一下错误码的问题

LOG_WARN("failed to print blocking hint", K(ret));
}
}
if (OB_FAIL(BUF_PRINTF(")"))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

错误码覆盖

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

错误码覆盖

done

src/sql/optimizer/ob_logical_operator.cpp Outdated Show resolved Hide resolved
if (expr->get_expr_type() == T_FUN_SYS_ROWNUM) {
found = true;
} else {
for (auto i = 0; !found && i < expr->get_param_count(); i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里,不同循环次数之间也可能发生错误码覆盖?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

LOG_WARN("fail to find rownum expr", K(ret));
} else {
if (found) {
for (int64_t i = 0; i < cur_path.count(); ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

错误码覆盖?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

NESTED_LOOP_JOIN == static_cast<ObLogJoin*>(this)->get_join_algo())) {
OB_ASSERT(get_num_of_child() > 1 && nullptr != get_child(second_child));
// disable all nodes behind right child
if (OB_FAIL(get_child(second_child)->recursively_disable_alloc_op_above(ctx))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

覆盖了 5453 行的错误码。

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -4406,6 +4443,41 @@ void ObLogicalOperator::set_parent(ObLogicalOperator *parent)
parent_ = parent;
}

int ObLogicalOperator::allocate_material_node_above()
{
int ret = OB_SUCCESS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

加注释,说明下列情况是做什么,处理什么场景

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@zhangyuqin1998 zhangyuqin1998 force-pushed the material branch 2 times, most recently from 4c960ef to 6fbf625 Compare July 4, 2023 09:07
@zhangyuqin1998 zhangyuqin1998 force-pushed the material branch 4 times, most recently from bf2866d to 76e6612 Compare July 11, 2023 05:46
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


zhangyuqin1998 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: Allocate Material Above Nodes
3 participants