use eager-input-filter optimize TopN #1501

Open
wants to merge 1 commit into
base: master

@woxiaosa woxiaosa commented Aug 9, 2023

Task Description

Closes #1500.

Use eager-input-filter to optimize TopN.

This optimization reduces the amount of data that TopN dumps.

Solution Description

I added a class, EagerInputFilter, that collects statistics on the dumped data and thereby reduces the amount of data that is dumped.
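To illustrate the general idea, here is a minimal, self-contained sketch of one way such a filter could work. It is not the EagerInputFilter class from this PR: the class name `EagerInputFilterSketch`, its methods, and the use of plain integer keys are all hypothetical, and it assumes ascending order and that each dumped run is already sorted. While a run is dumped, the key of every `per_bucket_num`-th row is recorded as a bucket boundary. Once enough boundaries are held to cover `limit` rows, the largest boundary serves as a cutoff, and any later input row with a greater key can be skipped because at least `limit` rows are known to sort before it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical sketch, not the PR's implementation. Keys are plain integers
// and the sort order is ascending (smaller keys rank first).
class EagerInputFilterSketch {
public:
  EagerInputFilterSketch(std::size_t limit, std::size_t per_bucket_num)
      : per_bucket_num_(per_bucket_num),
        // Number of buckets needed to cover at least `limit` rows.
        bucket_num_((limit + per_bucket_num - 1) / per_bucket_num) {
    assert(limit > 0 && per_bucket_num > 0);
  }

  // Call for every row of a sorted run while it is dumped.
  // seq_no is the row's 0-based position within that run.
  void on_dump_row(std::size_t seq_no, std::int64_t key) {
    if ((seq_no + 1) % per_bucket_num_ != 0) {
      return;  // Only the last row of each bucket is sampled.
    }
    if (boundaries_.size() < bucket_num_) {
      boundaries_.push(key);
    } else if (key < boundaries_.top()) {
      // A smaller boundary tightens the cutoff: swap out the largest one.
      boundaries_.pop();
      boundaries_.push(key);
    }
  }

  // True once the sampled buckets cover at least `limit` rows.
  bool has_cutoff() const { return boundaries_.size() == bucket_num_; }

  // True if a row with this key cannot be among the first `limit` rows.
  // Rows equal to the cutoff are kept.
  bool can_skip(std::int64_t key) const {
    return has_cutoff() && key > boundaries_.top();
  }

private:
  std::size_t per_bucket_num_;
  std::size_t bucket_num_;
  // Max-heap of bucket boundary keys; the top is the current cutoff.
  std::priority_queue<std::int64_t> boundaries_;
};
```

Each boundary stands for `per_bucket_num` distinct rows whose keys are no greater than it, so holding `bucket_num_` boundaries guarantees that at least `limit` rows sort at or before the largest one. Sampling fewer rows per run costs less bookkeeping but gives a looser cutoff.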

Passed Regressions

Passes the tests in the ob repo.

Upgrade Compatibility

This optimization is orthogonal to the implementation of Sort, so it is easy to integrate.

Release Note

For TopN over a large amount of data, this optimization can yield a 1.3-1.6x speedup.

CLAassistant commented Aug 9, 2023

CLA assistant check
All committers have signed the CLA.

@woxiaosa woxiaosa force-pushed the issue_1500 branch 3 times, most recently from 8933052 to 9e4bba1 Compare August 14, 2023 02:06
@woxiaosa woxiaosa force-pushed the issue_1500 branch 2 times, most recently from 8997937 to 1bcea63 Compare August 24, 2023 03:43
src/sql/engine/sort/ob_sort_op_impl.cpp
  LOG_DEBUG("is not initialized", K(ret));
} else if ((seq_no + 1) % per_bucket_num_ == 0) {
  if (heap_->count() < bucket_num_) {
    if (use_heap_sort && OB_FAIL(generate_new_row(static_cast<SortStoredRow *>(dump_row),
Contributor

Always copy from StoredRow here, so that this code is decoupled from the outside heap sort logic.

@woxiaosa woxiaosa changed the title from "use self-sharp-filter optimize TopN" to "use eager-input-filter optimize TopN" Aug 30, 2023
Successfully merging this pull request may close these issues.

[Enhancement]: optimize TopN by eager-input-filter
3 participants