[Feature] Optimize count(1) in hdfs scanner by rewriting plan to sum #43616

Merged

merged 13 commits into StarRocks:main from count-star-optimization on May 10, 2024

Conversation

@dirtysalt (Contributor) commented Apr 4, 2024

Why I'm doing:

Right now, the hdfs scanner optimization for count(1) is to output a const column holding the expected count.

In extreme cases (large datasets), the number of chunks flowing through the pipeline becomes extremely large, and the operator time and overhead time are not negligible.

Here is a profile of select count(*) from hive.hive_ssb100g_parquet.lineorder. To reproduce this extreme case, I changed the code to scale morsels by 20x and repeat row groups by 10x.

In the concurrency=1 case, total time is 51s:

         - OverheadTime: 25s37ms
           - __MAX_OF_OverheadTime: 25s111ms
           - __MIN_OF_OverheadTime: 24s962ms

             - PullTotalTime: 12s376ms
               - __MAX_OF_PullTotalTime: 13s147ms
               - __MIN_OF_PullTotalTime: 11s885ms
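
For a rough sense of where that overhead comes from, here is a back-of-the-envelope estimate as a small Java sketch. The 4096 rows-per-chunk figure is an assumed default chunk size, and treating the 20x/10x reproduction scaling as a plain row multiplier is also an assumption:

    // Estimate how many chunks the const-column approach pushes through the
    // pipeline. chunkSize = 4096 is an assumed default; the 20x/10x factors
    // mirror the morsel/row-group scaling used to reproduce the extreme case.
    public class ChunkFloodEstimate {
        public static void main(String[] args) {
            long rows = 600_037_902L;      // lineorder cardinality from the plan
            long scaledRows = rows * 20 * 10;
            long chunkSize = 4096;
            long chunks = (scaledRows + chunkSize - 1) / chunkSize;
            System.out.println("~" + chunks + " chunks"); // roughly 29.3 million
        }
    }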

What I'm doing:

Rewrite the count(1) query into a sum-like query, so that each row group reader only emits one chunk (size = 1).

With this change, total time is 9s.

The original plan looks like this:

+----------------------------------+
| Explain String                   |
+----------------------------------+
| PLAN FRAGMENT 0                  |
|  OUTPUT EXPRS:18: count          |
|   PARTITION: UNPARTITIONED       |
|                                  |
|   RESULT SINK                    |
|                                  |
|   4:AGGREGATE (merge finalize)   |
|   |  output: count(18: count)    |
|   |  group by:                   |
|   |                              |
|   3:EXCHANGE                     |
|                                  |
| PLAN FRAGMENT 1                  |
|  OUTPUT EXPRS:                   |
|   PARTITION: RANDOM              |
|                                  |
|   STREAM DATA SINK               |
|     EXCHANGE ID: 03              |
|     UNPARTITIONED                |
|                                  |
|   2:AGGREGATE (update serialize) |
|   |  output: count(*)            |
|   |  group by:                   |
|   |                              |
|   1:Project                      |
|   |  <slot 20> : 1               |
|   |                              |
|   0:HdfsScanNode                 |
|      TABLE: lineorder            |
|      partitions=1/1              |
|      cardinality=600037902       |
|      avgRowSize=5.0              |
+----------------------------------+

And the rewritten plan looks like this:

+-----------------------------------+
| Explain String                    |
+-----------------------------------+
| PLAN FRAGMENT 0                   |
|  OUTPUT EXPRS:18: count           |
|   PARTITION: UNPARTITIONED        |
|                                   |
|   RESULT SINK                     |
|                                   |
|   3:AGGREGATE (merge finalize)    |
|   |  output: sum(18: count)       |
|   |  group by:                    |
|   |                               |
|   2:EXCHANGE                      |
|                                   |
| PLAN FRAGMENT 1                   |
|  OUTPUT EXPRS:                    |
|   PARTITION: RANDOM               |
|                                   |
|   STREAM DATA SINK                |
|     EXCHANGE ID: 02               |
|     UNPARTITIONED                 |
|                                   |
|   1:AGGREGATE (update serialize)  |
|   |  output: sum(19: ___count___) |
|   |  group by:                    |
|   |                               |
|   0:HdfsScanNode                  |
|      TABLE: lineorder             |
|      partitions=1/1               |
|      cardinality=1                |
|      avgRowSize=1.0               |
+-----------------------------------+
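
For intuition, here is a minimal, self-contained Java sketch of the idea behind sum(19: ___count___). All names except ___count___ are hypothetical; the real rule (RewriteSimpleAggToHDFSScanRule.java, visible in the coverage report below) transforms optimizer plan trees rather than Java collections:

    import java.util.List;

    // Each row group contributes a single pre-counted value ("___count___")
    // instead of one constant row per input row, so the scan emits one
    // size-1 chunk per row group and the aggregate sums those values.
    public class CountToSumSketch {
        record RowGroup(long rowCount) {}

        static long countViaSum(List<RowGroup> rowGroups) {
            return rowGroups.stream()
                    .mapToLong(RowGroup::rowCount) // one ___count___ value per row group
                    .sum();                        // sum(19: ___count___) in the plan
        }

        public static void main(String[] args) {
            List<RowGroup> groups = List.of(new RowGroup(4_096), new RowGroup(600_033_806));
            System.out.println(countViaSum(groups)); // 600037902, same as count(*)
        }
    }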

Fixes #45242

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@dirtysalt dirtysalt requested review from a team as code owners April 4, 2024 21:15
@wanpengfei-git wanpengfei-git requested a review from a team April 4, 2024 21:15
@dirtysalt dirtysalt force-pushed the count-star-optimization branch 6 times, most recently from 96d59db to 5d2635c Compare April 11, 2024 15:49
@dirtysalt dirtysalt enabled auto-merge (squash) April 13, 2024 02:52
mofeiatwork previously approved these changes Apr 25, 2024
@zombee0 (Contributor) left a comment:

LGTM

stephen-shelby previously approved these changes May 7, 2024
@stephen-shelby stephen-shelby self-requested a review May 7, 2024 02:46
Youngwb previously approved these changes May 7, 2024
Youngwb previously approved these changes May 8, 2024
Signed-off-by: yanz <dirtysalt1987@gmail.com>
@dirtysalt dirtysalt dismissed stale reviews from packy92 and stephen-shelby via 92dcabc May 9, 2024 23:03
sonarcloud bot commented May 9, 2024

Quality Gate failed

Failed conditions
5.0% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud


[FE Incremental Coverage Report]

pass : 131 / 151 (86.75%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/connector/iceberg/cost/IcebergStatisticProvider.java 3 4 75.00% [251]
🔵 com/starrocks/catalog/IcebergTable.java 4 5 80.00% [179]
🔵 com/starrocks/sql/optimizer/rule/transformation/RewriteSimpleAggToHDFSScanRule.java 114 132 86.36% [100, 101, 102, 110, 111, 163, 164, 168, 169, 170, 201, 217, 222, 228, 243, 245, 247, 262]
🔵 com/starrocks/sql/optimizer/Optimizer.java 3 3 100.00% []
🔵 com/starrocks/qe/SessionVariable.java 4 4 100.00% []
🔵 com/starrocks/sql/optimizer/rule/RuleSet.java 3 3 100.00% []


[BE Incremental Coverage Report]

pass : 100 / 109 (91.74%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/formats/orc/orc_chunk_reader.cpp 0 6 00.00% [1222, 1223, 1224, 1225, 1226, 1227]
🔵 be/src/exec/hdfs_scanner_orc.cpp 21 22 95.45% [571]
🔵 be/src/exec/hdfs_scanner.cpp 39 41 95.12% [415, 453]
🔵 be/src/exec/jni_scanner.cpp 23 23 100.00% []
🔵 be/src/formats/parquet/file_reader.cpp 10 10 100.00% []
🔵 be/src/exec/hdfs_scanner_text.cpp 7 7 100.00% []

@imay imay disabled auto-merge May 10, 2024 01:26
@imay imay merged commit b6ca919 into StarRocks:main May 10, 2024
42 of 44 checks passed
@dirtysalt dirtysalt deleted the count-star-optimization branch May 11, 2024 16:00
@dirtysalt (Contributor, Author) commented:

@mergify backport branch-3.3

mergify bot commented May 14, 2024

backport branch-3.3

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 14, 2024
[Feature] Optimize count(1) in hdfs scanner by rewriting plan to `sum` (#43616)


Signed-off-by: yanz <dirtysalt1987@gmail.com>
(cherry picked from commit b6ca919)

# Conflicts:
#	java-extensions/hive-reader/src/main/java/com/starrocks/hive/reader/HiveScanner.java
#	test/sql/test_iceberg/R/test_iceberg_catalog
#	test/sql/test_iceberg/T/test_iceberg_catalog
wanpengfei-git pushed a commit that referenced this pull request May 14, 2024
[Feature] Optimize count(1) in hdfs scanner by rewriting plan to `sum` (backport #43616) (#45618)

Signed-off-by: yanz <dirtysalt1987@gmail.com>
Co-authored-by: RyanZ <dirtysalt1987@gmail.com>
dirtysalt added a commit to dirtysalt/starrocks that referenced this pull request May 14, 2024
[Feature] Optimize count(1) in hdfs scanner by rewriting plan to `sum` (StarRocks#43616)


Signed-off-by: yanz <dirtysalt1987@gmail.com>

Signed-off-by: RyanZ <dirtysalt1987@gmail.com>
node pushed a commit to vivo/starrocks that referenced this pull request May 17, 2024
[Feature] Optimize count(1) in hdfs scanner by rewriting plan to `sum` (StarRocks#43616)


Signed-off-by: yanz <dirtysalt1987@gmail.com>

Successfully merging this pull request may close these issues.

optimize count(1) performance on hive/iceberg table