[opt](join) For left semi/anti join without mark join conjunct and without other conjuncts, stop probing after matching one row #34703

Merged
merged 1 commit into from May 13, 2024

Conversation

mrhhsg
Member

@mrhhsg mrhhsg commented May 11, 2024

Proposed changes

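In a left semi or left anti join that has no mark join conjunct and no other conjuncts, a single match in the hash table is enough to determine the result for a probe-side row, so probing for that row can stop after the first match.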
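
The idea can be illustrated with a minimal C++ sketch. This is not the Doris implementation: `BuildTable`, `ProbeResult`, `build_table`, `probe`, and the `stop_after_first_match` flag are hypothetical names chosen for illustration, and the sketch assumes a single integer equality key with no mark join conjunct and no other conjuncts to evaluate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the build-side hash table: key -> build row indexes.
using BuildTable = std::unordered_map<int64_t, std::vector<std::size_t>>;

struct ProbeResult {
    std::vector<std::size_t> emitted_probe_rows; // probe rows kept by the join
    std::size_t build_rows_visited = 0;          // build rows examined while probing
};

// Builds the table from build-side keys; duplicate keys form a chain.
BuildTable build_table(const std::vector<int64_t>& build_keys) {
    BuildTable table;
    for (std::size_t i = 0; i < build_keys.size(); ++i) {
        table[build_keys[i]].push_back(i);
    }
    return table;
}

// Left semi (is_anti = false) or left anti (is_anti = true) probe.
// With stop_after_first_match set, the chain walk ends at the first match.
// That shortcut is valid only when nothing else has to be evaluated against
// the matched build rows (no mark join conjunct, no other conjuncts).
ProbeResult probe(const BuildTable& table, const std::vector<int64_t>& probe_keys,
                  bool is_anti, bool stop_after_first_match) {
    ProbeResult result;
    for (std::size_t p = 0; p < probe_keys.size(); ++p) {
        bool matched = false;
        auto it = table.find(probe_keys[p]);
        if (it != table.end()) {
            for (std::size_t build_row : it->second) {
                (void)build_row; // a real engine would record the build index here
                ++result.build_rows_visited;
                matched = true;
                if (stop_after_first_match) {
                    break; // the outcome for this probe row is already decided
                }
            }
        }
        // Semi join keeps matched rows; anti join keeps unmatched rows.
        if (matched != is_anti) {
            result.emitted_probe_rows.push_back(p);
        }
    }
    return result;
}

// Usage example: both strategies emit the same probe rows,
// but the early exit never visits more build rows than the full walk.
inline void example() {
    BuildTable table = build_table({1, 1, 1, 2});
    std::vector<int64_t> probe_keys = {1, 3, 2};
    ProbeResult full = probe(table, probe_keys, false, false);
    ProbeResult early = probe(table, probe_keys, false, true);
    assert(full.emitted_probe_rows == early.emitted_probe_rows);
    assert(early.build_rows_visited <= full.build_rows_visited);
}
```

The saving grows with the number of duplicate keys on the build side: in this sketch the full walk visits every build row in a chain, while the early exit visits at most one per probe row.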

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@mrhhsg
Member Author

mrhhsg commented May 11, 2024

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 41165 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a7899f014784ab039da191283ed6f283a8e11a93, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17630	4615	4272	4272
q2	2018	187	184	184
q3	10581	1283	1311	1283
q4	11109	783	812	783
q5	7538	2758	2709	2709
q6	225	132	133	132
q7	1047	601	576	576
q8	9308	2160	2116	2116
q9	9161	6698	6627	6627
q10	9330	3727	3782	3727
q11	473	254	233	233
q12	456	220	227	220
q13	17767	2935	3002	2935
q14	261	222	217	217
q15	513	480	470	470
q16	525	397	384	384
q17	982	755	778	755
q18	8147	7543	7416	7416
q19	3359	1562	1540	1540
q20	649	326	317	317
q21	5191	3988	4019	3988
q22	373	295	281	281
Total cold run time: 116643 ms
Total hot run time: 41165 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4331	4255	4233	4233
q2	389	268	279	268
q3	2969	2784	2771	2771
q4	1918	1612	1622	1612
q5	5323	5331	5299	5299
q6	210	122	127	122
q7	2272	1881	1899	1881
q8	3216	3349	3350	3349
q9	8443	8372	8407	8372
q10	3923	3718	3681	3681
q11	588	483	491	483
q12	764	610	578	578
q13	16398	2961	3029	2961
q14	294	279	265	265
q15	528	484	474	474
q16	478	429	431	429
q17	1794	1508	1462	1462
q18	7836	7664	7452	7452
q19	1714	1541	1546	1541
q20	1997	1745	1787	1745
q21	5082	4934	4987	4934
q22	575	509	496	496
Total cold run time: 71042 ms
Total hot run time: 54408 ms

@doris-robot

TPC-DS: Total hot run time: 188696 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a7899f014784ab039da191283ed6f283a8e11a93, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	911	373	345	345
query2	6442	2411	2471	2411
query3	6653	215	210	210
query4	25497	21178	21099	21099
query5	4133	425	413	413
query6	267	182	198	182
query7	4586	298	293	293
query8	243	191	189	189
query9	8514	2451	2446	2446
query10	441	255	263	255
query11	14767	14276	14152	14152
query12	134	99	87	87
query13	1657	403	361	361
query14	9053	8666	8727	8666
query15	252	171	175	171
query16	7869	273	260	260
query17	1840	554	558	554
query18	1467	282	279	279
query19	266	148	150	148
query20	97	87	88	87
query21	197	132	131	131
query22	5086	4830	4860	4830
query23	34251	33780	33847	33780
query24	10631	2947	2849	2849
query25	623	371	362	362
query26	1342	159	156	156
query27	2645	323	325	323
query28	7220	2060	2071	2060
query29	895	618	620	618
query30	294	151	153	151
query31	984	752	769	752
query32	89	53	57	53
query33	758	252	238	238
query34	1062	486	484	484
query35	816	683	687	683
query36	1070	912	928	912
query37	129	66	70	66
query38	2889	2767	2756	2756
query39	1619	1571	1562	1562
query40	276	124	123	123
query41	41	38	39	38
query42	103	100	100	100
query43	585	569	564	564
query44	1197	724	746	724
query45	270	253	260	253
query46	1070	726	705	705
query47	1990	1942	1905	1905
query48	374	301	291	291
query49	1173	425	389	389
query50	774	398	383	383
query51	6978	6782	6824	6782
query52	107	98	87	87
query53	358	278	292	278
query54	983	427	431	427
query55	81	74	80	74
query56	248	224	226	224
query57	1268	1186	1185	1185
query58	227	201	206	201
query59	3518	3452	3085	3085
query60	257	244	242	242
query61	95	92	95	92
query62	684	507	459	459
query63	312	283	282	282
query64	9556	7420	7409	7409
query65	3177	3115	3095	3095
query66	1272	348	331	331
query67	15571	14894	15020	14894
query68	4574	534	533	533
query69	488	307	299	299
query70	1189	1095	1137	1095
query71	379	274	267	267
query72	7335	2568	2335	2335
query73	713	338	340	338
query74	6619	6269	6136	6136
query75	3315	2671	2632	2632
query76	2843	1076	986	986
query77	420	279	279	279
query78	10531	10087	10178	10087
query79	2624	527	516	516
query80	1074	465	459	459
query81	508	222	221	221
query82	712	100	104	100
query83	251	176	171	171
query84	239	91	90	90
query85	1666	335	360	335
query86	487	290	292	290
query87	3322	3064	3137	3064
query88	4412	2402	2408	2402
query89	460	372	386	372
query90	1933	189	187	187
query91	122	102	97	97
query92	58	49	49	49
query93	1870	512	504	504
query94	1210	185	186	185
query95	407	306	301	301
query96	589	268	273	268
query97	3189	2997	2974	2974
query98	232	220	217	217
query99	1139	911	914	911
Total cold run time: 285144 ms
Total hot run time: 188696 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.65% (8980/25190)
Line Coverage: 27.32% (74245/271756)
Region Coverage: 26.56% (38381/144509)
Branch Coverage: 23.38% (19573/83716)
Coverage Report: http://coverage.selectdb-in.cc/coverage/a7899f014784ab039da191283ed6f283a8e11a93_a7899f014784ab039da191283ed6f283a8e11a93/report/index.html

@@ -258,7 +259,7 @@ Status ProcessHashTableProbe<JoinOpType, Parent>::do_process(HashTableType& hash
 need_null_map_for_probe &&
 ignore_null > (hash_table_ctx.keys, hash_table_ctx.bucket_nums.data(),
 probe_index, build_index, probe_rows, _probe_indexs.data(),
-_probe_visited, _build_indexs.data());
+_probe_visited, _build_indexs.data(), has_mark_join_conjunct);
Contributor

This needs a regression test.

Member Author

Some existing regression cases already cover this code: regression-test/suites/nereids_p0/join/test_mark_join.groovy and regression-test/suites/nereids_p0/subquery/subquery_unnesting.groovy.

…thout other conjucnts, stop probing after matching one row
@mrhhsg
Member Author

mrhhsg commented May 13, 2024

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.65% (8984/25202)
Line Coverage: 27.31% (74256/271894)
Region Coverage: 26.54% (38379/144592)
Branch Coverage: 23.36% (19567/83776)
Coverage Report: http://coverage.selectdb-in.cc/coverage/6d4c98ab27a0ce9dbcbe05c2359e8ad6ef5e194e_6d4c98ab27a0ce9dbcbe05c2359e8ad6ef5e194e/report/index.html

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 13, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit e236768 into apache:master May 13, 2024
24 of 28 checks passed
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 14, 2024
…thout other conjucnts, stop probing after matching one row (apache#34703)
@mrhhsg mrhhsg deleted the opt_half_join branch May 15, 2024 01:17
ByteYue pushed a commit to ByteYue/doris that referenced this pull request May 15, 2024
…thout other conjucnts, stop probing after matching one row (apache#34703)
yiguolei pushed a commit that referenced this pull request May 18, 2024
…thout other conjucnts, stop probing after matching one row (#34703)
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 24, 2024
…thout other conjucnts, stop probing after matching one row (apache#34703)
Labels
approved Indicates a PR has been approved by one committer. reviewed
4 participants