[Enhencement](Nereids) rules to optimize length function with aggregate function #33837

Closed
wants to merge 5 commits into from

@LiBinfeng-01 LiBinfeng-01 commented Apr 18, 2024

Add two rules to optimize the plan for ClickBench q27:

  • rewrite a comparison with the empty string as a length check: string <> "" becomes length(string) > 0
  • move a length function that appears inside an aggregate function, agg(length(...)), into a child projection

Example:
select t1.string from t1 where t1.string <> ""; ==> select t1.string from t1 where length(t1.string) > 0;
select avg(length(t1.string)) from t1;
the plan changes from agg(length) to agg -> project(length)
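
The second rule above can be pictured with a toy plan tree. This is only a minimal sketch, not the Nereids implementation: the `Expr` and `Plan` records, the `rewrite` method, and the alias name `len_0` are hypothetical names made up for illustration, and the sketch handles only a single avg(length(col)) output.

```java
import java.util.List;

final class PushLengthBelowAggregate {
    // Hypothetical mini expression model; these are NOT the Doris Nereids classes.
    interface Expr {}
    record Column(String name) implements Expr {}
    record Length(Expr child) implements Expr {}
    record Alias(Expr child, String name) implements Expr {}
    record Avg(Expr child) implements Expr {}

    // Hypothetical mini plan model.
    interface Plan {}
    record Scan(String table) implements Plan {}
    record Project(List<Expr> outputs, Plan child) implements Plan {}
    record Aggregate(List<Expr> outputs, Plan child) implements Plan {}

    // agg(avg(length(col))) over child
    //   ==> agg(avg(alias)) over project(length(col) AS alias) over child
    static Plan rewrite(Plan plan) {
        if (plan instanceof Aggregate agg
                && agg.outputs().size() == 1
                && agg.outputs().get(0) instanceof Avg avg
                && avg.child() instanceof Length len) {
            String alias = "len_0"; // assumed generated name, for illustration only
            Plan project = new Project(List.of(new Alias(len, alias)), agg.child());
            return new Aggregate(List.of(new Avg(new Column(alias))), project);
        }
        return plan; // anything else is left untouched
    }

    public static void main(String[] args) {
        Plan before = new Aggregate(
                List.of(new Avg(new Length(new Column("string")))),
                new Scan("t1"));
        System.out.println(rewrite(before));
    }
}
```

After the rewrite the aggregate only sees a plain column, and the length computation lives in the projection underneath it.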

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@LiBinfeng-01
Contributor Author

run buildall

@englefly
Contributor

it is better to do this transformation:
t1.string = ""
=>
length(t1.string)=0

It covers more cases.
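
The transformation suggested above can be sketched on a toy expression tree. This is a hedged illustration, not the code in this PR: every record and method name here is hypothetical, and it assumes that equality with the empty string and a zero length agree for the column's type. Handling the plain equality means the negated form falls out as a second small case.

```java
final class EmptyStringToLength {
    // Hypothetical mini expression tree; these are NOT the Doris Nereids classes.
    interface Expr {}
    record Column(String name) implements Expr {}
    record StringLit(String value) implements Expr {}
    record IntLit(int value) implements Expr {}
    record EqualTo(Expr left, Expr right) implements Expr {}
    record Not(Expr child) implements Expr {}
    record GreaterThan(Expr left, Expr right) implements Expr {}
    record Length(Expr child) implements Expr {}

    // True for expressions of the shape: <anything> = ""
    static boolean comparesWithEmptyString(Expr e) {
        return e instanceof EqualTo eq
                && eq.right() instanceof StringLit lit
                && lit.value().isEmpty();
    }

    static Expr rewrite(Expr e) {
        // NOT (s = "")  ==>  length(s) > 0
        if (e instanceof Not not && comparesWithEmptyString(not.child())) {
            Expr operand = ((EqualTo) not.child()).left();
            return new GreaterThan(new Length(operand), new IntLit(0));
        }
        // s = ""  ==>  length(s) = 0
        if (comparesWithEmptyString(e)) {
            Expr operand = ((EqualTo) e).left();
            return new EqualTo(new Length(operand), new IntLit(0));
        }
        return e; // not a comparison with the empty string
    }

    public static void main(String[] args) {
        Expr eq = new EqualTo(new Column("t1.string"), new StringLit(""));
        System.out.println(rewrite(eq));
        System.out.println(rewrite(new Not(eq)));
    }
}
```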

if (not.getArgument(0) instanceof EqualPredicate) {
    EqualPredicate equalPredicate = (EqualPredicate) not.getArgument(0);
    if (equalPredicate.getArgument(0).getDataType().isStringType()
            && equalPredicate.getArgument(1).equals(new StringLiteral(""))) {
Contributor

How about varchar or char literals? This should check for StringLikeLiteral and call getStringValue on it.

Contributor Author

done
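
The point of this review thread can be shown with a small stand-alone sketch. The classes below are hypothetical stand-ins invented for illustration, not the actual Doris literal hierarchy; the idea is only that matching on a shared string-like base type and reading its value accepts varchar and char literals too, while matching one concrete literal class does not.

```java
final class StringLikeLiteralCheck {
    // Hypothetical stand-ins for a literal hierarchy; not the actual Doris classes.
    abstract static class StringLikeLit {
        private final String value;
        StringLikeLit(String value) { this.value = value; }
        String getStringValue() { return value; }
    }
    static final class StringLit extends StringLikeLit {
        StringLit(String value) { super(value); }
    }
    static final class VarcharLit extends StringLikeLit {
        VarcharLit(String value) { super(value); }
    }
    static final class CharLit extends StringLikeLit {
        CharLit(String value) { super(value); }
    }

    // Narrow check: only one concrete literal class is recognised.
    static boolean isEmptyNarrow(Object literal) {
        return literal instanceof StringLit s && s.getStringValue().isEmpty();
    }

    // Broader check: any string-like literal whose value is empty.
    static boolean isEmptyStringLike(Object literal) {
        return literal instanceof StringLikeLit s && s.getStringValue().isEmpty();
    }

    public static void main(String[] args) {
        Object varcharEmpty = new VarcharLit("");
        System.out.println(isEmptyNarrow(varcharEmpty));     // narrow check misses it
        System.out.println(isEmptyStringLike(varcharEmpty)); // broad check accepts it
    }
}
```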

EqualPredicate equalPredicate = (EqualPredicate) not.getArgument(0);
if (equalPredicate.getArgument(0).getDataType().isStringType()
        && equalPredicate.getArgument(1).equals(new StringLiteral(""))) {
    expr = new GreaterThan(new Length(equalPredicate.getArgument(0)), new IntegerLiteral(0), false);
Contributor

The two-argument constructor is enough here.

Contributor Author

done

@LiBinfeng-01
Contributor Author

run buildall

@LiBinfeng-01
Contributor Author

it is better to do this transformation: t1.string = "" => length(t1.string)=0

It covers more cases.

Changed, and added more cases for that.

@LiBinfeng-01
Contributor Author

run buildall

@LiBinfeng-01 LiBinfeng-01 force-pushed the add_string_to_length branch 2 times, most recently from ecd5ef2 to ae32abc Compare April 22, 2024 08:35
@LiBinfeng-01
Contributor Author

run buildall

@LiBinfeng-01 LiBinfeng-01 changed the title [Enhencement](Nereids) string compare with empty string convert to length greater than zero [Enhencement](Nereids) rules to optimize length function with aggregate function Apr 22, 2024
@doris-robot

TPC-H: Total hot run time: 38213 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ae32abc61580a1a4896321c90af232cded608557, data reload: false

------ Round 1 ----------------------------------
q1	17604	4251	4202	4202
q2	2001	184	188	184
q3	10445	1097	1149	1097
q4	10192	802	724	724
q5	7479	2674	2624	2624
q6	215	131	133	131
q7	996	599	581	581
q8	9223	2024	2041	2024
q9	7223	6592	6502	6502
q10	8482	3536	3470	3470
q11	463	233	231	231
q12	474	223	211	211
q13	18106	2937	2951	2937
q14	259	223	238	223
q15	526	482	481	481
q16	516	383	374	374
q17	950	756	675	675
q18	7221	6758	6690	6690
q19	7140	1506	1478	1478
q20	636	315	301	301
q21	3503	2774	2823	2774
q22	361	299	303	299
Total cold run time: 114015 ms
Total hot run time: 38213 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4310	4217	4221	4217
q2	373	269	269	269
q3	2997	2722	2753	2722
q4	1836	1557	1591	1557
q5	5303	5305	5260	5260
q6	205	123	123	123
q7	2249	1879	1857	1857
q8	3206	3376	3296	3296
q9	8583	8555	8524	8524
q10	4071	3828	3905	3828
q11	631	519	499	499
q12	816	623	643	623
q13	16221	3154	3148	3148
q14	324	279	261	261
q15	509	492	495	492
q16	506	450	448	448
q17	1791	1511	1483	1483
q18	8035	7987	7802	7802
q19	1632	1566	1550	1550
q20	2073	1851	1821	1821
q21	5114	4897	5020	4897
q22	529	478	457	457
Total cold run time: 71314 ms
Total hot run time: 55134 ms
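
A note on reading the TPC-H tables above: the bot does not label the columns, but the numbers are consistent with each row being a cold run, two hot runs, and the smaller of the two hot runs, with the totals summing the first and last columns. The sketch below recomputes both totals under that assumption for the first three Round 1 rows; the class and method names are made up for illustration.

```java
final class BenchTotals {
    // Each row is assumed to hold: cold run, hot run 1, hot run 2 (milliseconds).
    static long coldTotal(long[][] rows) {
        long sum = 0;
        for (long[] row : rows) {
            sum += row[0];
        }
        return sum;
    }

    // Hot total is assumed to sum the faster of the two hot runs per query.
    static long hotTotal(long[][] rows) {
        long sum = 0;
        for (long[] row : rows) {
            sum += Math.min(row[1], row[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[][] firstRows = {
            {17604, 4251, 4202}, // q1
            {2001, 184, 188},    // q2
            {10445, 1097, 1149}, // q3
        };
        System.out.println("cold: " + coldTotal(firstRows) + " ms");
        System.out.println("hot:  " + hotTotal(firstRows) + " ms");
    }
}
```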

@doris-robot

TPC-DS: Total hot run time: 185541 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ae32abc61580a1a4896321c90af232cded608557, data reload: false

query1	898	371	375	371
query2	6195	2716	2577	2577
query3	6645	204	203	203
query4	23754	21295	21347	21295
query5	4125	396	418	396
query6	257	191	166	166
query7	4582	288	287	287
query8	241	199	190	190
query9	8709	2352	2335	2335
query10	402	263	251	251
query11	14810	14187	14225	14187
query12	134	87	85	85
query13	1632	361	349	349
query14	9369	7929	7816	7816
query15	296	176	192	176
query16	8182	259	261	259
query17	1937	574	555	555
query18	2106	276	272	272
query19	298	168	153	153
query20	92	85	84	84
query21	198	130	125	125
query22	5006	4810	4820	4810
query23	33767	33161	33183	33161
query24	11151	3064	3079	3064
query25	647	384	382	382
query26	704	157	159	157
query27	2326	368	369	368
query28	6017	2072	2085	2072
query29	885	621	622	621
query30	278	184	177	177
query31	944	772	736	736
query32	97	49	55	49
query33	655	247	249	247
query34	920	477	491	477
query35	853	723	721	721
query36	1095	941	952	941
query37	111	76	72	72
query38	3476	3369	3335	3335
query39	1635	1567	1570	1567
query40	179	122	125	122
query41	46	43	44	43
query42	101	97	97	97
query43	576	545	556	545
query44	1161	738	742	738
query45	289	259	264	259
query46	1108	749	730	730
query47	2030	1936	1932	1932
query48	391	295	303	295
query49	818	411	395	395
query50	787	377	384	377
query51	6916	6784	6801	6784
query52	99	96	88	88
query53	350	280	277	277
query54	293	229	235	229
query55	81	71	70	70
query56	237	218	223	218
query57	1178	1161	1131	1131
query58	215	214	187	187
query59	3457	3057	3065	3057
query60	252	230	232	230
query61	87	88	89	88
query62	623	428	431	428
query63	303	279	276	276
query64	4954	3848	3924	3848
query65	3060	3029	3010	3010
query66	753	321	328	321
query67	15358	14930	14868	14868
query68	6694	534	526	526
query69	527	298	299	298
query70	1261	1189	1129	1129
query71	1471	1257	1269	1257
query72	6577	2717	2389	2389
query73	724	315	313	313
query74	6792	6479	6378	6378
query75	3840	2628	2607	2607
query76	4280	1061	970	970
query77	615	263	262	262
query78	10947	10278	10146	10146
query79	7772	510	507	507
query80	1410	441	428	428
query81	519	237	243	237
query82	876	94	90	90
query83	247	164	161	161
query84	259	82	84	82
query85	1187	264	264	264
query86	441	305	293	293
query87	3488	3283	3285	3283
query88	4734	2309	2300	2300
query89	487	376	363	363
query90	1925	185	180	180
query91	130	95	99	95
query92	54	48	44	44
query93	6078	510	495	495
query94	988	180	178	178
query95	389	300	300	300
query96	595	264	260	260
query97	3093	2907	2959	2907
query98	233	211	218	211
query99	1236	856	856	856
Total cold run time: 291000 ms
Total hot run time: 185541 ms

englefly previously approved these changes Apr 22, 2024
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Apr 22, 2024
Contributor

PR approved by anyone and no changes requested.

@LiBinfeng-01
Contributor Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 22, 2024
@doris-robot

TPC-H: Total hot run time: 38516 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ea828ba51fa4d85d6ba47ea7b4bb3af4678a86cf, data reload: false

------ Round 1 ----------------------------------
q1	17606	4356	4263	4263
q2	1999	192	182	182
q3	10458	1150	1174	1150
q4	10193	792	754	754
q5	7503	2690	2616	2616
q6	225	131	134	131
q7	997	611	578	578
q8	9220	2043	2026	2026
q9	7317	6617	6512	6512
q10	8552	3545	3511	3511
q11	442	235	225	225
q12	445	228	215	215
q13	17861	2957	2935	2935
q14	276	222	227	222
q15	528	475	478	475
q16	523	376	377	376
q17	967	650	751	650
q18	7279	6721	6740	6721
q19	7512	1540	1541	1540
q20	649	307	303	303
q21	3462	2828	2832	2828
q22	361	303	321	303
Total cold run time: 114375 ms
Total hot run time: 38516 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4386	4250	4254	4250
q2	369	258	258	258
q3	2972	2743	2812	2743
q4	1871	1557	1603	1557
q5	5313	5326	5329	5326
q6	217	124	123	123
q7	2204	1850	1834	1834
q8	3208	3367	3353	3353
q9	8592	8529	8621	8529
q10	4080	3893	4054	3893
q11	611	484	495	484
q12	792	641	638	638
q13	16780	3266	3127	3127
q14	318	302	290	290
q15	540	491	490	490
q16	486	445	440	440
q17	1791	1511	1462	1462
q18	8189	7853	7814	7814
q19	1670	1551	1595	1551
q20	2066	1859	1817	1817
q21	5172	4920	4940	4920
q22	567	474	480	474
Total cold run time: 72194 ms
Total hot run time: 55373 ms

@doris-robot

TPC-DS: Total hot run time: 185231 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ea828ba51fa4d85d6ba47ea7b4bb3af4678a86cf, data reload: false

query1	894	365	367	365
query2	6215	2769	2335	2335
query3	6654	213	208	208
query4	22966	21182	21316	21182
query5	4126	410	420	410
query6	274	188	175	175
query7	4585	282	286	282
query8	246	197	192	192
query9	8403	2274	2252	2252
query10	424	259	251	251
query11	14736	14203	14139	14139
query12	130	90	85	85
query13	1622	355	354	354
query14	9446	7770	7025	7025
query15	253	176	179	176
query16	8166	260	259	259
query17	1925	571	543	543
query18	2109	269	259	259
query19	327	152	148	148
query20	90	87	82	82
query21	199	123	131	123
query22	5139	4878	4887	4878
query23	33593	33249	33172	33172
query24	11118	3032	3036	3032
query25	595	390	378	378
query26	1111	167	165	165
query27	2404	358	382	358
query28	7073	2040	2025	2025
query29	875	633	622	622
query30	335	178	183	178
query31	960	772	763	763
query32	93	54	52	52
query33	756	247	253	247
query34	1071	484	495	484
query35	825	718	711	711
query36	1102	930	925	925
query37	120	77	79	77
query38	3555	3424	3326	3326
query39	1606	1584	1598	1584
query40	170	121	126	121
query41	44	42	42	42
query42	110	98	93	93
query43	573	546	546	546
query44	1271	768	752	752
query45	280	266	287	266
query46	1101	735	735	735
query47	2050	1980	1955	1955
query48	370	296	304	296
query49	860	400	399	399
query50	781	396	402	396
query51	6989	6624	6808	6624
query52	101	90	89	89
query53	345	271	276	271
query54	312	232	227	227
query55	75	72	70	70
query56	234	224	214	214
query57	1216	1142	1116	1116
query58	233	198	203	198
query59	3329	3179	3323	3179
query60	238	227	226	226
query61	90	83	101	83
query62	611	455	442	442
query63	298	272	277	272
query64	4784	3959	4001	3959
query65	3079	3041	3043	3041
query66	754	337	337	337
query67	15403	15130	15159	15130
query68	5177	554	535	535
query69	482	307	309	307
query70	1325	1211	1089	1089
query71	1416	1262	1266	1262
query72	6359	2624	2459	2459
query73	713	322	317	317
query74	6809	6583	6568	6568
query75	3354	2605	2585	2585
query76	3269	963	946	946
query77	371	267	261	261
query78	10924	10245	10229	10229
query79	7689	513	517	513
query80	1467	440	429	429
query81	516	246	243	243
query82	859	94	92	92
query83	198	169	163	163
query84	262	87	83	83
query85	1246	265	264	264
query86	402	297	310	297
query87	3458	3369	3356	3356
query88	5244	2394	2412	2394
query89	473	366	364	364
query90	2005	179	179	179
query91	127	109	96	96
query92	53	45	46	45
query93	5719	499	501	499
query94	1001	176	176	176
query95	380	300	317	300
query96	607	263	263	263
query97	3180	2932	2944	2932
query98	232	212	212	212
query99	1238	883	872	872
Total cold run time: 288377 ms
Total hot run time: 185231 ms

@doris-robot

ClickBench: Total hot run time: 30.9 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ea828ba51fa4d85d6ba47ea7b4bb3af4678a86cf, data reload: false

query1	0.03	0.02	0.03
query2	0.08	0.04	0.05
query3	0.23	0.06	0.05
query4	1.67	0.07	0.08
query5	0.50	0.49	0.50
query6	1.48	0.74	0.72
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.55	0.50	0.49
query10	0.55	0.56	0.54
query11	0.22	0.17	0.17
query12	0.20	0.17	0.18
query13	0.60	0.59	0.57
query14	0.77	0.77	0.77
query15	0.83	0.82	0.80
query16	0.36	0.36	0.35
query17	0.96	0.98	0.99
query18	0.21	0.26	0.23
query19	1.79	1.66	1.69
query20	0.02	0.01	0.01
query21	15.42	0.66	0.65
query22	0.82	0.70	0.69
query23	18.79	1.79	1.50
query24	1.44	0.35	0.24
query25	0.11	0.08	0.07
query26	1.06	0.18	0.18
query27	0.07	0.06	0.07
query28	13.52	0.45	0.42
query29	12.55	5.60	5.63
query30	0.25	0.06	0.05
query31	2.87	0.34	0.34
query32	3.36	0.45	0.45
query33	2.83	2.85	2.81
query34	17.07	4.42	4.39
query35	4.52	4.48	4.44
query36	0.66	0.46	0.48
query37	0.09	0.06	0.05
query38	0.05	0.04	0.03
query39	0.05	0.04	0.03
query40	0.17	0.15	0.14
query41	0.10	0.05	0.05
query42	0.05	0.04	0.05
query43	0.04	0.03	0.03
Total cold run time: 107.01 s
Total hot run time: 30.9 s

@LiBinfeng-01
Contributor Author

run buildall

@py023
Contributor

py023 commented Apr 23, 2024

run external
run feut

@LiBinfeng-01
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 29.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d47e3c1eae93db1018649e26f83b2f22ecd05bf0, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.04	0.03
query3	0.23	0.06	0.05
query4	1.67	0.08	0.07
query5	0.51	0.50	0.53
query6	1.32	0.88	0.83
query7	0.02	0.02	0.01
query8	0.04	0.04	0.04
query9	0.51	0.46	0.46
query10	0.51	0.52	0.50
query11	0.20	0.15	0.16
query12	0.19	0.17	0.16
query13	0.65	0.65	0.66
query14	0.96	1.04	1.06
query15	0.86	0.85	0.85
query16	0.36	0.37	0.37
query17	0.96	1.01	1.05
query18	0.22	0.20	0.26
query19	1.88	1.81	1.77
query20	0.02	0.01	0.01
query21	15.42	0.66	0.66
query22	0.81	0.72	0.70
query23	18.76	1.61	1.48
query24	1.90	0.26	0.25
query25	0.11	0.08	0.08
query26	0.74	0.20	0.19
query27	0.08	0.07	0.07
query28	13.52	0.44	0.43
query29	12.65	3.21	3.33
query30	0.26	0.07	0.06
query31	2.84	0.37	0.38
query32	3.30	0.46	0.46
query33	3.11	2.92	2.86
query34	17.25	4.45	4.66
query35	4.54	4.59	4.46
query36	0.66	0.47	0.47
query37	0.10	0.07	0.07
query38	0.07	0.05	0.06
query39	0.06	0.05	0.05
query40	0.18	0.15	0.16
query41	0.11	0.07	0.06
query42	0.07	0.06	0.06
query43	0.05	0.04	0.04
Total cold run time: 107.82 s
Total hot run time: 29.41 s

5 participants