[feature](functions) impl scalar functions translate and url_encode #40567

Merged (31 commits) on Sep 19, 2024

@suxiaogang223 commented on Sep 9, 2024

Proposed changes

Implement the Presto scalar functions translate and url_encode.
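
For readers unfamiliar with the two functions, here is a minimal Java sketch of the semantics Presto documents for translate and url_encode. It is not the code from this PR: the class and method names are hypothetical, returning null for null input is an assumption, and the actual Doris BE/FE behavior may differ in edge cases.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not a Doris class.
class PrestoStringSketch {

    // translate(source, from, to): each character of `source` that occurs in
    // `from` is replaced by the character at the same position in `to`.
    // If `from` is longer than `to`, the unmatched characters are dropped.
    // If a character occurs twice in `from`, the first occurrence wins.
    static String translate(String source, String from, String to) {
        if (source == null || from == null || to == null) {
            return null; // assumption: NULL in, NULL out
        }
        int[] fromCps = from.codePoints().toArray();
        int[] toCps = to.codePoints().toArray();
        Map<Integer, Integer> mapping = new HashMap<>();
        for (int i = 0; i < fromCps.length; i++) {
            // -1 marks "drop this character"; real code points are never negative.
            mapping.putIfAbsent(fromCps[i], i < toCps.length ? toCps[i] : -1);
        }
        StringBuilder out = new StringBuilder();
        source.codePoints().forEach(cp -> {
            Integer target = mapping.get(cp);
            if (target == null) {
                out.appendCodePoint(cp);      // not in `from`: copy unchanged
            } else if (target >= 0) {
                out.appendCodePoint(target);  // mapped character
            }                                 // else: dropped
        });
        return out.toString();
    }

    // url_encode(value): form-style encoding (space becomes '+', other
    // reserved bytes become %XX). Requires Java 10+ for the Charset overload.
    static String urlEncode(String value) {
        return value == null ? null : URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(translate("abcd", "ac", "z")); // prints "zbd"
        System.out.println(urlEncode("a b&c"));           // prints "a+b%26c"
    }
}
```

Working on code points rather than bytes keeps multi-byte UTF-8 characters intact, which is one of the boundary cases a real implementation has to handle.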

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the documentation has been moved to doris-website.
See the Doris documentation.

@suxiaogang223 marked this pull request as draft on September 9, 2024 14:16
@LiBinfeng-01
Contributor

Constant folding should also be implemented on the FE.
For an example, see acos in ExecutableFunctions.java.
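
As background, constant folding means evaluating a function call at plan time when all of its arguments are literals. The sketch below illustrates the idea with a tiny expression tree. Every type and the EXECUTABLE registry are hypothetical stand-ins and do not reflect the actual structure of ExecutableFunctions.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of constant folding; not Doris code.
class FoldConstantSketch {
    interface Expr {}

    static final class Literal implements Expr {
        final Object value; // null models SQL NULL
        Literal(Object value) { this.value = value; }
    }

    static final class ColumnRef implements Expr {
        final String name;
        ColumnRef(String name) { this.name = name; }
    }

    static final class Call implements Expr {
        final String name;
        final List<Expr> args;
        Call(String name, List<Expr> args) { this.name = name; this.args = args; }
    }

    // Hypothetical registry of functions that can be evaluated at plan time.
    static final Map<String, Function<List<Object>, Object>> EXECUTABLE = new HashMap<>();
    static {
        EXECUTABLE.put("acos", a -> a.get(0) == null ? null : Math.acos((Double) a.get(0)));
    }

    // Fold bottom-up: a call becomes a literal only if every argument folded
    // to a literal and the function is registered as executable.
    static Expr fold(Expr expr) {
        if (!(expr instanceof Call)) {
            return expr;
        }
        Call call = (Call) expr;
        List<Expr> foldedArgs = new ArrayList<>();
        List<Object> values = new ArrayList<>();
        boolean allLiteral = true;
        for (Expr arg : call.args) {
            Expr folded = fold(arg);
            foldedArgs.add(folded);
            if (folded instanceof Literal) {
                values.add(((Literal) folded).value);
            } else {
                allLiteral = false;
            }
        }
        Function<List<Object>, Object> impl = EXECUTABLE.get(call.name);
        if (allLiteral && impl != null) {
            return new Literal(impl.apply(values));
        }
        return new Call(call.name, foldedArgs);
    }

    public static void main(String[] args) {
        Expr folded = fold(new Call("acos", Arrays.asList(new Literal(1.0))));
        System.out.println(((Literal) folded).value); // prints 0.0

        Expr kept = fold(new Call("acos", Arrays.asList(new ColumnRef("x"))));
        System.out.println(kept instanceof Call);     // prints true: nothing to fold
    }
}
```

A call over a column reference is left untouched, so only the fully literal case is evaluated ahead of execution.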

@suxiaogang223 marked this pull request as ready for review on September 10, 2024 09:08
@suxiaogang223
Contributor Author

> should also implement fold constant on fe example: ExecutableFunctions.java : acos

Done. Is there a way to test constant folding?

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.82% (9399/25525)
Line Coverage: 28.23% (77524/274612)
Region Coverage: 27.63% (40012/144836)
Branch Coverage: 24.25% (20345/83910)
Coverage Report: http://coverage.selectdb-in.cc/coverage/7bf384f5d15cefed4c75afea6249e85c680f2096_7bf384f5d15cefed4c75afea6249e85c680f2096/report/index.html

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.87% (9453/25641)
Line Coverage: 28.23% (77698/275246)
Region Coverage: 27.63% (40105/145151)
Branch Coverage: 24.24% (20375/84038)
Coverage Report: http://coverage.selectdb-in.cc/coverage/4c9f50b949d48b6abfb2bf4958cc05015fa6753b_4c9f50b949d48b6abfb2bf4958cc05015fa6753b/report/index.html

@doris-robot

TPC-H: Total hot run time: 38296 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4c9f50b949d48b6abfb2bf4958cc05015fa6753b, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17740	4403	4322	4322
q2	2032	198	188	188
q3	11760	948	1182	948
q4	10522	746	737	737
q5	7748	2877	2864	2864
q6	229	145	141	141
q7	971	618	595	595
q8	9343	2120	2100	2100
q9	7095	6590	6546	6546
q10	7004	2240	2251	2240
q11	455	242	248	242
q12	406	230	232	230
q13	19084	3134	3154	3134
q14	289	241	235	235
q15	535	507	492	492
q16	530	428	428	428
q17	999	717	708	708
q18	7316	6920	7026	6920
q19	1406	1001	1045	1001
q20	678	333	324	324
q21	3861	3192	2906	2906
q22	1111	995	1036	995
Total cold run time: 111114 ms
Total hot run time: 38296 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4363	4291	4294	4291
q2	394	282	271	271
q3	2881	2677	2657	2657
q4	1927	1655	1645	1645
q5	5704	5742	5834	5742
q6	223	144	139	139
q7	2232	1869	1907	1869
q8	3287	3454	3499	3454
q9	8899	8848	8859	8848
q10	3595	3409	3297	3297
q11	609	513	517	513
q12	829	676	652	652
q13	15505	3261	3329	3261
q14	325	300	286	286
q15	542	512	493	493
q16	567	499	500	499
q17	1849	1574	1562	1562
q18	8238	7872	8037	7872
q19	1753	1566	1591	1566
q20	2191	1913	1959	1913
q21	5878	5524	5324	5324
q22	1152	1053	1029	1029
Total cold run time: 72943 ms
Total hot run time: 57183 ms

@doris-robot

TPC-DS: Total hot run time: 196940 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4c9f50b949d48b6abfb2bf4958cc05015fa6753b, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1276	887	851	851
query2	6437	1996	1887	1887
query3	10644	4051	4014	4014
query4	60180	27436	23211	23211
query5	5094	498	502	498
query6	400	157	163	157
query7	5644	303	293	293
query8	316	228	247	228
query9	7594	2497	2504	2497
query10	411	265	261	261
query11	16385	15134	15219	15134
query12	152	106	109	106
query13	1447	415	397	397
query14	10207	7318	7011	7011
query15	224	178	180	178
query16	6771	474	486	474
query17	1104	574	568	568
query18	1530	316	312	312
query19	213	154	160	154
query20	122	116	111	111
query21	215	109	111	109
query22	4594	4465	4581	4465
query23	34250	33525	33484	33484
query24	5979	2935	2811	2811
query25	516	404	414	404
query26	622	153	157	153
query27	1630	282	280	280
query28	3780	2067	2045	2045
query29	661	430	441	430
query30	239	159	149	149
query31	934	770	771	770
query32	76	55	56	55
query33	446	320	296	296
query34	873	477	463	463
query35	836	733	714	714
query36	1060	951	925	925
query37	141	89	83	83
query38	3962	3868	3948	3868
query39	1487	1378	1401	1378
query40	203	119	117	117
query41	48	49	46	46
query42	114	96	97	96
query43	511	483	496	483
query44	1107	778	740	740
query45	201	169	174	169
query46	1098	717	753	717
query47	1946	1817	1849	1817
query48	372	297	306	297
query49	793	456	456	456
query50	835	418	413	413
query51	6937	6911	7114	6911
query52	104	87	86	86
query53	252	182	179	179
query54	566	467	468	467
query55	79	76	77	76
query56	298	270	262	262
query57	1156	1125	1113	1113
query58	230	374	260	260
query59	3064	2796	2863	2796
query60	287	275	265	265
query61	101	96	98	96
query62	728	665	647	647
query63	220	185	184	184
query64	1369	664	660	660
query65	3216	3117	3187	3117
query66	692	336	329	329
query67	15874	15507	15170	15170
query68	2093	547	549	547
query69	413	272	272	272
query70	1162	1081	1139	1081
query71	329	276	268	268
query72	4856	4114	3973	3973
query73	744	320	324	320
query74	9100	8918	8988	8918
query75	3340	2680	2688	2680
query76	1366	973	1037	973
query77	514	324	314	314
query78	10063	9873	9219	9219
query79	1221	864	859	859
query80	1051	816	808	808
query81	548	261	251	251
query82	1359	266	266	266
query83	227	187	186	186
query84	269	102	106	102
query85	705	383	427	383
query86	325	319	302	302
query87	4421	4339	4335	4335
query88	4587	4061	4039	4039
query89	392	371	376	371
query90	1862	309	304	304
query91	120	125	125	125
query92	82	77	76	76
query93	959	909	917	909
query94	715	363	390	363
query95	441	409	403	403
query96	468	468	469	468
query97	3203	3120	3120	3120
query98	231	233	241	233
query99	1497	1317	1279	1279
Total cold run time: 299516 ms
Total hot run time: 196940 ms

@doris-robot

ClickBench: Total hot run time: 31.67 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4c9f50b949d48b6abfb2bf4958cc05015fa6753b, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.06
query4	1.68	0.07	0.07
query5	0.50	0.50	0.50
query6	1.13	0.73	0.73
query7	0.01	0.01	0.02
query8	0.06	0.05	0.05
query9	0.53	0.49	0.49
query10	0.54	0.56	0.55
query11	0.17	0.12	0.12
query12	0.15	0.13	0.12
query13	0.60	0.59	0.58
query14	1.40	1.42	1.40
query15	0.88	0.82	0.82
query16	0.38	0.36	0.38
query17	1.02	1.00	1.03
query18	0.17	0.18	0.18
query19	1.94	1.74	1.71
query20	0.02	0.01	0.02
query21	15.39	0.70	0.68
query22	4.10	7.28	2.24
query23	18.23	1.45	1.25
query24	2.04	0.23	0.23
query25	0.15	0.08	0.08
query26	0.26	0.18	0.18
query27	0.07	0.07	0.07
query28	13.28	1.03	1.00
query29	12.65	3.35	3.31
query30	0.24	0.06	0.05
query31	2.87	0.41	0.41
query32	3.24	0.48	0.49
query33	2.99	3.02	2.95
query34	17.09	4.41	4.46
query35	4.44	4.44	4.46
query36	0.66	0.48	0.48
query37	0.18	0.16	0.15
query38	0.15	0.15	0.14
query39	0.05	0.04	0.04
query40	0.16	0.12	0.14
query41	0.10	0.05	0.06
query42	0.06	0.04	0.04
query43	0.04	0.04	0.04
Total cold run time: 109.98 s
Total hot run time: 31.67 s

@LiBinfeng-01
Contributor

> > should also implement fold constant on fe example: ExecutableFunctions.java : acos
>
> Done, and is there a way to test fold constant?

You can add cases to nereids/p0/expression/fold_constant. Be sure to cover boundary inputs such as null and malformed URL strings, as others mentioned earlier, and use the testFoldConst function in Suite.groovy to check that the FE constant-folding result matches the BE execution result.
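
The consistency check described above can be sketched as a small differential test: run the same boundary inputs through two independent implementations and report any disagreement. The Java below is an illustration under stated assumptions. foldedUrlEncode and executedUrlEncode are hypothetical stand-ins for the FE folding path and the BE execution path, and this is not how testFoldConst in Suite.groovy is implemented.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical differential test; not the Doris regression framework.
class FoldConsistencySketch {

    // Stand-in for the FE constant-folding path (Java 10+ overload).
    static String foldedUrlEncode(String s) {
        return s == null ? null : URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Stand-in for the BE execution path: a byte-wise encoder written
    // independently of the JDK helper above.
    static String executedUrlEncode(String s) {
        if (s == null) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '.' || c == '-' || c == '*' || c == '_';
            if (safe) {
                out.append((char) c);
            } else if (c == ' ') {
                out.append('+');
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    // Returns every input on which the two paths disagree.
    static List<String> mismatches(List<String> inputs) {
        List<String> bad = new ArrayList<>();
        for (String in : inputs) {
            if (!Objects.equals(foldedUrlEncode(in), executedUrlEncode(in))) {
                bad.add(in);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Boundary inputs: NULL, empty string, lone space, stray '%',
        // malformed escape, a full URL, and multi-byte UTF-8 text.
        List<String> boundary = Arrays.asList(
                null, "", " ", "100%", "%zz",
                "http://doris.apache.org/docs?a=1&b=x y", "\u4e2d\u6587");
        System.out.println(mismatches(boundary)); // an empty list means the paths agree
    }
}
```

Returning the disagreeing inputs, rather than a single boolean, makes it easier to see which boundary case exposed a difference.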

be/src/vec/functions/math.cpp (outdated, resolved)
be/src/vec/functions/function_string.h (outdated, resolved)
be/src/util/url_coding.cpp (outdated, resolved)
be/src/vec/functions/function_string.h (outdated, resolved)
@suxiaogang223 changed the title from "[feature](functions) impl scalar functions is_nan, translate and url_encode" to "[feature](functions) impl scalar functions translate and url_encode" on Sep 12, 2024
be/src/vec/functions/function_string.h (outdated, resolved)
be/src/vec/functions/function_string.h (outdated, resolved)
be/src/vec/functions/function_string.h (outdated, resolved)
be/src/util/url_coding.cpp (outdated, resolved)
be/src/vec/functions/function_string.h (outdated, resolved)
@suxiaogang223
Contributor Author

run external

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.89% (9470/25670)
Line Coverage: 28.26% (77857/275541)
Region Coverage: 27.66% (40209/145346)
Branch Coverage: 24.26% (20420/84164)
Coverage Report: http://coverage.selectdb-in.cc/coverage/13f5d7ce12369c65f2b5995a4ec9a295d7054232_13f5d7ce12369c65f2b5995a4ec9a295d7054232/report/index.html

@doris-robot

TPC-H: Total hot run time: 42646 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 13f5d7ce12369c65f2b5995a4ec9a295d7054232, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17568	7328	7214	7214
q2	2049	184	177	177
q3	10483	1280	1389	1280
q4	10521	972	1012	972
q5	7734	3137	3127	3127
q6	242	154	150	150
q7	1044	648	613	613
q8	9448	2010	1986	1986
q9	6797	6300	6331	6300
q10	7030	2509	2521	2509
q11	431	248	246	246
q12	402	229	228	228
q13	17760	2984	3019	2984
q14	291	258	255	255
q15	574	525	538	525
q16	515	422	429	422
q17	981	951	953	951
q18	7406	6758	6657	6657
q19	1381	1225	1222	1222
q20	610	342	327	327
q21	3863	3511	3517	3511
q22	1085	990	1013	990
Total cold run time: 108215 ms
Total hot run time: 42646 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	7201	7190	7137	7137
q2	338	239	244	239
q3	3076	3100	3059	3059
q4	2070	2055	2062	2055
q5	5673	5596	5646	5596
q6	243	146	147	146
q7	2136	1776	1745	1745
q8	3411	3386	3392	3386
q9	8700	8864	8693	8693
q10	3525	3522	3564	3522
q11	585	485	518	485
q12	812	625	675	625
q13	10136	3200	3177	3177
q14	308	296	269	269
q15	606	545	548	545
q16	523	477	461	461
q17	1805	1737	1780	1737
q18	8607	8195	7930	7930
q19	1762	1762	1760	1760
q20	2109	1871	1859	1859
q21	5831	5525	5489	5489
q22	1101	1001	1044	1001
Total cold run time: 70558 ms
Total hot run time: 60916 ms

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41717 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 72812ee7a05ece367c585f61bb0edad1fc696e3b, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17579	7304	7363	7304
q2	2054	163	163	163
q3	10819	1079	1171	1079
q4	10449	726	739	726
q5	7788	3097	3072	3072
q6	234	148	156	148
q7	1013	614	590	590
q8	9428	2058	2071	2058
q9	6820	6441	6435	6435
q10	7042	2259	2333	2259
q11	443	246	244	244
q12	414	226	222	222
q13	17784	2985	3016	2985
q14	233	211	227	211
q15	580	509	518	509
q16	707	621	617	617
q17	981	831	790	790
q18	7320	6602	6752	6602
q19	1389	1003	1082	1003
q20	575	306	292	292
q21	4149	3390	3392	3390
q22	1093	1023	1018	1018
Total cold run time: 108894 ms
Total hot run time: 41717 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	7178	7218	7883	7218
q2	329	235	233	233
q3	3085	2981	3002	2981
q4	2161	1869	1824	1824
q5	5571	5600	5640	5600
q6	245	146	147	146
q7	2216	1790	1788	1788
q8	3350	3406	3426	3406
q9	8772	8967	8750	8750
q10	3510	3435	3498	3435
q11	594	476	490	476
q12	827	617	588	588
q13	10164	3190	3194	3190
q14	312	278	264	264
q15	586	532	535	532
q16	706	712	689	689
q17	1801	1601	1587	1587
q18	8360	7811	7755	7755
q19	1728	1656	1572	1572
q20	2165	1896	1886	1886
q21	5757	5291	5613	5291
q22	1109	1069	1040	1040
Total cold run time: 70526 ms
Total hot run time: 60251 ms

@doris-robot

TPC-DS: Total hot run time: 198232 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 72812ee7a05ece367c585f61bb0edad1fc696e3b, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1280	898	903	898
query2	6410	2062	2007	2007
query3	10770	4022	3934	3934
query4	64068	29750	23380	23380
query5	5101	470	458	458
query6	418	169	160	160
query7	5441	310	294	294
query8	326	234	220	220
query9	8406	2623	2612	2612
query10	459	279	266	266
query11	17525	15178	15807	15178
query12	170	97	107	97
query13	1457	426	415	415
query14	10328	7439	7046	7046
query15	205	177	179	177
query16	6890	507	473	473
query17	1212	610	583	583
query18	1720	301	314	301
query19	217	190	149	149
query20	128	124	112	112
query21	211	99	105	99
query22	4679	4608	4632	4608
query23	34629	33800	33704	33704
query24	6111	2901	2843	2843
query25	504	394	395	394
query26	633	158	160	158
query27	1618	275	295	275
query28	4342	2443	2428	2428
query29	686	429	427	427
query30	231	153	148	148
query31	959	796	802	796
query32	69	57	55	55
query33	434	290	292	290
query34	898	482	473	473
query35	833	726	723	723
query36	1062	928	944	928
query37	151	88	83	83
query38	4063	3947	3931	3931
query39	1584	1415	1400	1400
query40	199	95	94	94
query41	48	47	45	45
query42	122	95	93	93
query43	514	485	491	485
query44	1148	802	784	784
query45	193	162	163	162
query46	1127	752	755	752
query47	1915	1795	1826	1795
query48	470	381	357	357
query49	675	408	413	408
query50	838	397	401	397
query51	7124	6918	6944	6918
query52	97	82	87	82
query53	245	176	182	176
query54	562	454	458	454
query55	79	72	76	72
query56	282	261	277	261
query57	1214	1080	1096	1080
query58	218	228	241	228
query59	3267	3065	2792	2792
query60	286	258	253	253
query61	103	103	100	100
query62	741	673	657	657
query63	218	185	177	177
query64	1369	649	618	618
query65	3242	3159	3200	3159
query66	674	296	301	296
query67	15788	15477	15309	15309
query68	1300	852	830	830
query69	448	340	362	340
query70	1221	1185	1215	1185
query71	325	328	330	328
query72	6074	3455	3473	3455
query73	582	572	568	568
query74	9185	8971	8965	8965
query75	2956	2860	2933	2860
query76	1029	852	858	852
query77	438	353	363	353
query78	9475	9168	9308	9168
query79	901	884	862	862
query80	587	570	593	570
query81	463	248	250	248
query82	230	235	230	230
query83	159	156	153	153
query84	265	108	95	95
query85	666	362	359	359
query86	316	320	331	320
query87	4377	4367	4283	4283
query88	4332	4027	4015	4015
query89	368	371	356	356
query90	1378	316	309	309
query91	167	178	166	166
query92	72	76	74	74
query93	901	871	873	871
query94	554	361	368	361
query95	428	408	410	408
query96	479	477	485	477
query97	3115	3118	3168	3118
query98	224	229	231	229
query99	1415	1300	1297	1297
Total cold run time: 303416 ms
Total hot run time: 198232 ms

@doris-robot

ClickBench: Total hot run time: 32.49 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 72812ee7a05ece367c585f61bb0edad1fc696e3b, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.05	0.04
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.64	0.10	0.10
query5	0.50	0.49	0.49
query6	1.13	0.73	0.73
query7	0.01	0.02	0.01
query8	0.04	0.03	0.03
query9	0.56	0.49	0.50
query10	0.55	0.54	0.53
query11	0.14	0.11	0.11
query12	0.14	0.11	0.11
query13	0.60	0.59	0.60
query14	3.11	2.99	3.09
query15	0.91	0.82	0.81
query16	0.39	0.38	0.38
query17	1.03	1.06	1.06
query18	0.19	0.19	0.20
query19	1.95	1.82	2.01
query20	0.02	0.01	0.01
query21	15.35	0.60	0.61
query22	2.64	2.42	1.67
query23	17.44	0.87	0.75
query24	2.99	0.96	1.65
query25	0.34	0.09	0.09
query26	0.52	0.14	0.14
query27	0.04	0.04	0.04
query28	10.12	1.12	1.06
query29	12.54	3.25	3.21
query30	0.24	0.06	0.06
query31	2.89	0.37	0.37
query32	3.29	0.46	0.46
query33	2.97	2.98	3.07
query34	17.01	4.35	4.39
query35	4.39	4.41	4.41
query36	0.66	0.48	0.48
query37	0.08	0.05	0.05
query38	0.04	0.03	0.04
query39	0.04	0.02	0.02
query40	0.14	0.12	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.02	0.03
Total cold run time: 107.11 s
Total hot run time: 32.49 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.32% (9587/25690)
Line Coverage: 28.69% (79227/276114)
Region Coverage: 28.17% (41022/145645)
Branch Coverage: 24.79% (20906/84338)
Coverage Report: http://coverage.selectdb-in.cc/coverage/72812ee7a05ece367c585f61bb0edad1fc696e3b_72812ee7a05ece367c585f61bb0edad1fc696e3b/report/index.html

Contributor

@zclllyybb left a comment

LGTM

Contributor

PR approved by anyone and no changes requested.

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Sep 19, 2024
Contributor

PR approved by at least one committer and no changes requested.

@morningman merged commit b2a50a3 into apache:master on Sep 19, 2024
24 of 28 checks passed
@suxiaogang223 deleted the some_presto_functions branch on September 19, 2024 08:16
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Sep 20, 2024
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Sep 22, 2024
morningman pushed a commit that referenced this pull request Sep 23, 2024
## Proposed changes

pick #40567

Some of the constant-folding code has to wait until the following PR is picked:
#40441
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Oct 10, 2024
Labels: approved, dev/2.1.7-merged, dev/3.0.x, reviewed
8 participants