[Fix](Outfile) upgrade apache-arrow version to 13.0.0 #35142

Draft
wants to merge 3 commits into base: branch-2.0

Contributor

@BePPPower BePPPower commented May 21, 2024

Proposed changes

Issue Number: close #xxx

When an invalid endpoint is specified for S3, exporting a large volume of data with SELECT INTO OUTFILE causes a BE core dump.
The crash is caused by an upstream Arrow bug: apache/arrow#35520

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the documentation has been moved to doris-website.
See Doris Document.

@BePPPower BePPPower changed the title from "[Fix](Outfile) upgrade apache-arrow version" to "[Fix](Outfile) upgrade apache-arrow version to 13.0.0" May 21, 2024
@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 49206 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 466bddcde79354928d63f3bb8106d261b95d8db6, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17785	4408	4332	4332
q2	2034	152	145	145
q3	10456	1908	1891	1891
q4	10332	1233	1336	1233
q5	8565	3905	3900	3900
q6	229	125	125	125
q7	2033	1602	1607	1602
q8	9291	2711	2690	2690
q9	10418	10249	10044	10044
q10	8655	3484	3506	3484
q11	432	246	236	236
q12	468	296	297	296
q13	18307	3931	4024	3931
q14	351	325	334	325
q15	506	470	465	465
q16	666	576	575	575
q17	1130	967	936	936
q18	7236	6798	6789	6789
q19	1715	1583	1515	1515
q20	559	296	306	296
q21	4553	4129	4027	4027
q22	498	369	389	369
Total cold run time: 116219 ms
Total hot run time: 49206 ms
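
The two Round 1 totals can be cross-checked against the per-query rows. The following is a minimal sketch, assuming the unlabeled columns are the cold run, two hot runs, and the faster of the two hot runs. The helper name `tpch_totals` is illustrative and is not part of the Doris tooling.

```python
# Round 1 rows copied from the report above: query name, then four timings in ms.
# Assumption: the columns are cold, hot run 1, hot run 2, best hot run.
ROUND_1 = """
q1 17785 4408 4332 4332
q2 2034 152 145 145
q3 10456 1908 1891 1891
q4 10332 1233 1336 1233
q5 8565 3905 3900 3900
q6 229 125 125 125
q7 2033 1602 1607 1602
q8 9291 2711 2690 2690
q9 10418 10249 10044 10044
q10 8655 3484 3506 3484
q11 432 246 236 236
q12 468 296 297 296
q13 18307 3931 4024 3931
q14 351 325 334 325
q15 506 470 465 465
q16 666 576 575 575
q17 1130 967 936 936
q18 7236 6798 6789 6789
q19 1715 1583 1515 1515
q20 559 296 306 296
q21 4553 4129 4027 4027
q22 498 369 389 369
"""


def tpch_totals(report: str) -> tuple[int, int]:
    """Return (total cold ms, total hot ms) for a block of result rows."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        _query, cold, hot1, hot2, _best = line.split()
        cold_total += int(cold)
        # The hot total counts only the faster of the two hot runs.
        hot_total += min(int(hot1), int(hot2))
    return cold_total, hot_total


print(tpch_totals(ROUND_1))  # (116219, 49206), matching the reported totals
```

Under that assumption, the cold total is the sum of the first timing column and the hot total is the sum of the per-query minimum of the two hot runs.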

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4305	4302	4256	4256
q2	322	218	219	218
q3	4166	4123	4141	4123
q4	2748	2738	2734	2734
q5	7136	7078	7133	7078
q6	232	122	121	121
q7	3250	2806	2817	2806
q8	4305	4461	4444	4444
q9	16756	16714	16649	16649
q10	4234	4264	4245	4245
q11	755	679	701	679
q12	1021	869	848	848
q13	7126	3744	3737	3737
q14	445	420	416	416
q15	514	455	446	446
q16	725	666	674	666
q17	3829	3924	3854	3854
q18	8803	8763	8741	8741
q19	1717	1697	1653	1653
q20	2385	2164	2077	2077
q21	8456	8391	8480	8391
q22	1018	921	927	921
Total cold run time: 84248 ms
Total hot run time: 79103 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.79% (8076/21369)
Line Coverage: 29.45% (65940/223905)
Region Coverage: 28.92% (33945/117388)
Branch Coverage: 24.77% (17414/70316)
Coverage Report: http://coverage.selectdb-in.cc/coverage/466bddcde79354928d63f3bb8106d261b95d8db6_466bddcde79354928d63f3bb8106d261b95d8db6/report/index.html

@doris-robot

TPC-DS: Total hot run time: 203495 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 466bddcde79354928d63f3bb8106d261b95d8db6, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	919	381	371	371
query2	6537	2777	2616	2616
query3	6914	211	207	207
query4	20155	17957	18010	17957
query5	19736	6532	6462	6462
query6	272	215	215	215
query7	4164	295	311	295
query8	261	239	231	231
query9	3120	2688	2641	2641
query10	425	308	303	303
query11	12052	10781	10682	10682
query12	118	72	73	72
query13	5585	687	663	663
query14	17470	13585	13556	13556
query15	359	228	231	228
query16	6477	282	254	254
query17	1721	1458	879	879
query18	2310	403	401	401
query19	210	146	145	145
query20	79	77	77	77
query21	188	101	93	93
query22	5314	5141	5148	5141
query23	32757	32214	32684	32214
query24	6981	6704	6651	6651
query25	563	430	432	430
query26	536	165	170	165
query27	1895	314	308	308
query28	6568	2372	2323	2323
query29	3015	2806	2763	2763
query30	242	163	167	163
query31	953	748	746	746
query32	71	62	61	61
query33	408	253	257	253
query34	999	494	500	494
query35	1694	961	945	945
query36	1425	1296	1079	1079
query37	91	63	61	61
query38	3036	2956	2926	2926
query39	1379	1321	1309	1309
query40	207	91	91	91
query41	43	36	35	35
query42	79	80	87	80
query43	712	665	658	658
query44	1154	719	726	719
query45	242	223	226	223
query46	1214	954	978	954
query47	1952	1825	1833	1825
query48	1031	691	695	691
query49	627	373	369	369
query50	878	643	605	605
query51	4686	4643	4648	4643
query52	91	87	83	83
query53	446	322	313	313
query54	2684	2448	2480	2448
query55	98	76	86	76
query56	226	206	201	201
query57	1209	1217	1107	1107
query58	212	201	186	186
query59	4180	3941	3792	3792
query60	205	199	212	199
query61	85	82	83	82
query62	851	524	494	494
query63	479	342	338	338
query64	2540	1540	1394	1394
query65	3625	3577	3614	3577
query66	807	370	380	370
query67	15374	15170	17102	15170
query68	8787	644	650	644
query69	568	349	370	349
query70	1581	1473	1394	1394
query71	412	303	313	303
query72	6494	3391	3391	3391
query73	733	326	316	316
query74	6346	5835	5820	5820
query75	5265	3631	3699	3631
query76	5166	1135	1127	1127
query77	764	248	258	248
query78	12630	11702	11570	11570
query79	8702	635	634	634
query80	1441	391	384	384
query81	496	235	233	233
query82	1649	99	98	98
query83	163	127	133	127
query84	260	70	69	69
query85	866	295	293	293
query86	324	282	293	282
query87	3228	3021	3020	3020
query88	4968	2336	2351	2336
query89	422	279	282	279
query90	2049	183	194	183
query91	169	136	130	130
query92	57	52	50	50
query93	6590	607	570	570
query94	734	204	198	198
query95	1073	1048	1054	1048
query96	648	337	326	326
query97	6517	6329	6410	6329
query98	195	171	178	171
query99	3009	942	839	839
Total cold run time: 315242 ms
Total hot run time: 203495 ms

@doris-robot

ClickBench: Total hot run time: 30.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 466bddcde79354928d63f3bb8106d261b95d8db6, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.02	0.02
query2	0.07	0.03	0.03
query3	0.25	0.05	0.04
query4	1.79	0.06	0.06
query5	0.53	0.52	0.51
query6	1.24	0.60	0.66
query7	0.02	0.01	0.01
query8	0.03	0.03	0.02
query9	0.53	0.47	0.47
query10	0.53	0.53	0.52
query11	0.12	0.09	0.08
query12	0.11	0.09	0.08
query13	0.64	0.61	0.61
query14	0.80	0.79	0.79
query15	0.78	0.78	0.76
query16	0.39	0.38	0.39
query17	0.96	0.98	1.03
query18	0.22	0.27	0.24
query19	1.94	1.84	1.82
query20	0.01	0.01	0.01
query21	15.50	0.56	0.55
query22	2.04	2.15	1.58
query23	17.16	0.95	0.86
query24	6.32	1.23	1.22
query25	0.36	0.11	0.05
query26	0.72	0.15	0.16
query27	0.04	0.03	0.03
query28	6.54	0.75	0.75
query29	12.71	2.30	2.28
query30	0.60	0.49	0.52
query31	2.82	0.38	0.37
query32	3.41	0.50	0.50
query33	3.05	3.08	3.07
query34	15.26	4.78	4.79
query35	4.86	4.85	4.84
query36	1.06	1.01	1.02
query37	0.06	0.04	0.05
query38	0.04	0.02	0.02
query39	0.01	0.01	0.02
query40	0.16	0.14	0.14
query41	0.07	0.01	0.02
query42	0.02	0.01	0.01
query43	0.03	0.01	0.02
Total cold run time: 103.83 s
Total hot run time: 30.8 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 466bddcde79354928d63f3bb8106d261b95d8db6 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       21.4 seconds inserted 10000000 Rows, about 467K ops/s
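
The "MB/s" figures above appear to use 1 MB = 1024 × 1024 bytes and to be truncated to a whole number. That is an inference from the reported values, not something the bot states. A minimal sketch of the arithmetic, with illustrative helper names:

```python
MIB = 1024 * 1024  # assumption: the report's "MB" is a binary megabyte


def throughput_mib_s(num_bytes: int, seconds: float) -> int:
    """Whole MiB per second, truncated rather than rounded."""
    return int(num_bytes / MIB / seconds)


def rows_per_second_k(rows: int, seconds: float) -> int:
    """Thousands of rows per second, truncated rather than rounded."""
    return int(rows / seconds / 1000)


print(throughput_mib_s(2358488459, 19))   # Stream load json
print(throughput_mib_s(1101869774, 58))   # Stream load orc
print(throughput_mib_s(861443392, 31))    # Stream load parquet
print(rows_per_second_k(10000000, 21.4))  # Insert into select
```

With decimal megabytes (10^6 bytes) the JSON figure would come out as 124 rather than 118, which is why the binary interpretation seems more likely.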

@@ -399,10 +412,10 @@ BENCHMARK_MD5SUM="8ddf8571d3f6198d37852bcbd964f817"

# xsimd
# for arrow-7.0.0, if arrow upgrade, this version may also need to be changed
Contributor

change comment

Contributor Author

done

Contributor

@xinyiZzz xinyiZzz left a comment

LGTM

@BePPPower
Contributor Author

run buildall

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) May 22, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 49451 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ff07338349d02fe88658ce3a7e9ab429ea955b52, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17580	4317	4336	4317
q2	2040	155	144	144
q3	10424	1912	1964	1912
q4	10317	1255	1353	1255
q5	8786	3863	3875	3863
q6	230	127	126	126
q7	2073	1592	1590	1590
q8	9311	2707	2709	2707
q9	10547	10305	10251	10251
q10	8644	3514	3453	3453
q11	419	241	255	241
q12	458	300	299	299
q13	18373	3935	4018	3935
q14	357	341	327	327
q15	510	464	454	454
q16	685	583	580	580
q17	1127	972	952	952
q18	7232	6984	6789	6789
q19	1715	1561	1530	1530
q20	548	303	304	303
q21	4464	4089	4041	4041
q22	496	410	382	382
Total cold run time: 116336 ms
Total hot run time: 49451 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4310	4289	4299	4289
q2	313	224	215	215
q3	4133	4127	4148	4127
q4	2761	2741	2740	2740
q5	7182	7120	7029	7029
q6	234	123	119	119
q7	3220	2843	2780	2780
q8	4279	4414	4474	4414
q9	16859	16829	16727	16727
q10	4276	4296	4216	4216
q11	765	671	704	671
q12	1032	873	851	851
q13	7147	3731	3751	3731
q14	458	423	426	423
q15	513	459	456	456
q16	747	704	687	687
q17	3850	3863	3822	3822
q18	8826	8664	8767	8664
q19	1691	1710	1628	1628
q20	2378	2136	2126	2126
q21	8486	8461	8351	8351
q22	1015	941	947	941
Total cold run time: 84475 ms
Total hot run time: 79007 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.79% (8076/21369)
Line Coverage: 29.45% (65944/223905)
Region Coverage: 28.92% (33948/117388)
Branch Coverage: 24.78% (17421/70316)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ff07338349d02fe88658ce3a7e9ab429ea955b52_ff07338349d02fe88658ce3a7e9ab429ea955b52/report/index.html

@doris-robot

TPC-DS: Total hot run time: 202347 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ff07338349d02fe88658ce3a7e9ab429ea955b52, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	937	384	375	375
query2	6544	2697	2381	2381
query3	6916	205	210	205
query4	21374	18032	17887	17887
query5	19747	6555	6543	6543
query6	274	223	223	223
query7	4153	316	326	316
query8	250	247	229	229
query9	3151	2674	2639	2639
query10	400	300	287	287
query11	11302	10768	10793	10768
query12	123	78	81	78
query13	5581	685	679	679
query14	17939	13633	13475	13475
query15	358	221	238	221
query16	6470	271	254	254
query17	1762	1453	867	867
query18	2335	398	405	398
query19	203	149	151	149
query20	80	78	79	78
query21	191	93	97	93
query22	5341	5022	5034	5022
query23	32473	31914	31986	31914
query24	7013	6549	6582	6549
query25	517	437	405	405
query26	529	157	164	157
query27	1882	296	288	288
query28	6228	2347	2309	2309
query29	2966	2699	2872	2699
query30	241	164	162	162
query31	918	741	752	741
query32	70	62	57	57
query33	400	247	245	245
query34	854	464	470	464
query35	1126	918	894	894
query36	1281	1091	1158	1091
query37	94	62	63	62
query38	3056	2925	2927	2925
query39	1390	1317	1328	1317
query40	201	93	93	93
query41	37	42	35	35
query42	85	87	81	81
query43	719	720	713	713
query44	1140	711	715	711
query45	237	235	223	223
query46	1227	953	959	953
query47	1812	1917	1708	1708
query48	1006	724	699	699
query49	620	370	362	362
query50	880	612	610	610
query51	4752	4614	4613	4613
query52	95	81	82	81
query53	444	320	319	319
query54	2647	2430	2428	2428
query55	90	78	87	78
query56	228	233	209	209
query57	1170	1141	1076	1076
query58	219	202	200	200
query59	4302	4279	3864	3864
query60	216	213	196	196
query61	90	82	83	82
query62	866	455	470	455
query63	474	344	333	333
query64	2542	1477	1444	1444
query65	3615	3545	3705	3545
query66	820	433	387	387
query67	17299	15017	15411	15017
query68	8349	651	654	651
query69	550	333	349	333
query70	1796	1479	1298	1298
query71	398	306	324	306
query72	6526	3404	3409	3404
query73	739	328	317	317
query74	6352	5930	5855	5855
query75	4499	3780	3704	3704
query76	4685	1154	1216	1154
query77	562	258	250	250
query78	12590	11530	11901	11530
query79	8157	640	646	640
query80	1667	385	390	385
query81	513	231	234	231
query82	1481	99	93	93
query83	169	130	135	130
query84	262	69	69	69
query85	1265	295	294	294
query86	357	293	289	289
query87	3205	2968	2984	2968
query88	5240	2344	2329	2329
query89	371	292	286	286
query90	1809	210	220	210
query91	178	138	149	138
query92	61	53	52	52
query93	4695	559	606	559
query94	871	204	197	197
query95	1113	1043	1056	1043
query96	649	336	326	326
query97	6480	6289	6454	6289
query98	194	181	175	175
query99	2827	934	881	881
Total cold run time: 312912 ms
Total hot run time: 202347 ms

@doris-robot

ClickBench: Total hot run time: 30.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ff07338349d02fe88658ce3a7e9ab429ea955b52, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.02	0.02	0.02
query2	0.07	0.02	0.01
query3	0.25	0.04	0.06
query4	1.77	0.08	0.11
query5	0.53	0.52	0.52
query6	1.25	0.61	0.61
query7	0.02	0.01	0.01
query8	0.03	0.02	0.02
query9	0.51	0.47	0.48
query10	0.53	0.54	0.54
query11	0.12	0.09	0.08
query12	0.12	0.09	0.09
query13	0.62	0.62	0.61
query14	0.80	0.78	0.78
query15	0.77	0.77	0.75
query16	0.36	0.36	0.36
query17	1.00	1.03	1.00
query18	0.20	0.26	0.24
query19	1.84	1.85	1.85
query20	0.02	0.01	0.01
query21	15.46	0.56	0.54
query22	2.16	2.23	1.52
query23	16.87	0.98	0.88
query24	6.12	1.65	0.91
query25	0.38	0.08	0.07
query26	0.73	0.15	0.14
query27	0.04	0.04	0.05
query28	6.46	0.76	0.75
query29	12.62	2.38	2.24
query30	0.60	0.54	0.51
query31	2.81	0.39	0.37
query32	3.38	0.50	0.51
query33	3.08	3.07	3.02
query34	15.27	4.79	4.79
query35	4.88	4.80	4.82
query36	1.07	1.02	1.02
query37	0.06	0.04	0.04
query38	0.04	0.02	0.02
query39	0.02	0.01	0.02
query40	0.16	0.15	0.14
query41	0.06	0.01	0.02
query42	0.02	0.02	0.01
query43	0.02	0.02	0.02
Total cold run time: 103.14 s
Total hot run time: 30.44 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit ff07338349d02fe88658ce3a7e9ab429ea955b52 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       21.5 seconds inserted 10000000 Rows, about 465K ops/s
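
For a quick comparison of the two benchmark runs in this thread (commit 466bddc first, commit ff07338 second), the reported hot totals can be turned into relative changes. This is a sketch using only the totals quoted above. The helper name `pct_change` is illustrative, and the result is plain arithmetic that says nothing about run-to-run noise.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (first run, second run) hot totals as reported by the bot.
hot_totals = {
    "TPC-H round 1 (ms)": (49206, 49451),
    "TPC-DS (ms)": (203495, 202347),
    "ClickBench (s)": (30.8, 30.44),
}

for name, (first, second) in hot_totals.items():
    print(f"{name}: {pct_change(first, second):+.2f}%")
```

All three changes are within roughly one percent in either direction.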

Contributor

@morningman morningman left a comment

LGTM

@xiaokang
Contributor

@BePPPower please add all upgrade changes to the description and change the title

@xiaokang xiaokang marked this pull request as draft May 27, 2024 09:56
Labels
approved (Indicates a PR has been approved by one committer.), reviewed
5 participants