[Fix](trino-connector) fix hive split info of trino-connector catalog #32615

Merged (2 commits) on Mar 29, 2024

Conversation

BePPPower
Contributor

@BePPPower BePPPower commented Mar 21, 2024

Proposed changes

Issue Number: close #xxx

Previously, the ScanNode of a trino-connector catalog did not display split information.

Now the corresponding split information is obtained through the Trino connector API.

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@BePPPower BePPPower marked this pull request as ready for review March 25, 2024 01:48
@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37667 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a801809c682ebd087751e22598c982dc5662ef37, data reload: false

------ Round 1 ----------------------------------
q1	17628	4170	4046	4046
q2	2102	150	146	146
q3	10609	1134	1178	1134
q4	10235	730	810	730
q5	7439	3021	2971	2971
q6	203	123	120	120
q7	1029	582	567	567
q8	9343	2004	1994	1994
q9	7193	6556	6537	6537
q10	8452	3431	3580	3431
q11	438	229	212	212
q12	414	194	197	194
q13	17790	2843	2863	2843
q14	222	193	205	193
q15	521	452	473	452
q16	502	372	374	372
q17	950	537	613	537
q18	7150	6431	6439	6431
q19	1537	1461	1350	1350
q20	542	244	252	244
q21	3630	2880	2958	2880
q22	347	288	283	283
Total cold run time: 108276 ms
Total hot run time: 37667 ms
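
The bot does not label the result columns. A reading that is consistent with the numbers above is: query name, cold run, first hot run, second hot run, and the faster of the two hot runs, all in milliseconds. Under that assumption, "Total cold run time" is the sum of the second column and "Total hot run time" is the sum of the last column. The sketch below recomputes both totals from raw rows. The class and method names are hypothetical and are not part of the Doris tooling.

```java
import java.util.List;

public class HotRunTotal {
    /** Parses a row such as "q2 2102 150 146 146" into {cold, hot1, hot2}. */
    static long[] parse(String line) {
        String[] f = line.trim().split("\\s+");
        return new long[] {
            Long.parseLong(f[1]), // cold run (assumed column meaning)
            Long.parseLong(f[2]), // first hot run
            Long.parseLong(f[3])  // second hot run
        };
    }

    /** Sum of the cold-run column over all rows. */
    static long totalCold(List<String> lines) {
        return lines.stream().mapToLong(l -> parse(l)[0]).sum();
    }

    /** Sum over all rows of the faster of the two hot runs. */
    static long totalHot(List<String> lines) {
        return lines.stream()
                .mapToLong(l -> {
                    long[] r = parse(l);
                    return Math.min(r[1], r[2]);
                })
                .sum();
    }

    public static void main(String[] args) {
        // Two rows copied from Round 1 above; a full check would pass all 22.
        List<String> rows = List.of(
                "q2\t2102\t150\t146\t146",
                "q6\t203\t123\t120\t120");
        System.out.println("cold=" + totalCold(rows) + " ms, hot=" + totalHot(rows) + " ms");
    }
}
```

Feeding all 22 Round 1 rows through the same two sums gives the 108276 ms and 37667 ms reported by the bot, which is what supports the column reading assumed here.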

----- Round 2, with runtime_filter_mode=off -----
q1	4122	4070	4095	4070
q2	324	236	233	233
q3	2929	2810	2777	2777
q4	1812	1514	1525	1514
q5	5305	5302	5317	5302
q6	196	117	118	117
q7	2235	1849	1871	1849
q8	3149	3301	3258	3258
q9	8676	8628	8692	8628
q10	3785	3806	3763	3763
q11	534	441	443	441
q12	714	540	553	540
q13	16911	2863	2877	2863
q14	276	247	255	247
q15	502	454	464	454
q16	460	407	422	407
q17	1737	1480	1495	1480
q18	7374	7166	7128	7128
q19	1612	1477	1482	1477
q20	1904	1739	1720	1720
q21	4847	4724	4772	4724
q22	514	457	439	439
Total cold run time: 69918 ms
Total hot run time: 53431 ms

@doris-robot

TPC-DS: Total hot run time: 186658 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a801809c682ebd087751e22598c982dc5662ef37, data reload: false

query1	944	365	340	340
query2	7363	2129	2002	2002
query3	6710	219	216	216
query4	31454	21422	21286	21286
query5	4322	413	495	413
query6	277	185	175	175
query7	4623	292	294	292
query8	229	174	171	171
query9	9484	2253	2250	2250
query10	562	259	258	258
query11	14790	14447	14410	14410
query12	138	88	86	86
query13	1632	419	417	417
query14	11649	11344	11657	11344
query15	294	197	200	197
query16	8231	265	264	264
query17	1974	544	541	541
query18	2106	275	271	271
query19	343	145	156	145
query20	95	92	91	91
query21	208	128	128	128
query22	5002	4802	4803	4802
query23	33510	32760	32562	32562
query24	10689	2882	2823	2823
query25	585	378	359	359
query26	781	158	159	158
query27	2221	354	351	351
query28	5828	1834	1864	1834
query29	846	642	593	593
query30	301	151	147	147
query31	1002	736	729	729
query32	89	57	53	53
query33	767	261	254	254
query34	1034	496	499	496
query35	821	611	614	611
query36	1011	883	870	870
query37	113	75	74	74
query38	3531	3445	3464	3445
query39	1443	1424	1424	1424
query40	211	112	108	108
query41	48	43	44	43
query42	101	94	96	94
query43	472	455	466	455
query44	1211	721	705	705
query45	265	248	264	248
query46	1116	685	689	685
query47	1914	1827	1861	1827
query48	442	349	356	349
query49	1090	334	340	334
query50	780	380	379	379
query51	6693	6618	6558	6558
query52	103	102	90	90
query53	348	283	285	283
query54	333	246	245	245
query55	84	82	80	80
query56	258	256	242	242
query57	1211	1138	1143	1138
query58	239	208	215	208
query59	2736	2605	2469	2469
query60	280	260	254	254
query61	114	115	113	113
query62	674	473	435	435
query63	310	288	276	276
query64	5703	4111	4111	4111
query65	3083	3035	3039	3035
query66	892	366	372	366
query67	15136	14727	14728	14727
query68	6567	516	522	516
query69	630	380	383	380
query70	1262	1193	1138	1138
query71	486	285	292	285
query72	7443	2880	2656	2656
query73	725	318	317	317
query74	8123	6733	6640	6640
query75	3919	2853	2875	2853
query76	4428	871	917	871
query77	607	264	267	264
query78	10864	10043	10131	10043
query79	8240	535	522	522
query80	1989	414	394	394
query81	542	217	218	217
query82	1622	204	210	204
query83	320	148	152	148
query84	288	84	76	76
query85	1523	317	312	312
query86	492	311	290	290
query87	3760	3518	3520	3518
query88	4898	2287	2295	2287
query89	511	366	372	366
query90	1956	180	179	179
query91	195	136	138	136
query92	58	49	48	48
query93	6290	495	476	476
query94	1191	177	178	177
query95	436	347	332	332
query96	594	261	275	261
query97	3078	2890	2898	2890
query98	240	215	201	201
query99	1194	937	922	922
Total cold run time: 307320 ms
Total hot run time: 186658 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit a801809c682ebd087751e22598c982dc5662ef37 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       22.2 seconds inserted 10000000 Rows, about 450K ops/s

@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38241 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a801809c682ebd087751e22598c982dc5662ef37, data reload: false

------ Round 1 ----------------------------------
q1	17671	4498	4213	4213
q2	2578	167	172	167
q3	11475	1131	1249	1131
q4	10650	786	702	702
q5	7606	3018	3051	3018
q6	207	131	122	122
q7	1073	618	607	607
q8	9489	2045	2015	2015
q9	7443	6777	6741	6741
q10	8542	3479	3582	3479
q11	437	224	217	217
q12	372	200	199	199
q13	17811	2869	2852	2852
q14	236	216	198	198
q15	510	473	453	453
q16	486	376	374	374
q17	966	578	594	578
q18	7273	6612	6414	6414
q19	2587	1425	1431	1425
q20	549	255	247	247
q21	3549	2802	2949	2802
q22	341	287	303	287
Total cold run time: 111851 ms
Total hot run time: 38241 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4128	4071	4079	4071
q2	339	232	224	224
q3	2993	2856	2804	2804
q4	1846	1553	1576	1553
q5	5288	5329	5375	5329
q6	190	114	117	114
q7	2233	1873	1860	1860
q8	3141	3280	3277	3277
q9	8716	8719	8737	8719
q10	3804	3757	3788	3757
q11	535	433	448	433
q12	738	566	557	557
q13	16932	2864	2848	2848
q14	278	241	260	241
q15	489	453	449	449
q16	492	411	430	411
q17	1751	1487	1474	1474
q18	7498	7269	7104	7104
q19	1619	1505	1538	1505
q20	1931	1721	1713	1713
q21	4958	4705	4760	4705
q22	538	459	434	434
Total cold run time: 70437 ms
Total hot run time: 53582 ms

@doris-robot

TPC-DS: Total hot run time: 185624 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a801809c682ebd087751e22598c982dc5662ef37, data reload: false

query1	929	380	346	346
query2	7430	2010	1906	1906
query3	6711	208	216	208
query4	31918	21168	21363	21168
query5	4337	404	423	404
query6	280	181	179	179
query7	4621	291	289	289
query8	234	188	192	188
query9	9221	2273	2289	2273
query10	573	245	260	245
query11	17542	14449	14421	14421
query12	134	84	84	84
query13	1623	404	428	404
query14	13756	11035	10816	10816
query15	300	195	184	184
query16	8225	263	251	251
query17	1986	585	549	549
query18	2114	293	280	280
query19	361	159	152	152
query20	95	87	102	87
query21	200	136	130	130
query22	5001	4837	4814	4814
query23	33493	32876	32916	32876
query24	11132	2768	2860	2768
query25	665	396	383	383
query26	1591	158	156	156
query27	2982	349	355	349
query28	7592	1889	1845	1845
query29	961	661	619	619
query30	304	154	154	154
query31	977	738	744	738
query32	95	62	56	56
query33	779	261	296	261
query34	953	473	469	469
query35	843	601	628	601
query36	1039	876	855	855
query37	118	74	73	73
query38	3615	3482	3399	3399
query39	1459	1436	1433	1433
query40	247	108	107	107
query41	51	45	45	45
query42	109	93	92	92
query43	467	440	432	432
query44	1224	719	711	711
query45	281	243	260	243
query46	1102	702	710	702
query47	1918	1869	1863	1863
query48	449	369	340	340
query49	1166	320	341	320
query50	756	367	362	362
query51	6655	6565	6622	6565
query52	114	91	91	91
query53	343	268	268	268
query54	291	249	234	234
query55	85	73	78	73
query56	233	216	224	216
query57	1213	1137	1153	1137
query58	229	198	210	198
query59	2813	2452	2553	2452
query60	269	237	252	237
query61	94	95	93	93
query62	668	468	448	448
query63	297	273	270	270
query64	6508	4012	3818	3818
query65	3108	3047	3065	3047
query66	1450	384	374	374
query67	15096	14823	14689	14689
query68	5934	512	505	505
query69	574	372	372	372
query70	1255	1101	1137	1101
query71	452	286	289	286
query72	6612	2863	2678	2678
query73	717	315	321	315
query74	7174	6649	6661	6649
query75	3760	2825	2861	2825
query76	3934	973	895	895
query77	639	285	260	260
query78	10869	10161	10072	10072
query79	8703	507	507	507
query80	1829	398	393	393
query81	559	210	218	210
query82	1588	213	195	195
query83	325	141	147	141
query84	292	80	81	80
query85	1650	312	311	311
query86	490	290	295	290
query87	3719	3500	3530	3500
query88	5276	2286	2266	2266
query89	534	364	366	364
query90	1948	173	174	173
query91	187	137	139	137
query92	56	46	46	46
query93	7126	495	476	476
query94	1218	175	172	172
query95	424	322	319	319
query96	610	267	280	267
query97	3068	2883	2902	2883
query98	229	218	209	209
query99	1237	920	887	887
Total cold run time: 316413 ms
Total hot run time: 185624 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit a801809c682ebd087751e22598c982dc5662ef37 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       22.2 seconds inserted 10000000 Rows, about 450K ops/s

        initHiveSplitInfo();
        break;
    default:
        LOG.warn("Unknow connector name: " + connectorName);
Contributor

Change this to debug level.
If users use other connectors, this warning could produce a lot of log output.
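
A minimal sketch of what the reviewer is asking for, assuming the dispatch is a `switch` on the connector name as the quoted diff suggests. The class name, the `"hive"` case label, and the boolean return value are hypothetical. `java.util.logging` is used only to keep the example self-contained, with `Level.FINE` standing in for a debug level; the real code uses its own `LOG` object.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class SplitInfoDispatch {
    private static final Logger LOG = Logger.getLogger(SplitInfoDispatch.class.getName());

    /** Returns true when split info is initialised for a recognised connector. */
    static boolean initSplitInfo(String connectorName) {
        switch (connectorName) {
            case "hive": // assumed label; the diff only shows the call below it
                // Stand-in for initHiveSplitInfo() from the quoted diff.
                return true;
            default:
                // Logged below the default INFO threshold, so catalogs backed by
                // other connectors do not emit a warning for every scan.
                LOG.log(Level.FINE, "Unknown connector name: {0}", connectorName);
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(initSplitInfo("hive"));
        System.out.println(initSplitInfo("mysql"));
    }
}
```

With the default logging configuration the `FINE` message is suppressed, so an unrecognised connector is skipped quietly and the message only appears when debug-level logging is switched on.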

@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37427 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit eff0d28a3c66b987fc488ee49c7ad8a54518dc8a, data reload: false

------ Round 1 ----------------------------------
q1	17632	4153	4081	4081
q2	2116	158	154	154
q3	10585	1128	1182	1128
q4	10222	710	776	710
q5	7453	2911	2869	2869
q6	201	123	123	123
q7	1027	592	566	566
q8	9323	1964	1995	1964
q9	6874	6326	6280	6280
q10	8485	3488	3561	3488
q11	440	225	225	225
q12	426	200	195	195
q13	17810	2851	2904	2851
q14	230	210	215	210
q15	520	472	458	458
q16	491	376	364	364
q17	935	546	542	542
q18	7055	6446	6450	6446
q19	4122	1428	1470	1428
q20	543	263	273	263
q21	3595	2968	2792	2792
q22	337	296	290	290
Total cold run time: 110422 ms
Total hot run time: 37427 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4105	4053	4067	4053
q2	333	240	237	237
q3	2928	2826	2886	2826
q4	1802	1582	1546	1546
q5	5196	5214	5231	5214
q6	195	116	119	116
q7	2220	1787	1819	1787
q8	3148	3282	3249	3249
q9	8402	8384	8434	8384
q10	3789	3815	3954	3815
q11	565	469	479	469
q12	765	602	614	602
q13	17802	3149	3156	3149
q14	297	284	283	283
q15	547	496	501	496
q16	511	473	475	473
q17	1813	1557	1542	1542
q18	8073	7594	7481	7481
q19	1691	1598	1544	1544
q20	2010	1815	1847	1815
q21	4975	4900	4815	4815
q22	555	442	452	442
Total cold run time: 71722 ms
Total hot run time: 54338 ms

@doris-robot

TPC-DS: Total hot run time: 182919 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit eff0d28a3c66b987fc488ee49c7ad8a54518dc8a, data reload: false

query1	917	349	347	347
query2	6419	2102	1962	1962
query3	6706	205	203	203
query4	31589	21260	21426	21260
query5	4271	390	383	383
query6	260	198	163	163
query7	4616	292	287	287
query8	232	170	165	165
query9	9194	2280	2269	2269
query10	447	253	261	253
query11	14719	14272	14184	14184
query12	138	87	88	87
query13	1626	419	423	419
query14	11135	7908	7968	7908
query15	295	184	185	184
query16	8233	267	264	264
query17	2040	583	564	564
query18	2114	294	288	288
query19	355	163	166	163
query20	94	93	89	89
query21	200	131	127	127
query22	4951	4803	4742	4742
query23	33457	32918	33125	32918
query24	11050	3014	2946	2946
query25	619	400	405	400
query26	1197	168	163	163
query27	3024	359	364	359
query28	7758	1927	1911	1911
query29	894	646	656	646
query30	317	161	158	158
query31	969	780	733	733
query32	92	54	60	54
query33	807	263	254	254
query34	1254	505	518	505
query35	877	746	740	740
query36	1067	937	914	914
query37	130	65	68	65
query38	3793	3742	3697	3697
query39	1694	1621	1658	1621
query40	176	105	111	105
query41	50	47	46	46
query42	106	98	103	98
query43	504	461	459	459
query44	1202	735	762	735
query45	298	274	252	252
query46	1130	731	715	715
query47	2060	1938	1890	1890
query48	459	358	367	358
query49	887	335	339	335
query50	801	384	390	384
query51	6839	6828	6842	6828
query52	115	89	96	89
query53	346	289	287	287
query54	301	274	235	235
query55	84	84	79	79
query56	267	228	229	228
query57	1308	1163	1172	1163
query58	229	209	209	209
query59	3025	2698	2657	2657
query60	260	236	234	234
query61	93	86	90	86
query62	608	460	433	433
query63	300	272	270	270
query64	5019	4176	4093	4093
query65	3064	3022	3051	3022
query66	822	392	370	370
query67	15567	14758	14859	14758
query68	7190	523	528	523
query69	626	379	381	379
query70	1276	1160	1098	1098
query71	515	269	272	269
query72	6502	2524	2334	2334
query73	730	314	318	314
query74	8269	6437	6485	6437
query75	3504	2251	2211	2211
query76	4965	837	860	837
query77	603	254	257	254
query78	11054	10167	10157	10157
query79	13059	535	519	519
query80	2286	366	367	366
query81	544	213	216	213
query82	841	87	87	87
query83	200	138	148	138
query84	289	77	81	77
query85	1521	325	305	305
query86	459	316	315	315
query87	3719	3528	3524	3524
query88	5474	2311	2291	2291
query89	514	362	380	362
query90	1978	172	171	171
query91	168	142	133	133
query92	60	46	46	46
query93	7379	489	492	489
query94	1133	176	172	172
query95	401	312	306	306
query96	617	265	267	265
query97	2652	2472	2483	2472
query98	231	208	228	208
query99	1150	834	841	834
Total cold run time: 315379 ms
Total hot run time: 182919 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit eff0d28a3c66b987fc488ee49c7ad8a54518dc8a with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       14.0 seconds inserted 10000000 Rows, about 714K ops/s

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 28, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit 473a326 into apache:master Mar 29, 2024
27 of 30 checks passed
Labels
approved Indicates a PR has been approved by one committer. reviewed
4 participants