[improvement] add a lower bound for bytes in scanner queue #28912

Closed
dataroaring wants to merge 10 commits

dataroaring
Contributor

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@dataroaring
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.53% (8539/23373)
Line Coverage: 28.61% (69404/242584)
Region Coverage: 27.62% (35906/129980)
Branch Coverage: 24.36% (18355/75360)
Coverage Report: http://coverage.selectdb-in.cc/coverage/6e2f7e705123397c3fed6753abb68901dfb4c835_6e2f7e705123397c3fed6753abb68901dfb4c835/report/index.html

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@dataroaring
Contributor Author

run buildall

@dataroaring
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.59% (8549/23364)
Line Coverage: 28.66% (69510/242505)
Region Coverage: 27.66% (35945/129952)
Branch Coverage: 24.40% (18376/75320)
Coverage Report: http://coverage.selectdb-in.cc/coverage/bcf21a69ffe78a7c2454e2c7ed078fe27b3de9b9_bcf21a69ffe78a7c2454e2c7ed078fe27b3de9b9/report/index.html

@dataroaring
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.59% (8548/23364)
Line Coverage: 28.66% (69502/242505)
Region Coverage: 27.66% (35940/129952)
Branch Coverage: 24.39% (18371/75320)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d06f492f37e2aa3aaffb3dd8fc7abfb745671b48_d06f492f37e2aa3aaffb3dd8fc7abfb745671b48/report/index.html

@dataroaring
Contributor Author

run buildall

Contributor

@github-actions (bot) left a comment

clang-tidy made some suggestions

@@ -16,6 +16,7 @@
// under the License.

#pragma once
#include <bvar/bvar.h>
warning: 'bvar/bvar.h' file not found [clang-diagnostic-error]

#include <bvar/bvar.h>
         ^

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.60% (8551/23364)
Line Coverage: 28.68% (69548/242505)
Region Coverage: 27.67% (35963/129952)
Branch Coverage: 24.41% (18383/75320)
Coverage Report: http://coverage.selectdb-in.cc/coverage/c7ea8d4a861012c14faed9615c9b77f9eb2582eb_c7ea8d4a861012c14faed9615c9b77f9eb2582eb/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit c7ea8d4a861012c14faed9615c9b77f9eb2582eb, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4760	4321	4471	4321
q2	374	178	158	158
q3	1425	1182	1207	1182
q4	1073	862	760	760
q5	3215	3121	3185	3121
q6	228	130	136	130
q7	948	466	474	466
q8	2145	2213	2177	2177
q9	6608	6571	6555	6555
q10	3192	3115	3077	3077
q11	297	180	182	180
q12	358	205	205	205
q13	4533	3766	3771	3766
q14	234	211	211	211
q15	550	511	508	508
q16	432	405	409	405
q17	995	682	529	529
q18	6390	6030	6382	6030
q19	1560	1374	1437	1374
q20	521	335	307	307
q21	2935	2505	2465	2465
q22	346	274	284	274
Total cold run time: 43119 ms
Total hot run time: 38201 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4323	4353	4217	4217
q2	303	228	230	228
q3	3224	2995	3018	2995
q4	2100	1902	1889	1889
q5	5291	5250	5253	5250
q6	240	122	122	122
q7	2209	1777	1807	1777
q8	3317	3426	3404	3404
q9	8584	8565	8468	8468
q10	3866	3765	3785	3765
q11	545	433	422	422
q12	743	633	581	581
q13	4319	3596	3521	3521
q14	286	266	264	264
q15	560	513	509	509
q16	524	501	486	486
q17	1852	1666	1674	1666
q18	7758	7404	8002	7404
q19	1785	1729	1714	1714
q20	2257	1981	1971	1971
q21	5179	5172	5002	5002
q22	550	431	439	431
Total cold run time: 59815 ms
Total hot run time: 56086 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.68 seconds
stream load tsv: 579 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17183644549 Bytes

Member

@airborne12 left a comment

LGTM

Contributor

PR approved by anyone and no changes requested.

@dataroaring
Contributor Author

run buildall

@dataroaring
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.57% (8552/23386)
Line Coverage: 28.65% (69550/242781)
Region Coverage: 27.64% (35961/130088)
Branch Coverage: 24.38% (18383/75408)
Coverage Report: http://coverage.selectdb-in.cc/coverage/636a1f9b3f1fb5f8ad30d7be779587cddfcaae98_636a1f9b3f1fb5f8ad30d7be779587cddfcaae98/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 636a1f9b3f1fb5f8ad30d7be779587cddfcaae98, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4606	4283	4343	4283
q2	380	174	158	158
q3	1437	1247	1198	1198
q4	1075	820	809	809
q5	3220	3234	3137	3137
q6	234	134	131	131
q7	945	467	471	467
q8	2209	2227	2189	2189
q9	6645	6577	6623	6577
q10	3162	3140	3120	3120
q11	304	185	189	185
q12	365	218	218	218
q13	4546	3758	3755	3755
q14	238	202	215	202
q15	552	512	514	512
q16	424	417	388	388
q17	1034	606	529	529
q18	6388	6129	6525	6129
q19	1556	1358	1468	1358
q20	509	315	319	315
q21	3004	2543	2567	2543
q22	350	294	281	281
Total cold run time: 43183 ms
Total hot run time: 38484 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4300	4277	4341	4277
q2	324	219	229	219
q3	3228	3082	3023	3023
q4	2087	1912	1907	1907
q5	5228	5284	5237	5237
q6	234	124	124	124
q7	2232	1834	1852	1834
q8	3331	3442	3430	3430
q9	8611	8604	8462	8462
q10	3861	3800	3775	3775
q11	528	422	414	414
q12	743	607	594	594
q13	4315	3549	3547	3547
q14	291	268	270	268
q15	570	510	505	505
q16	507	490	498	490
q17	1847	1692	1639	1639
q18	7587	7590	8609	7590
q19	1830	1714	1729	1714
q20	2211	1981	1964	1964
q21	5211	5034	5089	5034
q22	544	451	466	451
Total cold run time: 59620 ms
Total hot run time: 56498 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.8 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.2 seconds inserted 10000000 Rows, about 342K ops/s
storage size: 17183844499 Bytes

@dataroaring
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit 8a775aef7129fe8d7e6aacfb9c2479e984f9c8b6, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17654	5557	4869	4869
q2	2035	159	142	142
q3	10585	1118	1136	1118
q4	10214	799	760	760
q5	7778	2901	2829	2829
q6	222	134	133	133
q7	946	561	505	505
q8	9286	2041	2033	2033
q9	6795	6383	6368	6368
q10	8194	2972	2982	2972
q11	418	233	220	220
q12	388	234	233	233
q13	18004	3578	3637	3578
q14	243	215	215	215
q15	540	503	505	503
q16	454	388	405	388
q17	979	494	482	482
q18	6867	6224	6000	6000
q19	1607	1445	1399	1399
q20	685	356	328	328
q21	2747	2334	2425	2334
q22	355	319	330	319
Total cold run time: 106996 ms
Total hot run time: 37728 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4866	4732	4811	4732
q2	350	239	258	239
q3	3041	2779	2720	2720
q4	1826	1633	1647	1633
q5	5271	5330	5234	5234
q6	215	123	127	123
q7	2147	1885	1795	1795
q8	3311	3408	3405	3405
q9	8418	8439	8381	8381
q10	3646	3456	3534	3456
q11	598	488	483	483
q12	773	673	678	673
q13	16875	3177	3186	3177
q14	292	253	283	253
q15	557	499	505	499
q16	552	499	502	499
q17	1974	1808	1752	1752
q18	8568	8260	10178	8260
q19	10930	1638	1602	1602
q20	2187	1937	1917	1917
q21	7492	4517	4741	4517
q22	586	476	488	476
Total cold run time: 84475 ms
Total hot run time: 55826 ms

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit 8a775aef7129fe8d7e6aacfb9c2479e984f9c8b6, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5086	4856	4740	4740
q2	395	171	159	159
q3	1468	1193	1105	1105
q4	1074	792	798	792
q5	3032	2854	2941	2854
q6	227	141	134	134
q7	991	496	504	496
q8	2168	2220	2232	2220
q9	6600	6531	6589	6531
q10	3179	3039	3053	3039
q11	341	216	215	215
q12	380	231	235	231
q13	4298	3619	3581	3581
q14	252	218	215	215
q15	567	521	518	518
q16	464	445	382	382
q17	1034	587	590	587
q18	6541	6258	6078	6078
q19	1681	1475	1468	1468
q20	703	341	348	341
q21	2868	2358	2344	2344
q22	403	325	332	325
Total cold run time: 43752 ms
Total hot run time: 38355 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4848	4865	4782	4782
q2	341	254	232	232
q3	3081	2837	2792	2792
q4	1839	1626	1602	1602
q5	5428	5490	5382	5382
q6	224	124	128	124
q7	2197	1817	1804	1804
q8	3497	3557	3617	3557
q9	8559	8551	8559	8551
q10	3725	3571	3516	3516
q11	611	501	509	501
q12	757	666	599	599
q13	3777	3208	3203	3203
q14	290	256	269	256
q15	573	512	510	510
q16	569	520	489	489
q17	2076	1819	1874	1819
q18	8706	8395	12186	8395
q19	20353	1829	1708	1708
q20	2819	1978	1943	1943
q21	12997	4658	4693	4658
q22	892	507	498	498
Total cold run time: 88159 ms
Total hot run time: 56921 ms

@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit 8a775aef7129fe8d7e6aacfb9c2479e984f9c8b6, data reload: false

run tpcds-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	924	364	350	350
query2	6462	2010	1922	1922
query3	6645	209	209	209
query4	29188	22298	22271	22271
query5	3882	538	532	532
query6	265	190	176	176
query7	4594	277	273	273
query8	250	220	220	220
query9	8278	2292	2259	2259
query10	435	260	252	252
query11	16330	15718	15640	15640
query12	125	76	77	76
query13	1612	334	319	319
query14	9960	6582	6662	6582
query15	212	192	182	182
query16	6225	274	264	264
query17	1820	498	492	492
query18	1808	279	260	260
query19	171	140	139	139
query20	83	80	76	76
query21	181	101	95	95
query22	4895	4539	4565	4539
query23	32103	30783	30502	30502
query24	7685	2780	2721	2721
query25	584	344	348	344
query26	859	149	148	148
query27	2621	270	284	270
query28	5448	1916	1900	1900
query29	842	411	387	387
query30	284	147	152	147
query31	947	741	746	741
query32	86	57	58	57
query33	502	281	253	253
query34	857	464	458	458
query35	872	820	825	820
query36	1403	1359	1278	1278
query37	107	66	73	66
query38	3270	3249	3159	3159
query39	1332	1279	1290	1279
query40	186	103	88	88
query41	39	36	35	35
query42	91	83	89	83
query43	564	533	513	513
query44	1059	716	712	712
query45	195	183	188	183
query46	1070	654	630	630
query47	1646	1508	1531	1508
query48	345	252	255	252
query49	1079	323	315	315
query50	811	337	329	329
query51	5418	5303	5316	5303
query52	91	88	74	74
query53	228	156	146	146
query54	987	563	557	557
query55	98	87	98	87
query56	194	193	192	192
query57	1092	936	937	936
query58	229	200	208	200
query59	2550	2490	2473	2473
query60	259	231	228	228
query61	86	86	85	85
query62	674	463	449	449
query63	175	163	145	145
query64	4214	1806	1717	1717
query65	3313	3229	3224	3224
query66	1140	345	348	345
query67	15722	15118	15111	15111
query68	10963	544	550	544
query69	503	254	259	254
query70	1877	1578	1624	1578
query71	490	216	214	214
query72	5403	3569	3579	3569
query73	2446	313	312	312
query74	6799	6311	6251	6251
query75	5023	2338	2293	2293
query76	6407	978	1097	978
query77	701	249	255	249
query78	9212	8889	8587	8587
query79	3390	519	522	519
query80	770	360	360	360
query81	477	213	212	212
query82	208	105	97	97
query83	165	141	144	141
query84	246	54	53	53
query85	989	275	276	275
query86	410	395	381	381
query87	3423	3283	3216	3216
query88	3412	2287	2278	2278
query89	362	254	262	254
query90	1946	196	203	196
query91	118	90	93	90
query92	63	52	58	52
query93	3533	508	434	434
query94	884	187	185	185
query95	483	427	420	420
query96	665	320	332	320
query97	4225	4110	4124	4110
query98	222	207	194	194
query99	1071	810	865	810
Total cold run time: 284588 ms
Total hot run time: 177004 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.21 seconds
stream load tsv: 576 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.4 seconds inserted 10000000 Rows, about 352K ops/s
storage size: 17188300213 Bytes

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.64% (8616/23518)
Line Coverage: 28.68% (70022/244154)
Region Coverage: 27.66% (36249/131041)
Branch Coverage: 24.36% (18512/76002)
Coverage Report: http://coverage.selectdb-in.cc/coverage/8a775aef7129fe8d7e6aacfb9c2479e984f9c8b6_8a775aef7129fe8d7e6aacfb9c2479e984f9c8b6/report/index.html

@dataroaring closed this on Jan 6, 2024
3 participants