[feature](workloadgroup) use slot num to control memory distribution among one workload group #38237

Open: wants to merge 8 commits into base: master

@yiguolei (Contributor) commented Jul 23, 2024

Proposed changes

  1. Use wg_mem_limit / total slot count as the default per-query memory limit.
  2. Users can set query_slot_count to adjust the memory limit for a specific query.
  3. Users can also set enable_query_hard_limit to apply a hard limit to a specific query, although we do not expect users to change this setting.
  4. The total slot count equals max_concurrency. For the reference design, see https://docs.aws.amazon.com/redshift/latest/dg/r_wlm_query_slot_count.html
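
The arithmetic behind items 1 and 2 can be sketched as follows. This is a minimal illustration and not the code in this PR: the function name `query_mem_limit` is hypothetical, and it assumes that a query holding N slots receives N times the per-slot share, which is how the referenced Redshift setting behaves.

```python
GIB = 1024 ** 3


def query_mem_limit(wg_mem_limit: int, total_slots: int, query_slot_count: int = 1) -> int:
    """Return the memory limit in bytes for one query in a workload group.

    Assumption (not confirmed by the PR text): a query that holds N slots
    gets N times the per-slot share of the workload group's memory limit.
    """
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    if not 1 <= query_slot_count <= total_slots:
        raise ValueError("query_slot_count must be between 1 and total_slots")
    # Default share: wg_mem_limit / total slot count (item 1).
    per_slot = wg_mem_limit // total_slots
    # A query may claim more than one slot (item 2).
    return per_slot * query_slot_count


if __name__ == "__main__":
    wg_limit = 64 * GIB
    print(query_mem_limit(wg_limit, total_slots=8) // GIB)                      # 8
    print(query_mem_limit(wg_limit, total_slots=8, query_slot_count=2) // GIB)  # 16
```

With a 64 GiB workload group and 8 slots (equal to max_concurrency per item 4), the default limit would be 8 GiB per query under this assumption, and a query that asks for 2 slots would get 16 GiB.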

@doris-robot

Thank you for your contribution to Apache Doris.
Not sure what to do next? See "How to process your PR".

Since 2024-03-18, the documentation has been moved to doris-website.
See the Doris documentation.

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@yiguolei
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39669 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 162494f503be010ea2e2a5632f7f030460800d4b, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	18631	4462	4275	4275
q2	2024	190	182	182
q3	10536	1218	1048	1048
q4	10247	777	758	758
q5	7631	2737	2705	2705
q6	220	140	136	136
q7	979	599	586	586
q8	9213	2142	2096	2096
q9	8772	6579	6578	6578
q10	8675	3769	3781	3769
q11	449	237	234	234
q12	387	218	218	218
q13	18825	2985	2981	2981
q14	277	240	247	240
q15	531	479	477	477
q16	503	395	374	374
q17	983	665	629	629
q18	8238	7454	7394	7394
q19	7241	1357	1259	1259
q20	710	327	303	303
q21	4936	3174	3143	3143
q22	349	284	288	284
Total cold run time: 120357 ms
Total hot run time: 39669 ms
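
The two totals appear to follow directly from the per-query rows: summing the first numeric column reproduces the cold total, and summing the smaller of the two hot runs (the last column) reproduces the hot total. A minimal sketch of that arithmetic, assuming whitespace-separated rows; the `summarize` helper is hypothetical and not part of the Doris benchmark scripts:

```python
def summarize(rows: str) -> tuple[int, int]:
    """Return (cold_total_ms, hot_total_ms) for whitespace-separated result rows.

    Each row is: query name, cold run, hot run 1, hot run 2, then optionally
    the minimum of the two hot runs (ignored here and recomputed instead).
    """
    cold_total = 0
    hot_total = 0
    for line in rows.strip().splitlines():
        _name, cold, hot1, hot2, *_rest = line.split()
        cold_total += int(cold)
        hot_total += min(int(hot1), int(hot2))
    return cold_total, hot_total


# Three rows copied from Round 1 above, used only as sample input.
SAMPLE = """
q1 18631 4462 4275 4275
q2 2024 190 182 182
q6 220 140 136 136
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))  # (20875, 4593)
```

Run over all 22 rows of Round 1, the same computation gives 120357 ms cold and 39669 ms hot, matching the totals reported by the bot.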

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	4393	4240	4219	4219
q2	381	265	258	258
q3	3053	2903	2965	2903
q4	2001	1766	1725	1725
q5	5540	5561	5458	5458
q6	230	133	136	133
q7	2195	1818	1854	1818
q8	3301	3477	3410	3410
q9	8795	8866	8797	8797
q10	4135	3781	3904	3781
q11	623	477	486	477
q12	795	632	640	632
q13	16971	3178	3222	3178
q14	317	311	293	293
q15	529	479	489	479
q16	504	431	435	431
q17	1832	1528	1486	1486
q18	8194	7850	7897	7850
q19	1765	1657	1525	1525
q20	2164	1877	1863	1863
q21	5202	4757	4778	4757
q22	600	513	515	513
Total cold run time: 73520 ms
Total hot run time: 55986 ms

@doris-robot

TPC-DS: Total hot run time: 175387 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 162494f503be010ea2e2a5632f7f030460800d4b, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	918	380	379	379
query2	6441	1874	1716	1716
query3	6625	219	221	219
query4	28392	17749	17372	17372
query5	3571	507	518	507
query6	271	167	178	167
query7	4581	300	285	285
query8	239	197	200	197
query9	8489	2463	2397	2397
query10	440	295	277	277
query11	10720	10000	9981	9981
query12	120	88	82	82
query13	1655	380	383	380
query14	10290	8753	7948	7948
query15	232	171	166	166
query16	7182	451	482	451
query17	1589	596	533	533
query18	1639	291	283	283
query19	201	146	153	146
query20	90	90	84	84
query21	204	142	132	132
query22	4297	3971	3890	3890
query23	34192	33847	33602	33602
query24	10965	3006	2883	2883
query25	607	395	395	395
query26	1036	158	153	153
query27	2688	284	290	284
query28	7375	2089	2091	2089
query29	883	659	679	659
query30	255	159	152	152
query31	962	777	788	777
query32	97	53	57	53
query33	779	358	336	336
query34	903	501	534	501
query35	876	740	778	740
query36	1133	998	970	970
query37	150	85	82	82
query38	2992	2864	2813	2813
query39	915	874	826	826
query40	207	126	127	126
query41	54	47	49	47
query42	121	100	111	100
query43	477	472	474	472
query44	1207	740	756	740
query45	197	164	165	164
query46	1084	730	737	730
query47	1882	1784	1781	1781
query48	390	296	298	296
query49	864	430	449	430
query50	785	402	405	402
query51	6796	6726	6684	6684
query52	118	94	98	94
query53	359	305	300	300
query54	897	470	464	464
query55	76	78	78	78
query56	315	294	297	294
query57	1132	1042	1049	1042
query58	270	264	277	264
query59	2932	2526	2678	2526
query60	338	301	298	298
query61	121	117	114	114
query62	796	663	664	663
query63	321	304	297	297
query64	9671	2342	1791	1791
query65	3158	3101	3106	3101
query66	766	346	362	346
query67	15560	15177	14994	14994
query68	5987	567	560	560
query69	717	482	377	377
query70	1128	1097	1103	1097
query71	483	292	292	292
query72	9109	6020	5614	5614
query73	784	331	335	331
query74	6094	5749	5677	5677
query75	4065	2712	2713	2712
query76	3782	944	923	923
query77	680	314	307	307
query78	10373	10600	9643	9643
query79	10438	548	536	536
query80	1771	475	483	475
query81	595	224	218	218
query82	499	132	137	132
query83	297	169	169	169
query84	281	89	85	85
query85	724	320	304	304
query86	467	302	298	298
query87	3291	3097	3143	3097
query88	5339	2399	2432	2399
query89	483	375	388	375
query90	2017	198	194	194
query91	130	102	102	102
query92	69	49	49	49
query93	3073	529	536	529
query94	1281	298	289	289
query95	408	316	321	316
query96	610	269	279	269
query97	3226	3009	3024	3009
query98	235	208	193	193
query99	1545	1271	1285	1271
Total cold run time: 295869 ms
Total hot run time: 175387 ms

@doris-robot

ClickBench: Total hot run time: 31.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 162494f503be010ea2e2a5632f7f030460800d4b, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.09	0.04	0.04
query3	0.23	0.05	0.06
query4	1.67	0.08	0.08
query5	0.50	0.48	0.47
query6	1.14	0.72	0.72
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.55	0.48	0.51
query10	0.55	0.54	0.54
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.58	0.59	0.59
query14	0.77	0.78	0.77
query15	0.86	0.82	0.82
query16	0.36	0.35	0.37
query17	1.04	1.01	0.97
query18	0.22	0.21	0.22
query19	1.81	1.70	1.72
query20	0.01	0.01	0.01
query21	15.42	0.75	0.64
query22	4.15	6.48	2.65
query23	18.35	1.42	1.31
query24	2.10	0.24	0.22
query25	0.15	0.09	0.09
query26	0.30	0.22	0.21
query27	0.45	0.23	0.24
query28	13.24	1.02	1.01
query29	12.63	3.32	3.30
query30	0.25	0.06	0.06
query31	2.87	0.39	0.39
query32	3.29	0.46	0.47
query33	2.91	2.94	2.90
query34	17.10	4.37	4.36
query35	4.41	4.44	4.43
query36	0.65	0.45	0.48
query37	0.19	0.16	0.15
query38	0.16	0.15	0.14
query39	0.04	0.03	0.04
query40	0.15	0.12	0.13
query41	0.09	0.06	0.05
query42	0.06	0.05	0.05
query43	0.05	0.04	0.03
Total cold run time: 109.81 s
Total hot run time: 31.39 s

@yiguolei
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 40076 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 94fe659308739e065aa346c1b461ad59cc8e8b36, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	19371	5617	4312	4312
q2	2024	191	199	191
q3	10546	1161	1122	1122
q4	10245	850	834	834
q5	7584	2813	2720	2720
q6	225	138	138	138
q7	971	605	617	605
q8	9219	2117	2075	2075
q9	8742	6582	6541	6541
q10	8749	3802	3766	3766
q11	520	237	251	237
q12	401	231	227	227
q13	18834	2941	2988	2941
q14	281	233	244	233
q15	523	489	496	489
q16	494	407	406	406
q17	985	673	689	673
q18	8127	7493	7475	7475
q19	7048	1447	1299	1299
q20	680	324	330	324
q21	4960	3182	3297	3182
q22	360	287	286	286
Total cold run time: 120889 ms
Total hot run time: 40076 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	4395	4247	4265	4247
q2	388	287	285	285
q3	3183	2923	2895	2895
q4	2041	1684	1661	1661
q5	5590	5505	5580	5505
q6	231	137	136	136
q7	2229	1824	1901	1824
q8	3277	3444	3423	3423
q9	8866	8811	8791	8791
q10	3973	3895	3939	3895
q11	595	519	502	502
q12	819	649	623	623
q13	16322	3215	3152	3152
q14	305	297	281	281
q15	521	495	481	481
q16	512	452	444	444
q17	1852	1540	1532	1532
q18	8230	7907	7802	7802
q19	1763	1545	1641	1545
q20	2142	1879	1880	1879
q21	9952	4678	4825	4678
q22	584	498	492	492
Total cold run time: 77770 ms
Total hot run time: 56073 ms

@doris-robot

TPC-DS: Total hot run time: 173897 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 94fe659308739e065aa346c1b461ad59cc8e8b36, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	917	374	367	367
query2	6419	1848	1807	1807
query3	6657	204	219	204
query4	28616	17354	17308	17308
query5	3752	480	483	480
query6	283	166	160	160
query7	4581	294	285	285
query8	231	190	190	190
query9	8638	2404	2381	2381
query10	418	274	278	274
query11	12501	10099	10053	10053
query12	115	88	85	85
query13	1644	373	357	357
query14	10170	7515	8319	7515
query15	220	170	164	164
query16	7404	468	471	468
query17	1140	576	518	518
query18	1821	276	279	276
query19	195	147	154	147
query20	91	82	85	82
query21	204	124	123	123
query22	4347	3995	4155	3995
query23	34024	33744	33785	33744
query24	11032	2945	2906	2906
query25	588	382	418	382
query26	706	152	153	152
query27	2412	280	283	280
query28	6075	2088	2085	2085
query29	912	628	663	628
query30	259	160	150	150
query31	950	782	736	736
query32	95	53	58	53
query33	664	335	346	335
query34	909	511	513	511
query35	922	764	788	764
query36	1162	1007	978	978
query37	149	88	87	87
query38	3016	2913	2789	2789
query39	902	885	856	856
query40	210	121	127	121
query41	47	44	45	44
query42	115	104	101	101
query43	507	457	473	457
query44	1083	730	735	730
query45	191	165	163	163
query46	1101	715	733	715
query47	1825	1770	1787	1770
query48	379	296	292	292
query49	856	444	428	428
query50	788	396	409	396
query51	6859	6683	6498	6498
query52	111	93	99	93
query53	365	300	300	300
query54	888	457	459	457
query55	82	75	76	75
query56	309	295	290	290
query57	1133	1062	1061	1061
query58	264	260	278	260
query59	2811	2525	2764	2525
query60	322	300	307	300
query61	121	119	114	114
query62	792	646	669	646
query63	327	292	294	292
query64	9251	2314	1708	1708
query65	3204	3150	3124	3124
query66	777	337	341	337
query67	15632	14915	14879	14879
query68	6269	562	566	562
query69	764	467	384	384
query70	1138	1104	1165	1104
query71	533	296	288	288
query72	9040	5510	5768	5510
query73	829	329	324	324
query74	6079	5714	5768	5714
query75	4733	2670	2719	2670
query76	4373	930	976	930
query77	758	308	311	308
query78	9725	9176	8997	8997
query79	10528	541	525	525
query80	1085	501	475	475
query81	576	223	218	218
query82	850	171	130	130
query83	329	166	167	166
query84	279	86	86	86
query85	1398	317	301	301
query86	452	301	327	301
query87	3299	3144	3127	3127
query88	5001	2379	2388	2379
query89	550	389	390	389
query90	1994	192	195	192
query91	130	100	105	100
query92	71	53	49	49
query93	7160	523	518	518
query94	1286	287	321	287
query95	413	316	329	316
query96	620	283	275	275
query97	3241	2964	3026	2964
query98	220	202	206	202
query99	1579	1255	1283	1255
Total cold run time: 300793 ms
Total hot run time: 173897 ms

@doris-robot

ClickBench: Total hot run time: 30.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 94fe659308739e065aa346c1b461ad59cc8e8b36, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.07	0.04	0.04
query3	0.22	0.05	0.04
query4	1.67	0.07	0.09
query5	0.49	0.49	0.47
query6	1.13	0.74	0.72
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.56	0.50	0.51
query10	0.54	0.55	0.56
query11	0.14	0.11	0.12
query12	0.15	0.13	0.13
query13	0.59	0.58	0.57
query14	0.76	0.79	0.77
query15	0.85	0.83	0.80
query16	0.37	0.36	0.37
query17	1.04	0.97	0.98
query18	0.24	0.22	0.21
query19	1.77	1.68	1.71
query20	0.01	0.01	0.01
query21	15.40	0.78	0.65
query22	3.18	7.85	2.06
query23	18.29	1.38	1.29
query24	2.08	0.23	0.23
query25	0.17	0.09	0.09
query26	0.28	0.20	0.21
query27	0.45	0.23	0.23
query28	13.26	1.02	1.02
query29	12.62	3.34	3.31
query30	0.26	0.06	0.06
query31	2.86	0.41	0.39
query32	3.27	0.47	0.46
query33	2.89	2.96	2.91
query34	16.99	4.39	4.36
query35	4.41	4.41	4.43
query36	0.64	0.46	0.47
query37	0.20	0.15	0.17
query38	0.15	0.15	0.15
query39	0.04	0.04	0.03
query40	0.14	0.12	0.13
query41	0.10	0.05	0.06
query42	0.06	0.05	0.06
query43	0.06	0.05	0.04
Total cold run time: 108.51 s
Total hot run time: 30.8 s
