[opt](cloud) Retry when gRPC to MS throws Status.Code.UNAVAILABLE (shutdown) #41181

Merged: 1 commit merged into apache:master on Sep 25, 2024


Collaborator

@gavinchou gavinchou commented Sep 23, 2024

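Issuing an RPC on a channel that has already been shut down throws StatusRuntimeException with Status.Code.UNAVAILABLE. This can happen through the following race between two threads:

thread1 gets the client (not expired) -> thread2 gets the client (expired) -> thread2 shuts down the client -> thread1 issues an RPC (fails with "shutdown")

This change retries the RPC when that happens.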
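
The race and the retry can be illustrated with a minimal, dependency-free sketch. This is not the code from this PR: `Client`, `StatusException`, `Code`, and `callWithRetry` are hypothetical stand-ins (for the cached client, `io.grpc.StatusRuntimeException`, and `io.grpc.Status.Code` respectively), and the retry limit is an assumed parameter. The idea shown is only that a call failing with UNAVAILABLE is retried with a freshly obtained client, while other errors propagate unchanged.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RetryOnShutdownSketch {
    /** Stand-in for io.grpc.Status.Code; only the values needed here. */
    enum Code { UNAVAILABLE, INTERNAL }

    /** Stand-in for io.grpc.StatusRuntimeException (hypothetical, no gRPC dependency). */
    static class StatusException extends RuntimeException {
        final Code code;

        StatusException(Code code, String message) {
            super(message);
            this.code = code;
        }
    }

    /** Hypothetical client: once shut down, every call fails with UNAVAILABLE. */
    static class Client {
        private volatile boolean shutdown;

        void shutdown() {
            shutdown = true;
        }

        String call(String request) {
            if (shutdown) {
                throw new StatusException(Code.UNAVAILABLE, "Channel shutdown invoked");
            }
            return "ok:" + request;
        }
    }

    /**
     * Fetches a client for every attempt, so a client that another thread
     * shut down in the meantime is replaced on the next attempt.
     */
    static String callWithRetry(Supplier<Client> clientSource, String request, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        StatusException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Client client = clientSource.get();
            try {
                return client.call(request);
            } catch (StatusException e) {
                if (e.code != Code.UNAVAILABLE) {
                    throw e; // only the shutdown case is retried
                }
                last = e;
            }
        }
        throw last; // every attempt hit UNAVAILABLE
    }

    public static void main(String[] args) {
        // Simulate the race deterministically: the first lookup returns a
        // client that "thread2" has already shut down.
        Client stale = new Client();
        stale.shutdown();
        AtomicInteger lookups = new AtomicInteger();
        Supplier<Client> source = () -> lookups.getAndIncrement() == 0 ? stale : new Client();

        System.out.println(callWithRetry(source, "getVersion", 3)); // prints "ok:getVersion"
    }
}
```

Fetching the client inside the loop, rather than once before it, is what lets the second attempt see a replacement client instead of the one that was shut down.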

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@gavinchou
Collaborator Author

run buildall

@doris-robot

TPC-H: Total hot run time: 42031 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 681b378a8103da78a5837ce06772bfbe23d7e686, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	18038	7504	7363	7363
q2	2022	287	292	287
q3	12279	1139	1249	1139
q4	10571	734	704	704
q5	7742	3138	3130	3130
q6	244	152	151	151
q7	1024	617	628	617
q8	10614	2062	2049	2049
q9	6931	6569	6504	6504
q10	7172	2272	2311	2272
q11	428	254	259	254
q12	428	224	224	224
q13	18298	3085	3058	3058
q14	244	216	218	216
q15	579	544	527	527
q16	962	613	626	613
q17	1352	807	815	807
q18	7375	6767	6652	6652
q19	1407	977	935	935
q20	600	299	280	280
q21	4024	3282	3274	3274
q22	1079	1004	975	975
Total cold run time: 113413 ms
Total hot run time: 42031 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	7794	7259	7332	7259
q2	339	240	238	238
q3	3189	2931	3046	2931
q4	2072	1756	1833	1756
q5	5623	5669	5671	5669
q6	229	142	143	142
q7	2174	1848	1827	1827
q8	3314	3468	3442	3442
q9	8893	8749	8702	8702
q10	3501	3506	3478	3478
q11	621	493	496	493
q12	826	651	659	651
q13	14103	3215	3207	3207
q14	323	272	286	272
q15	575	517	513	513
q16	750	681	692	681
q17	1838	1606	1585	1585
q18	8136	7839	7884	7839
q19	1747	1617	1577	1577
q20	2126	1900	1863	1863
q21	5448	5430	5352	5352
q22	1157	1036	1029	1029
Total cold run time: 74778 ms
Total hot run time: 60506 ms

@doris-robot

TPC-DS: Total hot run time: 191148 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 681b378a8103da78a5837ce06772bfbe23d7e686, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	958	387	401	387
query2	6359	2006	2037	2006
query3	8694	194	199	194
query4	34114	23757	23508	23508
query5	3424	477	473	473
query6	273	169	165	165
query7	4197	291	295	291
query8	281	213	214	213
query9	9516	2661	2638	2638
query10	485	272	276	272
query11	17986	15188	15283	15188
query12	151	94	94	94
query13	1538	426	409	409
query14	9820	6691	7554	6691
query15	251	170	171	170
query16	8079	442	506	442
query17	1738	600	611	600
query18	2197	312	346	312
query19	360	153	150	150
query20	124	110	117	110
query21	219	108	109	108
query22	4901	4528	4412	4412
query23	34958	34508	34467	34467
query24	11323	2966	2892	2892
query25	634	405	400	400
query26	1141	162	160	160
query27	2358	283	295	283
query28	7838	2430	2447	2430
query29	839	436	425	425
query30	259	156	157	156
query31	1028	782	797	782
query32	94	51	56	51
query33	769	308	292	292
query34	943	495	494	494
query35	884	725	725	725
query36	1088	935	961	935
query37	170	94	90	90
query38	4140	3885	3948	3885
query39	1482	1476	1405	1405
query40	205	97	99	97
query41	52	50	52	50
query42	119	94	97	94
query43	518	476	467	467
query44	1299	832	806	806
query45	196	163	165	163
query46	1152	784	774	774
query47	1878	1828	1863	1828
query48	474	369	363	363
query49	899	410	421	410
query50	842	400	411	400
query51	7002	6786	6965	6786
query52	99	87	88	87
query53	260	178	202	178
query54	1164	451	465	451
query55	78	79	80	79
query56	280	261	258	258
query57	1203	1100	1089	1089
query58	228	235	245	235
query59	3170	2888	2817	2817
query60	295	277	272	272
query61	110	107	106	106
query62	818	656	650	650
query63	220	184	177	177
query64	3932	652	620	620
query65	3216	3224	3183	3183
query66	770	294	298	294
query67	15723	15694	15635	15635
query68	4452	589	580	580
query69	566	297	296	296
query70	1159	1152	1111	1111
query71	407	280	286	280
query72	7419	4129	4120	4120
query73	765	337	332	332
query74	10521	9049	9035	9035
query75	3882	2659	2694	2659
query76	3392	950	926	926
query77	573	286	285	285
query78	10033	9103	9475	9103
query79	1520	555	537	537
query80	1233	443	446	443
query81	597	243	245	243
query82	616	145	138	138
query83	296	140	133	133
query84	271	76	79	76
query85	1653	287	290	287
query86	434	301	296	296
query87	4495	4351	4274	4274
query88	3422	2337	2333	2333
query89	394	278	287	278
query90	2139	190	188	188
query91	175	145	141	141
query92	66	49	49	49
query93	1555	544	543	543
query94	1220	292	285	285
query95	350	255	254	254
query96	620	278	279	278
query97	3221	3111	3124	3111
query98	221	193	190	190
query99	1502	1270	1297	1270
Total cold run time: 300984 ms
Total hot run time: 191148 ms

@doris-robot

ClickBench: Total hot run time: 33.09 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 681b378a8103da78a5837ce06772bfbe23d7e686, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.04	0.04
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.65	0.10	0.09
query5	0.50	0.49	0.51
query6	1.16	0.73	0.72
query7	0.02	0.01	0.02
query8	0.04	0.04	0.03
query9	0.56	0.51	0.48
query10	0.55	0.57	0.54
query11	0.14	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.58	0.58
query14	2.94	2.94	3.04
query15	0.89	0.81	0.83
query16	0.41	0.38	0.38
query17	1.07	1.00	1.00
query18	0.19	0.19	0.19
query19	1.90	1.88	2.04
query20	0.01	0.02	0.01
query21	15.37	0.59	0.56
query22	2.75	2.21	2.32
query23	17.22	0.78	0.82
query24	2.44	0.95	1.10
query25	0.18	0.13	0.13
query26	0.39	0.14	0.14
query27	0.03	0.04	0.04
query28	11.01	1.11	1.06
query29	12.56	3.27	3.22
query30	0.25	0.06	0.06
query31	2.87	0.38	0.37
query32	3.29	0.47	0.46
query33	2.96	3.06	3.02
query34	17.21	4.41	4.41
query35	4.42	4.40	4.47
query36	0.66	0.49	0.47
query37	0.09	0.06	0.06
query38	0.05	0.03	0.04
query39	0.03	0.02	0.03
query40	0.16	0.13	0.13
query41	0.08	0.02	0.03
query42	0.03	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.21 s
Total hot run time: 33.09 s

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 24, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

Contributor

@dataroaring dataroaring left a comment


LGTM

@gavinchou gavinchou merged commit 47c5eeb into apache:master Sep 25, 2024
27 of 30 checks passed
dataroaring pushed a commit that referenced this pull request Oct 9, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.x reviewed

4 participants