[Fix](hdfs-fs)The cache expiration should explicitly release the held fs #38610 #40504

Merged
xiaokang merged 1 commit into apache:branch-2.0 on Sep 7, 2024

Conversation

CalvinKirs
Member

… fs (#38610)

Proposed changes

The RemoteFSPhantomManager class is responsible for managing phantom references of RemoteFileSystem objects, ensuring that the associated FileSystem resources are automatically cleaned up when RemoteFileSystem objects are garbage collected.

Key features:

  • Phantom Reference Monitoring: The class uses a ReferenceQueue and PhantomReference to track RemoteFileSystem objects. Once such an object is no longer in use and has been garbage collected, the class closes the corresponding FileSystem resources to prevent resource leaks.
  • Thread-safe Cleanup: It provides a thread-safe mechanism that starts the cleanup thread only once. This thread runs periodically, checking the ReferenceQueue and closing any unused FileSystem resources.
  • Resource Management: The class maintains a map between phantom references and their corresponding FileSystem objects, ensuring that these resources are cleaned up appropriately.
  • The cleanup thread runs at regular intervals, ensuring that any RemoteFileSystem object that is no longer in use is safely removed along with its associated FileSystem resources.
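
For readers unfamiliar with the pattern, the mechanism described above can be sketched roughly as follows. This is a minimal illustration and not the actual RemoteFSPhantomManager code: the class name `PhantomCleanupSketch`, its method names, and the use of a plain `Closeable` in place of a Hadoop FileSystem are all assumptions made for the example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a phantom-reference based cleaner.
// All names here are illustrative; they are not taken from the Doris code base.
final class PhantomCleanupSketch {
    // The GC enqueues a phantom reference here once its owner has been collected.
    private static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Maps each phantom reference to the resource that must be closed for it.
    private static final ConcurrentHashMap<Reference<?>, Closeable> RESOURCES =
            new ConcurrentHashMap<>();
    // Guards against starting the cleanup thread more than once.
    private static final AtomicBoolean STARTED = new AtomicBoolean(false);

    private PhantomCleanupSketch() {
    }

    // Track `owner`; `resource` is closed after `owner` is garbage collected.
    // The reference is returned only so that callers can inspect it in a demo.
    static PhantomReference<Object> register(Object owner, Closeable resource) {
        PhantomReference<Object> ref = new PhantomReference<>(owner, QUEUE);
        RESOURCES.put(ref, resource);
        return ref;
    }

    // Poll the queue until it is empty and close every resource whose owner is gone.
    // Returns the number of resources closed successfully.
    static int drainOnce() {
        int closed = 0;
        Reference<?> ref;
        while ((ref = QUEUE.poll()) != null) {
            Closeable resource = RESOURCES.remove(ref);
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
                closed++;
            } catch (IOException e) {
                // A real implementation would log the failure; the sketch ignores it.
            }
        }
        return closed;
    }

    // Start the periodic cleanup task; only the first call has any effect.
    static boolean startCleanupThreadOnce(long periodSeconds) {
        if (!STARTED.compareAndSet(false, true)) {
            return false;
        }
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "phantom-cleanup-sketch");
                    t.setDaemon(true); // do not keep the JVM alive for cleanup
                    return t;
                });
        scheduler.scheduleAtFixedRate(
                PhantomCleanupSketch::drainOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return true;
    }
}
```

One design point worth noting: the resource stored in the map must not hold a strong reference back to its owner, otherwise the owner never becomes phantom reachable and the resource is never closed. For a deterministic demo, `Reference.enqueue()` can be called on the returned reference to simulate collection without waiting for the garbage collector.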

(cherry picked from commit 922ec3a)

Proposed changes

Issue Number: close #xxx

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@CalvinKirs
Member Author

run buildall

Contributor

@morningman morningman left a comment

LGTM

Contributor

github-actions bot commented Sep 7, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved (Indicates a PR has been approved by one committer.) and reviewed labels on Sep 7, 2024
Contributor

github-actions bot commented Sep 7, 2024

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 49734 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c65e39586ff1a8c826fd4bbaacc72e19a034d1a9, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot1 (ms)	hot2 (ms)	best hot (ms)
q1	17525	4336	4320	4320
q2	2049	189	147	147
q3	10229	1910	1934	1910
q4	10096	1241	1335	1241
q5	8480	3977	3939	3939
q6	229	127	123	123
q7	2062	1615	1594	1594
q8	9393	2755	2742	2742
q9	10224	10448	10342	10342
q10	8605	3520	3503	3503
q11	424	247	258	247
q12	466	295	297	295
q13	18372	3989	4017	3989
q14	354	326	331	326
q15	515	447	458	447
q16	544	477	452	452
q17	1149	943	951	943
q18	7238	6842	7039	6842
q19	1713	1556	1523	1523
q20	528	304	299	299
q21	4383	4119	4152	4119
q22	506	391	407	391
Total cold run time: 115084 ms
Total hot run time: 49734 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot1 (ms)	hot2 (ms)	best hot (ms)
q1	4375	4323	4289	4289
q2	324	218	215	215
q3	4184	4154	4158	4154
q4	2756	2725	2730	2725
q5	7182	7086	7097	7086
q6	233	116	119	116
q7	3247	2782	2839	2782
q8	4330	4379	4449	4379
q9	13750	13644	13525	13525
q10	4266	4276	4253	4253
q11	742	721	687	687
q12	1024	853	865	853
q13	6907	3761	3728	3728
q14	473	415	438	415
q15	496	451	462	451
q16	658	586	584	584
q17	3821	3734	3865	3734
q18	8821	8774	8763	8763
q19	1722	1672	1642	1642
q20	2364	2161	2071	2071
q21	8461	8425	8493	8425
q22	1005	892	957	892
Total cold run time: 81141 ms
Total hot run time: 75769 ms

@doris-robot

TPC-DS: Total hot run time: 211780 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c65e39586ff1a8c826fd4bbaacc72e19a034d1a9, data reload: false

query	cold (ms)	hot1 (ms)	hot2 (ms)	best hot (ms)
query1	915	422	374	374
query2	6530	2204	2197	2197
query3	6916	207	200	200
query4	23009	21530	21768	21530
query5	19732	6510	6537	6510
query6	292	217	234	217
query7	4328	306	314	306
query8	265	235	278	235
query9	3082	2664	2595	2595
query10	460	308	296	296
query11	15483	15713	15101	15101
query12	120	71	71	71
query13	1030	439	448	439
query14	17250	13760	13121	13121
query15	376	217	231	217
query16	6474	270	256	256
query17	1728	958	886	886
query18	892	311	310	310
query19	200	154	147	147
query20	80	76	78	76
query21	191	96	95	95
query22	5342	4968	5032	4968
query23	34229	33589	33472	33472
query24	7004	6338	6287	6287
query25	546	431	415	415
query26	800	156	158	156
query27	2309	302	295	295
query28	6067	2297	2283	2283
query29	2860	2817	2816	2816
query30	240	168	166	166
query31	957	734	784	734
query32	71	63	57	57
query33	432	263	265	263
query34	861	494	487	487
query35	1120	940	858	858
query36	1432	1209	1172	1172
query37	88	60	59	59
query38	3051	2941	2918	2918
query39	1375	1314	1329	1314
query40	213	96	96	96
query41	41	37	36	36
query42	87	85	88	85
query43	697	580	642	580
query44	1178	728	722	722
query45	243	230	224	224
query46	1239	973	968	968
query47	1831	1692	1742	1692
query48	505	420	416	416
query49	624	387	384	384
query50	855	624	587	587
query51	4728	4666	4643	4643
query52	104	92	80	80
query53	229	189	180	180
query54	2673	2463	2488	2463
query55	94	87	86	86
query56	229	222	210	210
query57	1143	1039	1079	1039
query58	219	200	210	200
query59	3450	3387	3227	3227
query60	218	203	199	199
query61	98	92	91	91
query62	816	498	469	469
query63	208	177	192	177
query64	3275	1567	1506	1506
query65	3575	3529	3576	3529
query66	785	410	411	410
query67	16189	15411	15913	15411
query68	9000	670	666	666
query69	485	279	274	274
query70	1583	1508	1382	1382
query71	392	313	305	305
query72	6770	4987	4775	4775
query73	758	319	316	316
query74	6330	5884	5789	5789
query75	4627	3685	3640	3640
query76	4662	1168	1218	1168
query77	575	267	255	255
query78	12432	11879	11353	11353
query79	9951	653	634	634
query80	1778	380	381	380
query81	496	246	236	236
query82	1673	95	98	95
query83	159	129	140	129
query84	257	70	71	70
query85	879	314	311	311
query86	334	289	310	289
query87	3235	3011	3025	3011
query88	4950	2331	2340	2331
query89	481	280	298	280
query90	1995	212	215	212
query91	155	139	124	124
query92	57	52	51	51
query93	7245	588	551	551
query94	799	207	208	207
query95	1915	1924	1713	1713
query96	638	332	339	332
query97	6480	6334	6264	6264
query98	228	204	214	204
query99	3062	824	852	824
Total cold run time: 318064 ms
Total hot run time: 211780 ms

@doris-robot

ClickBench: Total hot run time: 30.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c65e39586ff1a8c826fd4bbaacc72e19a034d1a9, data reload: false

query	cold (s)	hot1 (s)	hot2 (s)
query1	0.02	0.02	0.03
query2	0.07	0.02	0.02
query3	0.25	0.05	0.05
query4	1.78	0.08	0.08
query5	0.53	0.53	0.52
query6	1.26	0.62	0.61
query7	0.02	0.01	0.01
query8	0.04	0.02	0.03
query9	0.51	0.49	0.47
query10	0.54	0.52	0.52
query11	0.13	0.09	0.09
query12	0.11	0.09	0.09
query13	0.62	0.61	0.61
query14	0.79	0.80	0.80
query15	0.78	0.74	0.75
query16	0.36	0.37	0.35
query17	1.00	0.98	0.99
query18	0.25	0.24	0.24
query19	1.88	1.77	1.85
query20	0.02	0.01	0.01
query21	15.46	0.56	0.55
query22	2.00	2.19	1.98
query23	17.15	0.93	0.86
query24	8.37	0.68	0.52
query25	0.39	0.11	0.05
query26	0.82	0.15	0.16
query27	0.04	0.03	0.04
query28	5.65	0.85	0.73
query29	12.79	2.32	2.21
query30	0.58	0.53	0.53
query31	2.81	0.38	0.37
query32	3.38	0.49	0.50
query33	3.08	3.05	3.09
query34	15.29	4.79	4.79
query35	4.86	4.80	4.85
query36	1.07	1.00	1.01
query37	0.06	0.04	0.04
query38	0.03	0.02	0.02
query39	0.02	0.02	0.01
query40	0.16	0.14	0.14
query41	0.05	0.02	0.02
query42	0.02	0.01	0.01
query43	0.02	0.01	0.02
Total cold run time: 105.06 s
Total hot run time: 30.36 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit c65e39586ff1a8c826fd4bbaacc72e19a034d1a9 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       21.4 seconds inserted 10000000 Rows, about 467K ops/s

@xiaokang xiaokang changed the title from "[Fix](hdfs-fs)The cache expiration should explicitly release the held…" to "[Fix](hdfs-fs)The cache expiration should explicitly release the held fs #38610" on Sep 7, 2024
@xiaokang xiaokang merged commit 7347cbc into apache:branch-2.0 Sep 7, 2024
24 of 27 checks passed
Labels
approved (Indicates a PR has been approved by one committer.), reviewed

4 participants