[WIP] ENH: DAOS and DFS modules #1014
Open

shanedsnyder wants to merge 60 commits into main from snyder/dev-daos-module-3.4
Conversation
* add CFFI shims needed to access DFS record data at the Python level
* adjust `test_main_all_logs_repo_files()` to handle the new `ior` `DFS` log file from Shane--it has a single runtime heatmap for `STDIO`
* `test_module_table()` has been updated with a regression case for Shane's new DFS log file
* add `test_dfs_daos_posix_match()` to ensure counter equivalence between similar `ior` runs with DAOS vs. POSIX (NOTE: these actually don't look that similar yet--xfailed for now)
* adjust `test_dfs_daos_posix_match()` to handle the two new POSIX/DAOS "mirror files" from Shane; the `xfail` has been removed and it now passes
* there seems to be some reasonable agreement between the logs, which is good; see the test proper for data columns that do not match or that required special handling for DFS-POSIX equivalence testing
* a few other test suite shims after Shane changed the POSIX/DAOS mirror files
* add DFS support to I/O cost graph in summary reports, with some light unit testing
* add a DFS per-module stats section to the Python summary report, and some initial tests
* simplify the "time" counter handling in `test_dfs_daos_posix_match()` based on reviewer feedback
* `DFS_SLOWEST_RANK` is ignored in the comparisons in `test_dfs_daos_posix_match()` based on reviewer feedback
* the comment about `STAT` counter differences in `test_dfs_daos_posix_match` was removed, based on reviewer feedback
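The counter-equivalence idea behind `test_dfs_daos_posix_match()` can be sketched as below. This is a hypothetical illustration, not the actual test: the counter names and the skip rules (ignoring `DFS_SLOWEST_RANK` and timing counters, which legitimately differ between APIs) mirror the commit notes above, but the real test operates on parsed log DataFrames.

```python
# Illustrative sketch of DFS-vs-POSIX counter equivalence checking.
# Counter dicts here are hypothetical stand-ins for parsed log records.

SKIP = {"DFS_SLOWEST_RANK"}              # rank assignment can differ between APIs
TIME_SUFFIXES = ("_TIMESTAMP", "_TIME")  # timing counters are never bit-identical

def mismatched_counters(dfs_counters, posix_counters):
    """Return DFS counter names whose values differ from the POSIX run."""
    diffs = []
    for name, dfs_val in dfs_counters.items():
        if name in SKIP or name.endswith(TIME_SUFFIXES):
            continue
        # strip the module prefix so DFS_READS lines up with POSIX_READS
        suffix = name.split("_", 1)[1]
        posix_val = posix_counters.get("POSIX_" + suffix)
        if posix_val is not None and posix_val != dfs_val:
            diffs.append(name)
    return diffs

dfs = {"DFS_READS": 4, "DFS_WRITES": 4, "DFS_SLOWEST_RANK": 3, "DFS_F_READ_TIME": 0.12}
posix = {"POSIX_READS": 4, "POSIX_WRITES": 5, "POSIX_SLOWEST_RANK": 0, "POSIX_F_READ_TIME": 0.30}
print(mismatched_counters(dfs, posix))  # → ['DFS_WRITES']
```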
The OID backing a DFS file can change if the file is deleted and recreated.
This reverts commit c6e6936.
we don't currently have a way to generate darshan record IDs given only a pathname -- they are based on OIDs
* requires interception of `daos_cont_open` routines to allow mapping of container handles to pool/cont UUIDs
* DAOS module record ID now based on OID, cont UUID, and pool UUID
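The motivation for mixing the pool and container UUIDs into the record ID can be sketched as follows. This is an illustrative hash, not Darshan's actual record-ID function: the same OID can recur in different containers (and an OID can be reused after delete/recreate, per the commit above), so hashing OID plus both UUIDs disambiguates them.

```python
# Illustrative sketch (NOT Darshan's actual hash) of deriving a 64-bit
# record ID from a DAOS object ID plus pool/container UUIDs.
import hashlib
import uuid

def daos_record_id(oid_hi: int, oid_lo: int,
                   pool_uuid: uuid.UUID, cont_uuid: uuid.UUID) -> int:
    h = hashlib.sha256()
    h.update(oid_hi.to_bytes(8, "little"))
    h.update(oid_lo.to_bytes(8, "little"))
    h.update(pool_uuid.bytes)
    h.update(cont_uuid.bytes)
    return int.from_bytes(h.digest()[:8], "little")

pool = uuid.uuid4()
cont_a, cont_b = uuid.uuid4(), uuid.uuid4()
# the same OID in two different containers maps to two distinct record IDs
assert daos_record_id(1, 2, pool, cont_a) != daos_record_id(1, 2, pool, cont_b)
```

Because every process can compute this from values it already holds (no pathname needed), all ranks agree on the ID even when a file handle arrived via `dfs_obj_global2local()`.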
* add logic to allow name records with zero-length names to be updated with names in later `register_record` calls - this is useful because DAOS/DFS generate the same record IDs for "file objects", but the DAOS module does not register a name with the record and registers the record before the DFS module
when reading name records from the log file, allow for updating an existing zero-length name record
also, clean up file/object terminology in the job summary
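The zero-length name-record upgrade described above can be sketched as a small registry. This is a minimal illustration, not the darshan-util implementation: the record ID and path are hypothetical.

```python
# Illustrative sketch of name-record registration where a zero-length
# placeholder name can be filled in by a later call with the real name.
name_records: dict[int, str] = {}

def register_record(rec_id: int, name: str) -> None:
    existing = name_records.get(rec_id)
    if existing is None or existing == "":
        # first registration, or upgrading a zero-length placeholder name
        name_records[rec_id] = name
    # a non-empty existing name is never overwritten

# DAOS module registers the shared record ID first, without a name...
register_record(0xABCD, "")
# ...then the DFS module registers the same record ID with the real path
register_record(0xABCD, "/pool:cont/testfile")
print(name_records[0xABCD])  # → /pool:cont/testfile
```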
shanedsnyder changed the title from "ENH: DAOS and DFS modules" to "[WIP] ENH: DAOS and DFS modules" on Nov 12, 2024
This PR adds new instrumentation of DAOS storage APIs and corresponding updates to our analysis tools to integrate this DAOS data. Specifically, 2 new Darshan modules are defined: `DARSHAN_DFS_MOD` for instrumenting usage of the DAOS file system (DFS) API and `DARSHAN_DAOS_MOD` for instrumenting native DAOS object APIs. More details on each module below.

DFS module:

* … (e.g. `dfs_obj_global2local()`), meaning not all processes will have the filename available to generate a consistent record ID -- using the object OID allows all processes to agree on a consistent record ID value.
* the `pool_uuid:cont_uuid` combo is used in place of the `mount pt` in tools like `darshan-parser`.
* Example `darshan-parser` output line: …

DAOS module:

* … (`DAOS_OBJ`), array (`DAOS_ARRAY`), and KV (`DAOS_KV`).
* … `oid_hi.oid_lo`, same approach as DAOS's own utilities)
* the `pool_uuid:cont_uuid` combo is used in place of the `mount pt` in tools like `darshan-parser`.
* Example `darshan-parser` output line: …

Both DFS and DAOS modules integrate with the Darshan heatmap module to generate histograms of I/O activity on each process. Both modules have also fully implemented darshan-util and PyDarshan functionality, including support for generating PyDarshan summary reports detailing DFS/DAOS access patterns. PyDarshan tests have been updated to ensure expected behavior when parsing logs containing DFS/DAOS data.
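The heatmap integration mentioned above amounts to bucketing each process's I/O into fixed-width time bins. The sketch below is a toy illustration of that idea only; the bin count, event format, and function name are made up here, not Darshan's actual heatmap layout.

```python
# Toy sketch of heatmap binning: bucket per-process (timestamp, nbytes)
# I/O events into fixed-width time bins over the job runtime.
def heatmap_bins(events, runtime, nbins=4):
    """events: (timestamp, nbytes) pairs for one process."""
    bins = [0] * nbins
    width = runtime / nbins
    for t, nbytes in events:
        idx = min(int(t / width), nbins - 1)  # clamp t == runtime into last bin
        bins[idx] += nbytes
    return bins

events = [(0.1, 4096), (0.2, 4096), (2.5, 8192), (3.9, 1024)]
print(heatmap_bins(events, runtime=4.0))  # → [8192, 0, 8192, 1024]
```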
There are a few outstanding items that are not addressed in this PR:
Replaces #739