[CNEUR-379] Use /dev/shm as a cache in multinode simulations #15
base: main
Conversation
neurodamus/node.py
Outdated
group_id = int(SHMUtil.node_id / 20)
node_specific_corenrn_output_in_storage = \
    Path(corenrn_output) / f"coreneuron_input/cycle_{self._cycle_i}/group_{group_id}/node_{SHMUtil.node_id}"
(Pasting from Slack)
Thanks to a reminder by @1uc, I realized I had completely forgotten that I had discussed with @iomaganaris generating the output in subfolders on GPFS, and that he already had something working recently 😅. Hence, I made some changes to improve it a bit by dividing the coreneuron_input directory into:

coreneuron_input/cycle_X/group_Y/node_Z

Here cycle_X is the current cycle in the model instantiation, node_Z is the node ID (i.e., from 0 to 799 in the 800-node simulation), and group_Y is a simple grouping of the nodes into subfolders of 20 (i.e., group_id = floor(node_id / 20)). Why 20? It is just a magic number to split the number of folders into something reasonable.

With this simple approach, inside coreneuron_input there would be at most 32 folders (i.e., one per cycle). Inside each cycle folder, there would be at most 40 subfolders (i.e., 800/20 = 40, one per subset of nodes). Inside each group folder, there would be at most 20 subfolders corresponding to the node IDs (i.e., 0 to 19 in the first group, 20 to 39 in the second, and so on). Finally, inside each specific node folder, there would be at most 120 files from CoreNEURON (i.e., 3 files_per_rank x 40 ranks_per_node = 120).
In other words, we go from a single folder with 3.1M files to a tree that is much more manageable by GPFS and IME, regardless of the target file system that we use.
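For illustration, a minimal standalone sketch of the path computation described above (hypothetical helper; the real code uses SHMUtil.node_id and self._cycle_i as shown in the diff):

```python
from pathlib import Path

NODES_PER_GROUP = 20  # "magic number" chosen only to keep the folder count reasonable

def node_output_dir(corenrn_output: str, cycle: int, node_id: int) -> Path:
    """Return the per-node CoreNEURON output directory, e.g.
    <corenrn_output>/coreneuron_input/cycle_3/group_7/node_152."""
    group_id = node_id // NODES_PER_GROUP
    return (Path(corenrn_output) / "coreneuron_input"
            / f"cycle_{cycle}" / f"group_{group_id}" / f"node_{node_id}")

# Example: node 152 in cycle 3 lands in group 152 // 20 == 7
print(node_output_dir("/gpfs/project/output", 3, 152))
# -> /gpfs/project/output/coreneuron_input/cycle_3/group_7/node_152
```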
Note that there has been a slight update in the code since this comment was written, but the reasoning is still valid.
neurodamus/node.py
Outdated
@@ -1694,17 +1695,6 @@ def cleanup(self):
    data_folder_shm = SHMUtil.get_datadir_shm(data_folder)
    logging.info("Deleting intermediate SHM data in %s", data_folder_shm)
    subprocess.call(['/bin/rm', '-rf', data_folder_shm])
    # Remove also the coreneuron_input_{node_id} folders
Shouldn't we delete the symlinks in /dev/shm and the folders in GPFS at the end? @sergiorg-hpc

If I'm not mistaken, @sergiorg-hpc has been working on an improved version of this fix, so this PR can now be closed?

We discussed this change with Sergio offline and concluded that it is not necessary for MMB simulations but might still be beneficial, so this PR can stay open until it is needed and Sergio can continue working on it if necessary.
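For illustration, a minimal sketch of what such a final cleanup could look like (a hypothetical helper, not the PR's implementation; the argument names and the use of shutil are assumptions):

```python
import logging
import shutil
from pathlib import Path

def cleanup_cache(shm_datadir: str, corenrn_output: str, node_id: int) -> None:
    """Remove the /dev/shm staging area (which also drops any symlinks
    created inside it) and the per-node coreneuron_input_{node_id}
    folder written to GPFS."""
    shm_path = Path(shm_datadir)
    if shm_path.exists():
        logging.info("Deleting intermediate SHM data in %s", shm_path)
        shutil.rmtree(shm_path, ignore_errors=True)

    gpfs_node_dir = Path(corenrn_output) / f"coreneuron_input_{node_id}"
    if gpfs_node_dir.exists():
        logging.info("Deleting per-node GPFS output in %s", gpfs_node_dir)
        shutil.rmtree(gpfs_node_dir, ignore_errors=True)
```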
Currently, in multinode simulations without /dev/shm, all nodes write the coreneuron_input files into the same folder. This hurts GPFS performance a lot for all users while it is happening.

Instead of all ranks writing to GPFS, this PR adds a CACHE mode for /dev/shm, where the coreneuron_input data are first staged to /dev/shm and then written by a single rank per node into separate GPFS folders named coreneuron_input_{node_id}. Then symlinks are created from coreneuron_input_{node_id}/*_{1,2,3}.dat to /dev/shm/.../coreneuron_input, and CoreNEURON launches the simulation using /dev/shm/.../coreneuron_input.

The only drawback of this approach is the extra memory needed to use the /dev/shm cache, since everything we dump to /dev/shm is kept in the RAM of the node.
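For illustration, a minimal sketch of the symlink step described above (hypothetical helper and paths; the link direction and file selection are assumptions based on this description, not the actual implementation in the PR):

```python
from pathlib import Path

def link_gpfs_files_into_shm(gpfs_node_dir: Path, shm_coreneuron_input: Path) -> None:
    """Expose the *_1.dat / *_2.dat / *_3.dat files written to GPFS inside
    the /dev/shm coreneuron_input directory via symlinks, so CoreNEURON can
    be launched against /dev/shm.  The direction of the links is an
    assumption here; the PR defines the actual layout."""
    shm_coreneuron_input.mkdir(parents=True, exist_ok=True)
    for dat_file in gpfs_node_dir.glob("*_[123].dat"):
        link = shm_coreneuron_input / dat_file.name
        if not link.exists():
            link.symlink_to(dat_file)

# Hypothetical usage (placeholder paths, not the real run layout):
# link_gpfs_files_into_shm(
#     Path("/gpfs/project/output/coreneuron_input_42"),
#     Path("/dev/shm/neurodamus_run/coreneuron_input"),
# )
```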