Reservoir coupling: Send start date from slaves to master #5521
base: master
Conversation
Force-pushed from 9f5a2d1 to 4825597
&start_date,
/*count=*/1,
/*datatype=*/MPI_LONG,
/*source_rank=*/0,
I have never used intercommunicators.
Why is the source 0?
The communicator consists of the processes of the initial communicator and the slave communicator. Who exactly is 0 here?
Note: we could use MPI_ANY_SOURCE if we do not know the rank and there is only one message for the tag.
@blattms Thanks for reviewing this. (I plan to rebase and update this PR soon.) The source_rank of 0 means that we will be receiving messages from rank 0 in the remote group of the intercommunicator. This corresponds to rank 0 in the slave process sending messages to the parent communicator. Yes, we could use MPI_ANY_SOURCE as the source rank, but it might be clearer to use a specific rank and document the type of message we will be receiving here.
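To illustrate the point about remote-group ranks: the following is a minimal sketch (not the PR's actual code) of a master-side receive over the intercommunicator returned by MPI_Comm_spawn. The function name and the tag value 0 are assumptions for illustration; the MPI calls themselves are standard. Note that for an intercommunicator, the source rank in MPI_Recv refers to the remote group, i.e. rank 0 of the spawned slave job.

```cpp
#include <mpi.h>

// Hypothetical sketch: receive the slave's start date on the master side.
// master_slave_comm is the intercommunicator returned by MPI_Comm_spawn().
void receiveSlaveStartDate(MPI_Comm master_slave_comm, long& start_date)
{
    MPI_Recv(&start_date,
             /*count=*/1,
             /*datatype=*/MPI_LONG,
             /*source_rank=*/0,    // rank 0 in the *remote* (slave) group
             /*tag=*/0,            // tag value is illustrative only
             master_slave_comm,
             MPI_STATUS_IGNORE);
}
```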
@@ -101,6 +132,7 @@ void ReservoirCouplingMaster::spawnSlaveProcesses(int argc, char **argv) {
);
this->master_slave_comm_.push_back(std::move(master_slave_comm));
this->slave_names_.push_back(slave_name);
this->num_slaves_++;
Do we need this member? It might be implicitly given by this->slave_names_.size().
// MPI communicators for the slave processes
std::vector<MPI_Comm_Ptr> master_slave_comm_;
std::size_t num_slaves_ = 0; // Initially zero, will be updated in spawnSlaveProcesses()
std::vector<MPI_Comm_Ptr> master_slave_comm_; // MPI communicators for the slave processes
Is the documentation string correct? I thought this would be an intercommunicator and would include all ranks from the original one and the ones from the newly created one ("Intercommunicator between original group and the newly spawned group (handle)." says the documentation).
We currently use these handles to communicate (MPI_Send(), MPI_Recv()) between a specific rank in the master and a specific rank in the slave process. So the comment should maybe use "to" instead of "for" ? Like this: "MPI communicators to the slave processes" ?
Force-pushed from 4825597 to 96f31a7
Rebased
Do not specify slave program name twice when launching slave process
Open MPI does not support output redirection for spawned child processes.
Copy command line parameters from master to slave command line. Also replace data file name in master argv with data file name of the slave.
Remove debug code that was introduced by mistake in the previous commit
Create one log file for each slave subprocess. Redirect both stdout and stderr to this file
Exclude the reservoir coupling stuff if MPI is not enabled
Force-pushed from 96f31a7 to 53f199b
Builds on #5453 which should be merged first.
The master process needs to know when each slave starts in order to determine when to activate it. I am splitting the implementation of the reservoir coupling into smaller PRs like this one to simplify the review process.