Reservoir coupling: Send start date from slaves to master #5521

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

hakonhagland (Contributor)

Builds on #5453, which should be merged first.

The master process needs to know the start date of each slave in order to determine when to activate it. I am splitting the implementation of the reservoir coupling into smaller PRs like this one to simplify the review process.
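For orientation, here is a minimal sketch of what this exchange could look like on the master side (the function name, tag parameter, and use of raw MPI_Comm handles are illustrative assumptions, not the PR's actual code): the master loops over its per-slave intercommunicators and receives one start date, sent as MPI_LONG, from rank 0 of each slave's remote group.

```cpp
#include <mpi.h>
#include <ctime>
#include <vector>

// Master-side sketch: collect one simulation start date from each slave.
// On an intercommunicator, source rank 0 refers to rank 0 in the remote
// (slave) group.
std::vector<std::time_t> receiveSlaveStartDates(const std::vector<MPI_Comm>& slave_comms,
                                                int tag)
{
    std::vector<std::time_t> start_dates;
    start_dates.reserve(slave_comms.size());
    for (const auto& comm : slave_comms) {
        long start_date = 0;
        MPI_Recv(&start_date, /*count=*/1, MPI_LONG, /*source=*/0,
                 tag, comm, MPI_STATUS_IGNORE);
        start_dates.push_back(static_cast<std::time_t>(start_date));
    }
    return start_dates;
}
```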

&start_date,
/*count=*/1,
/*datatype=*/MPI_LONG,
/*source_rank=*/0,
blattms (Member)

I have never used intercommunicators.

Why is the source 0?
The communicator consists of the processes of the initial communicator and the slave communicator. Who exactly is 0 here?

Note we could use MPI_ANY_SOURCE if we do not know the rank and there is only one message for the tag.

hakonhagland (Contributor, Author)

@blattms Thanks for reviewing this. (I plan to rebase and update this PR soon.) The source_rank of 0 means that we will be receiving messages from rank 0 in the remote group of the intercommunicator. This corresponds to rank 0 in the slave process sending messages to the parent communicator. Yes, we could use MPI_ANY_SOURCE, but it might be clearer to use a specific source rank and document the type of message we will be receiving here.
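As an illustration of the rank semantics (a sketch only; the function name and tag parameter are assumptions, not the PR's code): point-to-point ranks on an intercommunicator always address the remote group, so a slave obtains the parent intercommunicator with MPI_Comm_get_parent() and sends to rank 0 of the master group, which the master-side MPI_Recv() above picks up as source 0.

```cpp
#include <mpi.h>
#include <ctime>

// Slave-side sketch: send this slave's simulation start date to the master.
// On the parent intercommunicator, destination rank 0 refers to rank 0 in the
// remote group, i.e. rank 0 of the master processes.
void sendStartDateToMaster(std::time_t start_date, int tag)
{
    MPI_Comm parent_comm = MPI_COMM_NULL;
    MPI_Comm_get_parent(&parent_comm);           // intercommunicator created by MPI_Comm_spawn()
    long value = static_cast<long>(start_date);  // matches MPI_LONG on the receiving side
    MPI_Send(&value, /*count=*/1, MPI_LONG, /*dest=*/0, tag, parent_comm);
}
```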

@@ -101,6 +132,7 @@ void ReservoirCouplingMaster::spawnSlaveProcesses(int argc, char **argv) {
);
this->master_slave_comm_.push_back(std::move(master_slave_comm));
this->slave_names_.push_back(slave_name);
this->num_slaves_++;
blattms (Member)
Do we need this member? It might be implicitly given by this->slave_names_.size().
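If the counter were dropped as suggested, the count could be derived on demand. A minimal skeleton of that idea (the class and method names here are illustrative, not taken from the PR):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Skeleton of the reviewer's suggestion: derive the slave count from
// slave_names_ instead of maintaining a separate num_slaves_ member.
class MasterSketch {
public:
    std::size_t numSlaves() const { return slave_names_.size(); }
private:
    std::vector<std::string> slave_names_;
};
```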

- // MPI communicators for the slave processes
- std::vector<MPI_Comm_Ptr> master_slave_comm_;
+ std::size_t num_slaves_ = 0; // Initially zero, will be updated in spawnSlaveProcesses()
+ std::vector<MPI_Comm_Ptr> master_slave_comm_; // MPI communicators for the slave processes
blattms (Member)
Is the documentation string correct? I thought this would be an intercommunicator and would include all ranks from the original one and the ones from the newly created one ("Intercommunicator between original group and the newly spawned group (handle)." says the documentation).

hakonhagland (Contributor, Author)

We currently use these handles to communicate (MPI_Send(), MPI_Recv()) between a specific rank in the master and a specific rank in the slave process. So maybe the comment should use "to" instead of "for"? Like this: "MPI communicators to the slave processes"?
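For context, a minimal sketch of how such a handle is obtained (the function name and argument handling are simplified assumptions, not the actual spawnSlaveProcesses() code): MPI_Comm_spawn() returns an intercommunicator whose local group holds the spawning (master) ranks and whose remote group holds the newly spawned slave ranks, which is why point-to-point calls on it address a specific slave rank.

```cpp
#include <mpi.h>

// Sketch: spawn one slave program and return the intercommunicator to it.
// MPI_Send/MPI_Recv on the returned communicator address ranks in the
// remote (slave) group.
MPI_Comm spawnOneSlave(const char* slave_program, char** slave_argv, int num_slave_procs)
{
    MPI_Comm master_slave_comm = MPI_COMM_NULL;
    MPI_Comm_spawn(slave_program, slave_argv, num_slave_procs,
                   MPI_INFO_NULL, /*root=*/0, MPI_COMM_WORLD,
                   &master_slave_comm, MPI_ERRCODES_IGNORE);
    return master_slave_comm;
}
```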

hakonhagland (Contributor, Author)

Rebased

Commits:

- Do not specify slave program name twice when launching slave process
- Open MPI does not support output redirection for spawned child processes.
- Copy command line parameters from master to slave command line. Also replace data file name in master argv with data file name of the slave.
- Remove debug code that was introduced by mistake in the previous commit
- Create one log file for each slave subprocess. Redirect both stdout and stderr to this file
- Exclude the reservoir coupling stuff if MPI is not enabled