MPI-parallelization of sktwocnt #94
base: main
Conversation
Force-pushed from 432b601 to 51cbe34
Rebased on #90 to incorporate shelf search bug fix.
Once this is rebased after the merge of #90 I'll have another look.
- name: Compile and Install libXC
  run: |
    git clone https://gitlab.com/libxc/libxc.git
    cd libxc/
    git checkout 6.2.2
    cmake -DCMAKE_INSTALL_PREFIX=${PWD}/${BUILD_DIR}/${INSTALL_DIR} -DENABLE_FORTRAN=True -B ${BUILD_DIR} .
    cd ${BUILD_DIR}
-   make -j 2
+   make -j2
Any reason not to just use all available resources? (likewise for the ctest commands)
Suggested change:
- make -j2
+ make -j
Not sure if there is hyper-threading on the VMs. If yes, I would prefer to keep the -j2 version to only utilize physical cores.
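For context, a hypothetical sketch of the trade-off being discussed, assuming a Linux runner where `nproc` is available (the `make` invocations are shown as comments, not taken from the workflow):

```shell
# Query the logical core count of the runner (includes hyper-threads, if any)
ncores=$(nproc)
echo "logical cores: ${ncores}"
# A fixed job count pins the build to two jobs regardless of runner size:
#   make -j2
# while an adaptive count would use every logical core the VM reports:
#   make -j"${ncores}"
```

Note that `nproc` reports logical cores, so on a hyper-threaded VM `-j"$(nproc)"` would oversubscribe the physical cores, which is exactly the concern raised above.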
if: contains(matrix.mpi, 'openmpi') || contains(matrix.mpi, 'mpich')
run: |
  pushd ${BUILD_DIR}
  ctest -j1 --output-on-failure
Why -j1 ?
Suggested change:
- ctest -j1 --output-on-failure
+ ctest --output-on-failure
It is the -DTEST_MPI_PROCS=2 run. Is there any difference between ctest and ctest -j1? The latter might be a bit more verbose but maybe a bit more transparent too. I don't really care.
sktwocnt/prog/input.f90 (Outdated)
write(stdOut, "(A,A)") "!!! Parsing error: ", txt
write(stdOut, "(2X,A,A)") "File: ", trim(fname)
write(stdOut, "(2X,A,I0)") "Line number: ", iLine
write(stdOut, "(2X,A,A,A)") "Line: '", trim(line), "'"

stop
Shouldn't there be an MPI abort first?
Should be fixed by the latest commit.
@@ -17,8 +18,19 @@ subroutine write_sktables(skham, skover)

  !> Hamiltonian and overlap matrix
  real(dp), intent(in) :: skham(:,:), skover(:,:)

  call write_sktable_("at1-at2.ham.dat", skham)
  call write_sktable_("at1-at2.over.dat", skover)
+ if (size(skham, dim=2) > 0) then
Not sure if I've followed the logic fully, but what happens if non-lead processes get here?
They don't/shouldn't call this routine. If they did, all ranks would write exactly the same file, with the same name and content (I don't know how the filesystem handles that), since all the data is present on all ranks.
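The lead-rank guard being discussed could be sketched as follows, with plain Python standing in for the Fortran/MPI code; the `rank` parameter and the return value are illustrative stand-ins, not the actual sktwocnt interface:

```python
def write_sktables(rank, n_ham_cols):
    """Simulated lead-rank guard: only rank 0 writes the SK-tables.

    In the real code every rank holds the full tables, so an unguarded
    call would make all ranks write identical files concurrently.
    Returns the list of file names that would be written.
    """
    written = []
    if rank == 0:             # lead process only
        if n_ham_cols > 0:    # skip writing an empty table
            written.append("at1-at2.ham.dat")
            written.append("at1-at2.over.dat")
    return written

# Lead rank writes both tables; non-lead ranks write nothing.
assert write_sktables(0, 5) == ["at1-at2.ham.dat", "at1-at2.over.dat"]
assert write_sktables(1, 5) == []
```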
Co-authored-by: Ben Hourahine <[email protected]>
Straightforward (non-distributed) and optional MPI parallelization of the sktwocnt binary, employing the following strategy:

1. Fixed maximum distance (static tabulation)
Creates a single batch of dimer distances that are computed by available MPI ranks. Nothing fancy, MPI ranks exceeding the total number of distances are idling. We might want to print a warning or even block such calculations.
2. Dynamic batches with maximum distance (two cases: converged, or maximum distance reached)
If the number of MPI ranks is smaller than the number of dimer distances contained in the default 1 Bohr batch length, the distances are distributed over the ranks. If the number of ranks exceeds the number of distances within the default 1 Bohr batch length, the batch size is automatically increased to accommodate as many dimer distances as there are MPI ranks. This mostly avoids idling ranks, but an increased batch size later requires the Hamiltonian and overlap data to be cropped to the size one would have obtained with the default batch length of 1 Bohr. Otherwise the length of the converged SK-tables would depend on the number of ranks.
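The batch-enlarging and cropping logic described above can be sketched as follows; this is a minimal illustration in Python, and the names `make_batch`, `crop_to_default`, and `BATCH_LEN` are hypothetical, not taken from the sktwocnt source:

```python
BATCH_LEN = 1.0  # default batch length in Bohr, as described above

def make_batch(start, step, n_ranks, batch_len=BATCH_LEN):
    """Return the dimer distances of one batch, enlarging the batch
    whenever it holds fewer distances than there are MPI ranks."""
    n_dist = max(int(round(batch_len / step)), n_ranks)
    return [start + i * step for i in range(n_dist)]

def crop_to_default(table, step, batch_len=BATCH_LEN):
    """Crop an (enlarged) batch result back to the default batch length,
    so the converged SK-table does not depend on the number of ranks."""
    n_keep = int(round(batch_len / step))
    return table[:n_keep]

# With a 0.1 Bohr grid a default batch holds 10 distances; 16 ranks
# enlarge the batch to 16 distances, which is cropped back afterwards.
batch = make_batch(2.0, 0.1, n_ranks=16)
assert len(batch) == 16
assert len(crop_to_default(batch, 0.1)) == 10
```

With 4 ranks the same call would return the default 10 distances unchanged, so the enlargement only kicks in when ranks would otherwise idle.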
To be merged after #90 (needs to be rebased).