Optimization of AMR with MPI in case of TreeMesh #1532
Closed
Currently, when an AMRCallback wants to refine/coarsen a TreeMesh, every MPI process has to do the full amount of work by itself.
I started the optimization with coarsening.
The idea is to distribute the cells to be coarsened among the MPI ranks (each cell goes to the rank that owns it) and have every rank coarsen its part locally. The partial TreeMeshes are then collected from all ranks on the root and joined. After completing this job, the root sends the merged TreeMesh back to all other processes.
Reference: `Trixi.jl/src/meshes/abstract_tree.jl`, line 502 at commit `9119f8d`.
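To make the communication pattern concrete, here is a minimal sketch (not the actual PR code). It assumes Trixi's internal helpers `mpi_comm`, `mpi_rank` and `mpi_isroot`, the object-based `MPI.gather`/`MPI.bcast` from MPI.jl (which serialize arbitrary Julia objects), a `mpi_ranks` field holding the owner rank of each cell, and hypothetical `local_coarsen!`/`join_trees` helpers:

```julia
using MPI
using Trixi: mpi_comm, mpi_rank, mpi_isroot

# Sketch of the distribute/gather/broadcast pattern described above.
# `local_coarsen!` and `join_trees` are hypothetical placeholders, and
# `tree.mpi_ranks` is assumed to hold the owner rank of each cell.
function distributed_coarsen!(mesh, cells_to_coarsen)
    tree = mesh.tree

    # 1. Each rank keeps only the cells it owns and coarsens them locally.
    my_cells = filter(c -> tree.mpi_ranks[c] == mpi_rank(), cells_to_coarsen)
    local_coarsen!(tree, my_cells)

    # 2. Gather the partial trees on the root rank.
    trees = MPI.gather(tree, mpi_comm(); root=0)

    # 3. The root joins all partial trees into one consistent tree
    #    (this is where the id conversion happens).
    merged = mpi_isroot() ? reduce(join_trees, trees) : nothing

    # 4. Broadcast the merged tree back to all ranks.
    mesh.tree = MPI.bcast(merged, mpi_comm(); root=0)
    return mesh
end
```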
The hotspots of this algorithm are the MPI communications: they synchronize the processes, so some of them have to wait for the others, and transferring a whole TreeMesh takes both time and memory. I'm still looking for ways to make the MPI communication more efficient.
The algorithm for uniting the TreeMeshes looks complicated because all cell ids have to be converted, but it seems to be quite fast.
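To illustrate what the id conversion involves, here is a simplified, hypothetical sketch: when the cells of one tree are placed after the cells of another, every stored id (parent ids, child ids, ...) must be shifted by the length of the first tree, while sentinel values such as 0 ("no parent"/"no child") must stay untouched. The field names below are stand-ins, not Trixi's actual Tree internals, and the real join additionally has to reconcile cells that appear in several partial trees instead of simply appending:

```julia
# Hypothetical, simplified tree holding only the id-carrying fields;
# Trixi's actual Tree also stores neighbor ids, coordinates, etc.
mutable struct SimpleTree
    parent_ids::Vector{Int}  # 0 = cell has no parent
    child_ids::Matrix{Int}   # one column per cell; 0 = no child
    levels::Vector{Int}
end

Base.length(t::SimpleTree) = length(t.parent_ids)

# Append the cells of `b` after the cells of `a`, shifting all of b's ids.
function join_trees(a::SimpleTree, b::SimpleTree)
    offset = length(a)
    shift(id) = id == 0 ? 0 : id + offset  # keep the 0 sentinel as-is

    append!(a.parent_ids, shift.(b.parent_ids))
    a.child_ids = hcat(a.child_ids, shift.(b.child_ids))
    append!(a.levels, b.levels)
    return a
end
```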
I'm not sure that it works for all elixirs, but I haven't found a counterexample yet.
Of course, if this algorithm turns out to make sense in terms of efficiency, I will rewrite it in a more easy-to-understand way.