-
Notifications
You must be signed in to change notification settings - Fork 498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revamp CI with dependency and Python caching for efficient installs #3112
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @johvincau for this PR! The caching is nice and the job names are so much better than what we have currently (and will really help when sifting through failed jobs). A few suggestions for you:
|
||
# NJOY variables | ||
NJOY_REPO='https://github.com/njoy/NJOY2016' | ||
NJOY_INSTALL_DIR=$HOME/NJOY2016 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be nice to put all the installed dependencies in a single folder, e.g., $HOME/dependencies/NJOY2016
. Also, in the case of NJOY (and probably the others), we should just be caching the installed version. Right now, the cache for NJOY is 1 GB when it should be much less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've trimmed the NJOY repository by ~30% by removing its own NJOY-specific testing suite out of the cache; however, removing the build directory causes issues with the vectfit installation. Aside from NJOY and vectfit, all other of the dependencies at the very least remove their own build directories before caching.
I would like to move all the installed dependencies into a single folder as well; however, I will address that in a future PR since it would take a lot of testing and plenty of commits (and this PR already has enough of those as is!).
Continuous Integration (CI)
This PR resolves #2140, revamping the CI suite by adding cache support for external dependencies as well as Python environments for faster install times, while also preparing for future expansions with better readability and other QOL tweaks.
(note: There was a time when installing each version of OpenMC could take several hours owing to inefficiencies in Github's management of its runners. While this is no longer the case, even without this PR, I think no one would miss the time we spend building and installing these dependencies when we can now cache and restore them very quickly.)
List of Changes
1. Caches for built NJOY2016, DAGMC, libmesh, vectfit, ncrystal dependencies
We now have the ability to cache most of the external dependencies, which can save significantly on install time since they take a while to build. For the bigger dependencies like DAGMC and vectfit, the
install
step is cut by more than half.In terms of staying up-to-date,
gha-caches.sh
fetches the latest commit hashes for the required dependencies. If the cache we have on hand does not match the latest commit hash, we will rebuild the dependency from the upstream source and overwrite the old cache.One point I'd like to ask about is whether we should retain the cmake and build directories for each of the installed dependencies in our caches. The CI seems to get along just fine without those two, and only the source + executable -- at least for the DAGMC case, the only one I've tried yet.
2. Job-specific Python environment caches
This addition caches a unique Python environment for each of the 12 configurations, which further saves us install time. This is a list of the caches produced by this.
We have the spare cache storage for all of these. If, in the future, we need the space, simply distinguishing between omp, mpi, and omp+mpi installs of Python 3.10 would also be acceptable.
3. Setup and matrix.json for job-name readability
I've added a "setup" job which reads from
matrix.json
all configurations of OpenMC to be tested. All 12 of the previous configurations have been added, along with a "job-name" variable which makes the CI screen more readable. This is what it now looks like:While this file will have to be manually maintained, it is at least more straightforward to add new configurations for testing.
4. Standardize across gha-install scripts
Some gha-install scripts needed to have
-DCMAKE_INSTALL_PREFIX
specified for caching to work, and for the executables to be accessible by the OpenMC install process.5. ENDF-B/VII.1 cross section cache rename
I've renamed the cache to provide clarity on which nuclear data files we use for continuous integration -- being mindful of the possibility that we may need other files like ENDF-B/VIII.x for new tests in the future.
Checklist
The commit messages may have some inconsistencies. I've had to commit and push every time to test the CI, and I've attempted to clean up with interactive rebasing.