JEDI CI container specs update and FMS update for JEDI-FV3 #1099

@climbfuji climbfuji commented May 2, 2024

Summary

  1. Add missing packages to JEDI-CI specs for container builds (needed by JEDI CI system)
  2. Update FMS from release-jcsda to 2023.04 for jedi-fv3-env and gmao-swell-env (the latter to be confirmed by GMAO). This is needed for the FMS + FV3-DYCORE updates in fv3-jedi etc.

Testing

  1. Built gcc-openmpi container manually on EC2
  2. Needs testing in JEDI-CI framework after test containers have been built
  3. Needs testing with gmao-swell-env on Discover (if change is ok for GMAO)

Applications affected

JEDI CI (JEDI-FV3)

Systems affected

Container builds

Dependencies

n/a

Issue(s) addressed

n/a

Checklist

  • This PR addresses one issue/problem/enhancement, or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.

@climbfuji climbfuji self-assigned this May 2, 2024
@climbfuji climbfuji added the INFRA JEDI Infrastructure label May 2, 2024
@climbfuji climbfuji requested a review from eap May 2, 2024 20:28

nice

@climbfuji
Collaborator Author

@Dooruk and/or @rtodling Do you know if it is ok to switch gmao-swell-env to fms@2023.04? I believe it is, because even the JCSDA version used in gmao-swell-env now is newer than what GEOS uses. Thanks!

@Dooruk
Collaborator

Dooruk commented May 3, 2024

@climbfuji this shouldn't impact Swell. However @rtodling mentioned there could be some implications in terms of linking GEOSgcm and JEDI (running GEOS through JEDI) work. That is in a different env (geos-gcm-env) but let me tag @mathomp4 here just in case fms change somehow impacts his work?

@climbfuji
Collaborator Author

> @climbfuji this shouldn't impact Swell. However @rtodling mentioned there could be some implications in terms of linking GEOSgcm and JEDI (running GEOS through JEDI) work. That is in a different env (geos-gcm-env) but let me tag @mathomp4 here just in case fms change somehow impacts his work?

You are right, geos-gcm-env doesn't load any fms by default, so that shouldn't be a problem either

@climbfuji
Collaborator Author

@eap Is this working for you? Should I open it up for review and merging? Thanks!

@eap
Collaborator

eap commented May 6, 2024

@climbfuji I've been building these today (and likely this will continue overnight) - I should be able to test the lot of them in our "ci-next" environment on Tuesday and I'll report back.

@climbfuji climbfuji marked this pull request as ready for review May 7, 2024 13:53
@eap
Collaborator

eap commented May 8, 2024

Well, this is working for CI but the fms change looks like it has broken the tests. All three build environments are failing in the same way;

[ 22%] Built target test_iodaconv_obserror.x
Scanning dependencies of target fv3
[ 22%] Building Fortran object fv3/CMakeFiles/fv3.dir/model/fv_arrays.F90.o
/workdir/test_root/jedi-bundle/fv3/model/fv_arrays.F90:26:2:

   26 |   use mpp_domains_mod,       only: domain2d
      |  1~~~~~~~~~~~~~~~
Fatal Error: fms_platform.h: No such file or directory
compilation terminated.
make[2]: *** [fv3/CMakeFiles/fv3.dir/build.make:114: fv3/CMakeFiles/fv3.dir/model/fv_arrays.F90.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:28576: fv3/CMakeFiles/fv3.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 22%] Linking CXX executable ../../../bin/print_queries.x
[ 22%] Built target print_queries.x
make: *** [Makefile:166: all] Error 2

Here's a direct link to the failures: https://github.com/JCSDA-internal/mpas-jedi/pull/966/checks?check_run_id=24712175023
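
The fatal error in the log says that `fms_platform.h` could not be found while compiling `fv_arrays.F90`. A minimal sketch for checking whether a given FMS installation ships that header is shown here; `has_header` and `FMS_PREFIX` are hypothetical names used only for illustration and are not part of spack-stack or the JEDI CI system.

```shell
# Hypothetical helper: succeeds if a regular file named $2 exists
# anywhere under directory $1, and fails otherwise.
has_header() {
    [ -n "$(find "$1" -name "$2" -type f 2>/dev/null | head -n 1)" ]
}

# Example use (FMS_PREFIX is an assumed install location, not a real variable):
#   has_header "$FMS_PREFIX/include" fms_platform.h || echo "fms_platform.h not installed"
```

If the check fails for the new FMS install, the code that includes the header has to be updated rather than the container.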

@climbfuji
Collaborator Author

That is expected. We need this version of fms as "next" containers for a set of PRs.

@@ -180,7 +185,8 @@ spack:
 wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null && \
 echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | tee /etc/apt/sources.list.d/oneAPI.list && \
 apt update && \
-apt install intel-oneapi-compiler-dpcpp-cpp-and-cpp-classic-2022.1.0 intel-oneapi-compiler-fortran-2022.1.0 intel-oneapi-mkl-devel-2022.1.0 intel-oneapi-mpi-devel-2021.6.0 -y
+apt install intel-oneapi-compiler-dpcpp-cpp-and-cpp-classic-2022.1.0 intel-oneapi-compiler-fortran-2022.1.0 intel-oneapi-mkl-devel-2022.1.0 intel-oneapi-mpi-devel-2021.6.0 python3-nacl -y && \
+rm -rf /var/lib/apt/lists/*
Collaborator

@eap eap May 8, 2024

Note I've added this rm command to each "RUN" directive with apt update since it reduces the layer size.
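
The pattern described in this comment can be sketched as a single command chain. Image layers are additive, so the apt package lists have to be removed in the same `RUN` directive that created them; deleting them in a later directive would not shrink the earlier layer. The function name `install_and_clean` is hypothetical, and the sketch uses `apt-get` where the diff uses `apt`.

```shell
# Hypothetical wrapper illustrating the cleanup pattern: refresh the
# package lists, install the requested packages, then delete the lists
# in the same command chain so they never persist in the layer.
install_and_clean() {
    apt-get update && \
    apt-get install -y "$@" && \
    rm -rf /var/lib/apt/lists/*
}

# Intended use inside one RUN directive (requires root, not executed here):
#   install_and_clean python3-nacl
```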

@climbfuji
Collaborator Author

> Well, this is working for CI but the fms change looks like it has broken the tests. All three build environments are failing in the same way;
>
> [ 22%] Built target test_iodaconv_obserror.x
> Scanning dependencies of target fv3
> [ 22%] Building Fortran object fv3/CMakeFiles/fv3.dir/model/fv_arrays.F90.o
> /workdir/test_root/jedi-bundle/fv3/model/fv_arrays.F90:26:2:
>
>    26 |   use mpp_domains_mod,       only: domain2d
>       |  1~~~~~~~~~~~~~~~
> Fatal Error: fms_platform.h: No such file or directory
> compilation terminated.
> make[2]: *** [fv3/CMakeFiles/fv3.dir/build.make:114: fv3/CMakeFiles/fv3.dir/model/fv_arrays.F90.o] Error 1
> make[1]: *** [CMakeFiles/Makefile2:28576: fv3/CMakeFiles/fv3.dir/all] Error 2
> make[1]: *** Waiting for unfinished jobs....
> [ 22%] Linking CXX executable ../../../bin/print_queries.x
> [ 22%] Built target print_queries.x
> make: *** [Makefile:166: all] Error 2
>
> Here's a direct link to the failures: https://github.com/JCSDA-internal/mpas-jedi/pull/966/checks?check_run_id=24712175023

@shlyaeva Note that @eap has new containers with the updated FMS. We need those for the FV3 dycore update branches.

@eap
Collaborator

eap commented May 9, 2024

@shlyaeva - in order to use the new containers add jedi-ci-next=true to a line in your pull request description.
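
As a rough illustration of the opt-in described here, the following sketch tests whether a pull request description contains the `jedi-ci-next=true` setting. How the JEDI CI system actually parses descriptions is not shown in this thread, so the function name `uses_next_containers` and the matching rule (the setting delimited by whitespace or line boundaries) are assumptions.

```shell
# Hypothetical check: reads a pull request description on stdin and
# succeeds if any line contains the setting jedi-ci-next=true as a
# whitespace-delimited token.
uses_next_containers() {
    grep -Eq '(^|[[:space:]])jedi-ci-next=true([[:space:]]|$)'
}

# Example:
#   printf 'Update dycore\njedi-ci-next=true\n' | uses_next_containers && echo "next containers requested"
```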

@climbfuji
Collaborator Author

> @shlyaeva - in order to use the new containers add jedi-ci-next=true to a line in your pull request description.

I don't think this works. It needs a different branch of jedi-bundle, too.

@climbfuji climbfuji merged commit bbf12a3 into JCSDA:develop May 9, 2024
7 checks passed