Potential Concurrent Access Issues with spack-{packages,config} #59

Open
CodeGat opened this issue Apr 23, 2024 · 5 comments

@CodeGat
Contributor

CodeGat commented Apr 23, 2024

In both /g/data/vk83/apps/spack/*/spack and /g/data/vk83/prerelease/apps/spack/*/spack (although it will be much more pronounced in the latter), we force checkout a potentially different spack-{packages,config} for the installation of a spack.yaml. If multiple builds are happening at the same time, one build could force checkout to a different spack-{packages,config} during the installation of a different build.

At the model level, we can enforce that only one run accesses the Gadi [Prerelease] environment at a time with the concurrency: <environment> directive at the workflow level. But this doesn't help when multiple different models access the same environment: GitHub only supports repo-level concurrency, so there's no way to enforce it across models.

Possibly, we might need to have a spack-{packages,config} per model version. For example:

/g/data/vk83/*/apps/spack/<version>/
├── spack
├── release
├── spack-packages/
│   ├── access-om2/
│   │   ├── 2023.11.23/
│   │   │   └── packages <etc>
│   │   ├── 2024.03.0/
│   │   │   └── packages <etc>
│   │   ├── pr15-1/
│   │   │   └── packages <etc>
│   │   └── pr15-2/
│   │       └── packages <etc>
│   ├── access-om2-bgc/
│   │   └── 2024.03.0/
│   │       └── packages <etc>
│   ├── access-om3/
│   │   └── ...
│   └── access-esm1.5/
│       └── ...
└── spack-config/
    ├── access-om2/
    │   ├── 2023.11.23/
    │   │   ├── common
    │   │   ├── tools
    │   │   ├── v0.20
    │   │   ├── v0.21
    │   │   └── <other contents of spack-config>
    │   ├── 2024.03.0/
    │   │   └── common <etc>
    │   ├── pr15-1/
    │   │   └── common <etc>
    │   └── pr15-2/
    │       └── common <etc>
    ├── access-om2-bgc/
    │   └── 2024.03.0/
    │       └── common <etc>
    ├── access-om3/
    │   └── ...
    └── access-esm1.5/
        └── ...

This would require changes to a bunch of things:

  • spack-config symlinks would be 3 directories deep, rather than 1 (pinging @harshula)
  • spack-packages would also be 3 directories deep (this might have implications for the above, too)
  • CI/CD would have to git -C $spack/../spack-{packages,config}/${{ inputs.model }}/${{ inputs.version }} checkout --force ${{ inputs.spack-{packages,config} }} @CodeGat
  • CI/CD for prerelease deployment cleanup would have to delete the directories that correspond to the model+version that are being cleaned up @CodeGat
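
As a rough sketch of the path handling the CI/CD change above implies, the following Python builds the per-model, per-version checkout directory and the matching git argument list. The function names and the `ref` parameter are hypothetical, chosen for illustration; only the directory layout and the `git -C ... checkout --force` form come from the proposal.

```python
from pathlib import PurePosixPath


def repo_dir(spack_root: str, repo: str, model: str, version: str) -> PurePosixPath:
    """Return <spack_root>/../<repo>/<model>/<version>, per the proposed layout."""
    if repo not in ("spack-packages", "spack-config"):
        raise ValueError(f"unexpected repo: {repo}")
    # spack_root is the 'spack' directory itself; the per-model checkouts
    # live in sibling directories next to it.
    return PurePosixPath(spack_root).parent / repo / model / version


def checkout_command(spack_root: str, repo: str, model: str, version: str, ref: str) -> list:
    """Build the argv for a forced checkout. Nothing is executed here."""
    target = repo_dir(spack_root, repo, model, version)
    return ["git", "-C", str(target), "checkout", "--force", ref]
```

Because each model and version resolves to its own directory, two builds for different models never force checkout the same working tree.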

However, the issue still remains with the symlinks into $spack/etc/spack/ from spack-config. We've separated the repos into concurrently-accessible bits, but it seems that we've just moved the problem to the symlinks in spack-config - we only have one install of spack that can be symlinked to one spack-{packages,config} at a time. If another build changes those symlinks during installation, we're back where we started. Hmm...
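
To make the remaining problem concrete, here is a small, deterministic simulation of it. The directory names and the single `etc-spack` link are stand-ins invented for this sketch (the real links under $spack/etc/spack/ are not reproduced here); the point is only that a shared link can be repointed underneath a build that is still running.

```python
import os
import tempfile


def simulate_symlink_race() -> tuple:
    """Return what build A reads before and after build B repoints the link."""
    with tempfile.TemporaryDirectory() as tmp:
        # Two hypothetical per-model spack-config checkouts.
        for name in ("config-a", "config-b"):
            os.makedirs(os.path.join(tmp, name))
            with open(os.path.join(tmp, name, "marker"), "w") as f:
                f.write(name)

        # One shared link, standing in for the links in the single spack install.
        link = os.path.join(tmp, "etc-spack")
        os.symlink(os.path.join(tmp, "config-a"), link)

        def read_marker() -> str:
            with open(os.path.join(link, "marker")) as f:
                return f.read()

        before = read_marker()  # build A starts with its own config
        os.remove(link)         # build B repoints the shared link...
        os.symlink(os.path.join(tmp, "config-b"), link)
        after = read_marker()   # ...and build A now reads B's config
        return before, after
```

Splitting the repositories per model does not change this outcome, since both builds still resolve their configuration through the same link.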

Pinging @aidanheerdegen because he loves this kind of stuff

@aidanheerdegen
Member

I think we can assume we can always rebuild a given model, as we have the spack.yaml, the spack.lock and the versions of spack, spack-packages and spack-config required to do so.

So we don't need to keep lots of versions of these things around. In fact, they can be entirely ephemeral. So .. we could shallow clone all of them when required into TMP, use them and then clean them up when deployed successfully.

No?
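
A minimal sketch of that idea, assuming the required version is available as a branch or tag name: clone shallowly into a temporary directory, use it, and let the directory disappear afterwards. `ephemeral_clone`, `shallow_clone_command` and the injectable `run` parameter are hypothetical names for illustration, not part of any existing workflow.

```python
import subprocess
import tempfile
from contextlib import contextmanager


def shallow_clone_command(url: str, ref: str, dest: str) -> list:
    """Build the argv for a depth-1 clone of a single ref."""
    # --branch accepts a branch or tag name, not an arbitrary commit SHA.
    return ["git", "clone", "--depth", "1", "--branch", ref, url, dest]


@contextmanager
def ephemeral_clone(url: str, ref: str, run=subprocess.run):
    """Clone into a temporary directory that is deleted when the block exits."""
    with tempfile.TemporaryDirectory(prefix="spack-clone-") as tmp:
        # check=True raises if git fails, so a broken clone is never used.
        run(shallow_clone_command(url, ref, tmp), check=True)
        yield tmp
```

The install would run inside the `with ephemeral_clone(...)` block. As written, the directory is removed whether or not the deployment succeeds; keeping it on failure for debugging would need extra handling.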

Random thought: can we make use of the view in environments to take care of some of this for us?

In the documentation it says:

In addition to being the default location for the view associated with an Environment, the .spack-env directory also contains:

  • repo/: A repo consisting of the Spack packages used in this environment. This allows the environment to build the same, in theory, even on different versions of Spack with different packages!

Could we create the environment on a GitHub runner and then copy that down to gadi with the repo data embedded?

@CodeGat
Contributor Author

CodeGat commented Apr 24, 2024

> So .. we could shallow clone all of them when required into TMP, use them and then clean them up when deployed successfully

That would be fine for spack-packages, but spack-config will still need to have its hooks in the spack install during installation, and those could be modified

@aidanheerdegen
Member

> > So .. we could shallow clone all of them when required into TMP, use them and then clean them up when deployed successfully
>
> That would be fine for spack-packages, but spack-config will still need to have its hooks in the spack install during installation, and those could be modified

It's a good point. My brain hurts.

@CodeGat
Contributor Author

CodeGat commented Apr 26, 2024

During an online meeting:

  • For release: We should stop doing force checkouts on the release spack; it's too error prone and creates the issues above. Instead, keep a config file on Gadi recording the (spack, spack-packages, spack-config) versions that spack is currently using. We should allow people to deploy whatever they like to prerelease, but check that it matches what release uses before they can merge.
  • For prerelease: Have a separate spack install for each (spack version, spack-packages version, spack-config version) combination, each in its own directory. For example, .../prerelease/apps/spack/0.21/2024.04.20-2024.02.31/spack. Tear these down when spack env list has nothing in it.
    @aidanheerdegen and @harshula
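
The two meeting outcomes could be sketched roughly as below. Everything concrete here is an assumption made for illustration: the JSON shape of the release config file, its key names, the `<spack-packages>-<spack-config>` ordering of the combined directory name, and the helper names themselves. Parsing the output of spack env list is deliberately left out; the teardown check takes a list of environment names instead.

```python
import json
from pathlib import PurePosixPath

# Hypothetical key names for the release config file on Gadi.
KEYS = ("spack", "spack-packages", "spack-config")


def matches_release(release_config_text: str, requested: dict) -> bool:
    """True if the requested versions equal those recorded for release.

    Raises KeyError if either side is missing one of KEYS.
    """
    recorded = json.loads(release_config_text)
    return all(recorded[k] == requested[k] for k in KEYS)


def prerelease_spack_dir(root: str, versions: dict) -> PurePosixPath:
    """Return <root>/<spack>/<spack-packages>-<spack-config>/spack."""
    combo = f"{versions['spack-packages']}-{versions['spack-config']}"
    return PurePosixPath(root) / versions["spack"] / combo / "spack"


def can_tear_down(env_names: list) -> bool:
    """A prerelease spack can be removed once it has no environments left."""
    return len(env_names) == 0
```

With the versions from the example above, `prerelease_spack_dir` yields a path ending in `0.21/2024.04.20-2024.02.31/spack`, so builds that need different version combinations never share a spack install.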

@CodeGat
Contributor Author

CodeGat commented May 1, 2024

Will require changes based on ACCESS-NRI/spack-config#30
