Update checks for singularity/apptainer and nextflow environment variables

Improves checks for the environment variables for
singularity/apptainer/nextflow cache and tmp dirs:
- APPTAINER_TMPDIR/SINGULARITY_TMPDIR,
- APPTAINER_CACHEDIR/SINGULARITY_CACHEDIR or
- NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR
Only the nextflow cache is set to a default value if it is not set in the
launch environment (via singularity.cacheDir). Before, this location
was always set and any value set in the env var was ignored.
The other two cannot be overridden in the env scope because that only affects
the task environment variables, not the environment in which nextflow is launched.
See: https://www.nextflow.io/docs/latest/config.html#scope-env
Warnings are issued for the user when any of these variables are not set.
Also clarifies code comments and updates documentation.
pmoris committed Aug 7, 2024
1 parent 5e483a6 commit 915bb0f
Showing 2 changed files with 38 additions and 14 deletions.
42 changes: 33 additions & 9 deletions conf/vsc_calcua.config
@@ -13,13 +13,28 @@ cleanup = true
// Check if APPTAINER_TMPDIR/SINGULARITY_TMPDIR environment variables are set.
// If they are available, try to create the tmp directory at the specified location.
// Skip if host is not CalcUA to avoid hindering github actions.
if ( System.getenv("VSC_INSTITUTE") != "antwerpen" ) {
def apptainer_tmpdir = System.getenv("APPTAINER_TMPDIR") ?: System.getenv("SINGULARITY_TMPDIR") ?: null
if (! apptainer_tmpdir ) {

// Check if environment variables for singularity/apptainer/nextflow cache and tmp dirs are set:
// - APPTAINER_TMPDIR/SINGULARITY_TMPDIR,
// - APPTAINER_CACHEDIR/SINGULARITY_CACHEDIR or
// - NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR
// Define variables outside of conditional scope to make them usable elsewhere
def apptainer_tmpdir = System.getenv("APPTAINER_TMPDIR") ?: System.getenv("SINGULARITY_TMPDIR") ?: null
def apptainer_cachedir = System.getenv("APPTAINER_CACHEDIR") ?: System.getenv("SINGULARITY_CACHEDIR") ?: null
def nxf_apptainer_cachedir = System.getenv("NXF_APPTAINER_CACHEDIR") ?: System.getenv("NXF_SINGULARITY_CACHEDIR") ?: null
// Skip check if host is not CalcUA, to avoid hindering github actions.
if ( System.getenv("VSC_INSTITUTE") == "antwerpen" ) {
// APPTAINER_TMPDIR/SINGULARITY_TMPDIR environment variable
if ( !apptainer_tmpdir ) {
def tmp_dir = System.getenv("TMPDIR") ?: "/tmp"
System.err.println("\nWARNING: APPTAINER_TMPDIR/SINGULARITY_TMPDIR environment variable was not found.\nPlease add the line 'export APPTAINER_TMPDIR=\"\${VSC_SCRATCH}/apptainer/tmp\"' to your ~/.bashrc file (or set it with sbatch or in your job script).\nDefaulting to local $tmp_dir on the execution node of the Nextflow head process.\n")
// TODO: check if images stored there can be accessed by slurm jobs on other nodes
} else {
// if set, try to create the tmp directory at the specified location to avoid errors during
// docker image conversion (note that this only happens when no native singularity/apptainer
// images are available):
// FATAL: While making image from oci registry: error fetching image to cache: while
// building SIF from layers: unable to create new build: failed to create build parent dir:
// stat /scratch/antwerpen/203/vsc20380/apptainer/tmp: no such file or directory
apptainer_tmpdir = new File(apptainer_tmpdir)
if (! apptainer_tmpdir.exists() ) {
try {
@@ -29,6 +44,16 @@ if ( System.getenv("VSC_INSTITUTE") != "antwerpen" ) {
}
}
}
// APPTAINER_CACHEDIR/SINGULARITY_CACHEDIR
if ( !apptainer_cachedir ) {
System.err.println("\nWARNING: APPTAINER_CACHEDIR/SINGULARITY_CACHEDIR environment variable was not found.\nPlease add the line 'export APPTAINER_CACHEDIR=\"\${VSC_SCRATCH}/apptainer/cache\"' to your ~/.bashrc file (or set it with sbatch or in your job script).\nUsing the default storage location of Singularity/Apptainer ~/.apptainer/cache/. Read more about why this should be avoided in the VSC docs: https://docs.vscentrum.be/software/singularity.html#building-on-vsc-infrastructure\n")
// TODO: optional exit out here.
}
// NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR
if ( !nxf_apptainer_cachedir ) {
nxf_apptainer_cachedir = "$scratch_dir/apptainer/nextflow_cache"
System.err.println("\nWARNING: NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR environment variable was not found.\nPlease add the line 'export NXF_APPTAINER_CACHEDIR=\"\${VSC_SCRATCH}/apptainer/nextflow_cache\"' to your ~/.bashrc file (or set it with sbatch or in your job script).\nDefaulting to $nxf_apptainer_cachedir instead of the nextflow work directory.\n")
}
}

// Function to check if the selected partition profile matches the partition on which the master
@@ -73,19 +98,18 @@ process {
}

// Specify that apptainer/singularity should be used and where the cache dir will be for the images.
// The singularity directive is used in favour of the apptainer one, because currently the apptainer
// Singularity is used in favour of apptainer, because currently the apptainer
// variant will pull in (and convert) docker images, instead of using pre-built singularity ones.
// To use the pre-built singularity containers instead, the singularity options should be selected
// with apptainer installed on the system, which defines singularity as an alias (as is the case
// on CalcUA).
// On a system where singularity is defined as an alias for apptainer (as is the case on CalcUA),
// this works out fine and results in pre-built singularity containers being downloaded.
// See https://nf-co.re/docs/usage/installation#pipeline-software
// and https://nf-co.re/tools#how-the-singularity-image-downloads-work
// See https://www.nextflow.io/docs/latest/config.html#scope-singularity
singularity {
enabled = true
autoMounts = true
// See https://www.nextflow.io/docs/latest/singularity.html#singularity-docker-hub
cacheDir = "$scratch_dir/apptainer/nextflow_cache" // Equivalent to setting NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR environment variable
cacheDir = "$nxf_apptainer_cachedir" // Equivalent to setting NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR environment variable
}

// Define profiles for the following partitions:
10 changes: 5 additions & 5 deletions docs/vsc_calcua.md
@@ -144,7 +144,7 @@ By default, Nextflow stores all of the intermediate files required to run the pi

If the run does not complete successfully then the `work` directory should be removed manually to save storage space. The default work directory is set to `$VSC_SCRATCH/work` per this configuration. You can also use the [`nextflow clean` command](https://www.nextflow.io/docs/latest/cli.html#clean) to clean up all files related to a specific run (including not just the `work` directory, but also log files and the `.nextflow` cache directory).

> **NB:** The Nextflow `work` directory for any pipeline is located in `$VSC_SCRATCH` by default and is cleaned automatically after a success pipeline run, unless the `debug` profile is provided.
> **NB:** The Nextflow `work` directory for any pipeline is located in `$VSC_SCRATCH/work` by default and is cleaned automatically after a successful pipeline run, unless the `debug` profile is provided.
### Debug mode

@@ -233,14 +233,14 @@ The `nextflow run ...` command that launches the head process, can be invoked ei
[Apptainer](https://apptainer.org/) is an open-source fork of [Singularity](https://sylabs.io/singularity/), which is an alternative container runtime to Docker. It is more suitable to usage on HPCs because it can be run without root privileges and does not use a dedicated daemon process. More info on the usage of Apptainer/Singularity on the VSC HPC can be found [here](https://docs.vscentrum.be/software/singularity.html).

When executing Nextflow pipelines using Apptainer/Singularity, the container image files will by default be cached inside the pipeline work directory. The CalcUA config profile instead sets the [singularity.cacheDir setting](https://www.nextflow.io/docs/latest/singularity.html#singularity-docker-hub) to a central location on your scratch space (`$VSC_SCRATCH/apptainer/nextflow_cache`), in order to reuse them between different pipelines. This is equivalent to setting the `NXF_APPTAINER_CACHEDIR`/`NXF_SINGULARITY_CACHEDIR` environment variables manually (but note that the `cacheDir` defined in the config file takes precedence and cannot be overwritten by setting the environment variable).
When executing Nextflow pipelines using Apptainer/Singularity, the container image files will by default be cached inside the pipeline work directory (which for the `vsc_calcua` config would by default be set to `$VSC_SCRATCH/work`). The CalcUA config profile instead sets the [singularity.cacheDir setting](https://www.nextflow.io/docs/latest/singularity.html#singularity-docker-hub) to a central location on your scratch space (`$VSC_SCRATCH/apptainer/nextflow_cache`), in order to reuse them between different pipelines even when cleaning the work directory. If the `NXF_APPTAINER_CACHEDIR`/`NXF_SINGULARITY_CACHEDIR` environment variables are set manually, they will take precedence over this default setting.
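
The precedence described above can be sketched in shell as follows. This is only an illustration of the lookup order, not the config's actual code (which does the same thing in Groovy): the helper name `resolve_nxf_cachedir` is hypothetical, and it assumes that the config's `scratch_dir` corresponds to `$VSC_SCRATCH`.

```shell
# Hypothetical helper: print the image cache location that would be used.
resolve_nxf_cachedir() {
    # An explicitly set variable wins; the Apptainer name is checked first,
    # then the Singularity name. Empty values count as "not set".
    dir="${NXF_APPTAINER_CACHEDIR:-${NXF_SINGULARITY_CACHEDIR:-}}"
    if [ -z "$dir" ]; then
        # Assumption: scratch_dir in the config corresponds to $VSC_SCRATCH.
        dir="${VSC_SCRATCH}/apptainer/nextflow_cache"
        echo "WARNING: NXF_APPTAINER_CACHEDIR/NXF_SINGULARITY_CACHEDIR not set, defaulting to $dir" >&2
    fi
    printf '%s\n' "$dir"
}
```

Calling `resolve_nxf_cachedir` in the shell from which Nextflow is launched shows which location would apply, with the warning going to stderr when the default is used.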

Apptainer/Singularity makes use of two additional environment variables, `APPTAINER_CACHEDIR`/`SINGULARITY_CACHEDIR` and `APPTAINER_TMPDIR`/`SINGULARITY_TMPDIR`. As recommended by the [VSC documentation on containers](https://docs.vscentrum.be/software/singularity.html#building-on-vsc-infrastructure), these should be set to a location on the scratch system, to avoid exceeding the quota on your home directory file system.

> **NB:** The cachedir and tmpdir are only used when new images are built or converted from existing docker images. For most nf-core pipelines this does not happen, since they will instead try to directly pull pre-built singularity images from [Galaxy Depot](https://depot.galaxyproject.org/singularity/)
- The [cache directory](https://apptainer.org/docs/user/main/build_env.html#cache-folders) `APPTAINER_CACHEDIR`/`SINGULARITY_CACHEDIR` is used to store files and layers used during image creation (or conversion of Docker/OCI images). Its default location is `$HOME/.apptainer/cache`, but it is recommended to change this to `$VSC_SCRATCH/apptainer/cache` on the CalcUA HPC instead.
- The [temporary directory](https://apptainer.org/docs/user/main/build_env.html#temporary-folders) `APPTAINER_TMPDIR`/`SINGULARITY_TMPDIR` is used to store temporary files when building an image (or converting a Docker/OCI source). The directory must have enough free space to hold the entire uncompressed image during all steps of the build process. Its default location is `/tmp`, but it is recommended to change this to `$VSC_SCRATCH/apptainer/tmp` on the CalcUA HPC instead. The reason being that the default `/tmp` would refer to a directory on the the compute node running the master nextflow process, which are [small SSDs on CalcUA](https://docs.vscentrum.be/antwerp/tier2_hardware/uantwerp_storage.html).
- The [cache directory](https://apptainer.org/docs/user/main/build_env.html#cache-folders) `APPTAINER_CACHEDIR`/`SINGULARITY_CACHEDIR` is used to store files and layers used during image creation (or conversion of Docker/OCI images). Its default location is `$HOME/.apptainer/cache`, but we recommend changing it to `$VSC_SCRATCH/apptainer/cache` (or another location in scratch) on the CalcUA HPC instead, to avoid exceeding the quota in the home file system.
- The [temporary directory](https://apptainer.org/docs/user/main/build_env.html#temporary-folders) `APPTAINER_TMPDIR`/`SINGULARITY_TMPDIR` is used to store temporary files when building an image (or converting a Docker/OCI source). The directory must have enough free space to hold the entire uncompressed image during all steps of the build process. Its default location is `/tmp` (or more accurately, `$TMPDIR` in the environment of the nextflow head process), but we recommend changing it to `$VSC_SCRATCH/apptainer/tmp` (or another location in scratch) on the CalcUA HPC instead. The reason is that the default `/tmp` refers to a directory on the compute node running the master nextflow process, and these node-local directories are [small SSDs on CalcUA](https://docs.vscentrum.be/antwerp/tier2_hardware/uantwerp_storage.html) that could fill up.

> **NB:** The tmp directory needs to be created manually beforehand, otherwise pipelines that need to pull in and convert docker images, or the manual building of images yourself, will fail.
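
As a minimal sketch of this manual step, the snippet below exports both variables and creates the directories if they do not exist yet. The helper name `setup_apptainer_dirs` is hypothetical and not part of this config; the sub-paths follow the recommendations in this section.

```shell
# Hypothetical helper: point Apptainer at <base>/apptainer and create the dirs.
# $1 is the base directory, e.g. your scratch space.
setup_apptainer_dirs() {
    base="$1/apptainer"
    export APPTAINER_CACHEDIR="$base/cache"
    export APPTAINER_TMPDIR="$base/tmp"
    # -p creates missing parent directories and succeeds if they already exist.
    mkdir -p "$APPTAINER_CACHEDIR" "$APPTAINER_TMPDIR"
}
```

It could be called as `setup_apptainer_dirs "$VSC_SCRATCH"`, for instance from `~/.bashrc` or at the top of a job script, before the Nextflow head process starts.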
@@ -252,7 +252,7 @@ These two variables can be set in several different ways:
- Passed to `sbatch` as a parameter or on a `#SBATCH` line in the job script (e.g., `--export=APPTAINER_CACHEDIR=${VSC_SCRATCH}/apptainer/cache,APPTAINER_TMPDIR=${VSC_SCRATCH}/apptainer/tmp`).
- Directly in your job script (e.g., `export APPTAINER_CACHEDIR=${VSC_SCRATCH}/apptainer/cache APPTAINER_TMPDIR=${VSC_SCRATCH}/apptainer/tmp`).

However, note that for the `.bashrc` option to work, the environment need to be passed on to the slurm jobs. Currently, this seems to happen by default (i.e., variables defined in `~/.bashrc` are propagated), but there exist methods to enforce this more strictly. E.g., job scripts that start with `#!/bin/bash -l`, will ensure that jobs [launch using your login environment](https://docs.vscentrum.be/leuven/slurm_specifics.html#job-shell). Similarly, the `sbatch` options `[--get-user-env`](https://slurm.schedmd.com/sbatch.html#OPT_get-user-env) or [`--export=`](https://slurm.schedmd.com/sbatch.html#OPT_export) can be used. Also [see the CalcUA-specific](https://docs.vscentrum.be/jobs/slurm_pbs_comparison.html#main-differences-between-slurm-and-torque) and the [general VSC documentation for more info](https://docs.vscentrum.be/jobs/job_submission.html#the-job-environment).
However, note that for the `.bashrc` option to work, the environment needs to be passed on to the slurm jobs. Currently, this seems to happen by default (i.e., variables defined in `~/.bashrc` are propagated, as per [the VSC docs](https://docs.vscentrum.be/leuven/slurm_specifics.html#environment-propagation)), but there are also methods to enforce this more strictly. E.g., job scripts that start with `#!/bin/bash -l` will ensure that jobs [launch using your login environment](https://docs.vscentrum.be/leuven/slurm_specifics.html#job-shell). Similarly, the `sbatch` options [`--get-user-env`](https://slurm.schedmd.com/sbatch.html#OPT_get-user-env) or [`--export=`](https://slurm.schedmd.com/sbatch.html#OPT_export) can be used. Also [see the CalcUA-specific](https://docs.vscentrum.be/jobs/slurm_pbs_comparison.html#main-differences-between-slurm-and-torque) and the [general VSC documentation for more info](https://docs.vscentrum.be/jobs/job_submission.html#the-job-environment).

Lastly, note that this config file currently uses the Singularity engine rather than the Apptainer one (see [`singularity` directive: `enabled = true`](https://www.nextflow.io/docs/latest/config.html#scope-singularity)). The reason is that, for the time being, using the apptainer engine in nf-core pipelines will result in docker images being pulled and converted to apptainer ones, rather than making use of pre-built singularity images (see [nf-core documentation](https://nf-co.re/docs/usage/installation#pipeline-software)). Conversely, when making use of the singularity engine, pre-built images are downloaded and Apptainer will still be used in the background for running these, since the installation of `apptainer` will by default create an alias for `singularity` (and this is also the case on CalcUA).
