Adding support for HTCondor pools without shared filesystems #14

Open
jhiemstrawisc opened this issue Apr 17, 2024 · 0 comments

Many HTCondor pools don't have a shared filesystem, but this plugin relies on one.

I started looking into how the plugin constructs its submit file, and I suspect a good starting point toward relaxing its dependence on a shared filesystem would be to remove a few absolute paths and rely on HTCondor's file transfer mechanism to make these files available in each job's scratch directory on the execution point.

In particular, I noticed several absolute file paths in the submit file's arguments parameter:

-m snakemake --snakefile /access/point/path/to/Snakefile --target-jobs 'log_parameters:algorithm=omicsintegrator1,params=params-PU62FNV' --allowed-rules 'log_parameters' --cores 1 --attempt 1 --force-use-threads  --wait-for-files '/access/point/path/to/.snakemake/tmp.nej38zse' --force --target-files-omit-workdir-adjustment --keep-storage-local-copies --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --rerun-triggers input software-env code mtime params --conda-frontend mamba --shared-fs-usage sources software-deployment persistence storage-local-copies input-output source-cache --wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ --configfiles /access/point/path/to/example_config.yaml --latency-wait 5 --scheduler ilp --local-storage-prefix .snakemake/storage --scheduler-solver-path /access/point/path/to/miniconda3/envs/spras/bin --default-resources base64//dG1wZGlyPXN5c3RlbV90bXBkaXI= --mode remote

I don't know Snakemake well enough to know what all of these arguments do, but files like the Snakefile and example_config.yaml could be provided to each job at the execution point (EP) by modifying the submit_dict to contain something like:

executable = python
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = /access/point/path/to/Snakefile, /access/point/path/to/input, /access/point/path/to/example_config.yaml, ...
arguments = "-m snakemake --snakefile Snakefile ... --configfiles example_config.yaml ... "

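When HTCondor transfers these files to the execution point, it will flatten them into the job's scratch directory, dropping the access point's directory prefix. That is why the arguments above can refer to each file by its bare name.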
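To make the idea concrete, here is a minimal Python sketch of the rewrite described above. It is not the plugin's actual code: the function name `build_transfer_submit` and the `PATH_FLAGS` set are hypothetical, and I am assuming that only `--snakefile` and `--configfiles` need handling, which is almost certainly incomplete. It takes the Snakemake argument string, collects absolute paths that follow those flags into `transfer_input_files`, and replaces them with their base names.

```python
import posixpath
import shlex

# Hypothetical: flags whose values are paths on the access point.
# Other flags in the real command line probably need handling too.
PATH_FLAGS = {"--snakefile", "--configfiles"}


def build_transfer_submit(arguments, path_flags=PATH_FLAGS):
    """Return submit-file entries that rely on HTCondor file transfer.

    Absolute paths following one of ``path_flags`` are queued for
    transfer and replaced by their base name, since transferred files
    land at the top level of the job's scratch directory.
    """
    rewritten = []
    transfer = []
    current_flag = None
    for token in shlex.split(arguments):
        if token.startswith("--"):
            # Remember the most recent long option so that every value
            # following it (there may be several) is treated the same way.
            current_flag = token
        elif current_flag in path_flags and posixpath.isabs(token):
            transfer.append(token)
            token = posixpath.basename(token)
        rewritten.append(token)

    return {
        "executable": "python",
        "should_transfer_files": "YES",
        "when_to_transfer_output": "ON_EXIT",
        "transfer_input_files": ", ".join(transfer),
        # Joined with shell-style quoting for illustration only;
        # HTCondor's own argument quoting rules are not handled here.
        "arguments": shlex.join(rewritten),
    }


if __name__ == "__main__":
    example = (
        "-m snakemake --snakefile /access/point/path/to/Snakefile "
        "--cores 1 --configfiles /access/point/path/to/example_config.yaml "
        "--mode remote"
    )
    for key, value in build_transfer_submit(example).items():
        print(f"{key} = {value}")
```

One caveat with this approach: because the files are flattened, two inputs with the same base name in different directories would collide in the scratch directory, so a real implementation would need to detect or avoid that.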

What other blockers are there to making something like this work? Perhaps one route to consider is making an HTCondor storage provider plugin that sets some of this up?
