Running THOR on Slurm using the Slurm batch script

nabajour edited this page Jul 20, 2020 · 2 revisions

To run THOR on Slurm without writing a batch file for every run, use the Python script tools/slurm_batch_run.py, which can submit multiple consecutive jobs.

First, create its config file at ~/.config/THOR-gcm/slurm.cfg with the following content:

[DEFAULTS]
# working directory where slurm is run from
working_dir = /path/to/thor/directory/
# email address to send slurm report to
user_email = [email protected]
# directory where logs are written
log_dir = /path/to/log/dir
# slurm resource request
gpu_key = gpu:gtx1080ti:1
# slurm partition
partition = gpu

Replace the values with what you need.
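
Before submitting anything, it can help to check that no key is missing from the config file. The following is a minimal sketch, not part of THOR: it assumes the file is plain INI with a [DEFAULTS] section holding the five keys from the template, and `missing_keys` is a hypothetical helper name.

```python
import configparser

# Assumption: these are the five keys shown in the template, all in [DEFAULTS].
REQUIRED_KEYS = ("working_dir", "user_email", "log_dir", "gpu_key", "partition")


def missing_keys(cfg_text, section="DEFAULTS"):
    """Return the required keys that are absent from `section` in cfg_text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        # No usable section at all (for example, every line is commented out).
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(section, key)]
```

To check the real file, pass its text, for example `missing_keys(pathlib.Path("~/.config/THOR-gcm/slurm.cfg").expanduser().read_text())`. An empty list means every expected key is present.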

Then, you can launch a job with:

$ python tools/slurm_batch_run.py -jn <job name> -n <number of jobs> -o <output directory> <config file>

<job name>: the name under which the job appears in emails and in the job queue.

<number of jobs>: the number of consecutive jobs to submit. Slurm stops jobs once they reach a time limit (23:00, in our case), so longer runs need multiple consecutive jobs, which lets THOR resume its work.

<output directory>: where the data gets written.

<config file>: the configuration file for the THOR simulation.

For example:

$ python tools/slurm_batch_run.py -jn alf_wasp -n 3 -o ../thor-data/alf_wasp ifile/alf_wasp_alf.thr
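
To illustrate the idea of consecutive jobs, here is a rough sketch of chaining submissions with Slurm job dependencies. It is not necessarily how tools/slurm_batch_run.py works internally. `sbatch_args` and `submit_chain` are hypothetical helpers, `run_cmd` stands in for whatever command resumes the simulation, and passing the `gpu_key` value to `--gres` is an assumption.

```python
import subprocess


def sbatch_args(job_name, run_cmd, prev_job_id=None,
                partition="gpu", gres="gpu:gtx1080ti:1"):
    """Build the sbatch command line for one link of the chain."""
    args = [
        "sbatch",
        "--parsable",  # print only the job id on submission
        f"--job-name={job_name}",
        f"--partition={partition}",
        f"--gres={gres}",
    ]
    if prev_job_id is not None:
        # Start only after the previous job has ended, whatever its exit
        # state: a job stopped at the time limit does not end successfully.
        args.append(f"--dependency=afterany:{prev_job_id}")
    args.append(f"--wrap={run_cmd}")
    return args


def submit_chain(job_name, n_jobs, run_cmd, **kwargs):
    """Submit n_jobs consecutive jobs and return their ids in order."""
    job_ids = []
    prev = None
    for _ in range(n_jobs):
        result = subprocess.run(
            sbatch_args(job_name, run_cmd, prev, **kwargs),
            check=True, capture_output=True, text=True,
        )
        # With --parsable, sbatch prints "jobid" or "jobid;cluster".
        prev = result.stdout.strip().split(";")[0]
        job_ids.append(prev)
    return job_ids
```

Each job waits in the queue until the one before it has finished, so only one job of the chain runs at a time.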

Displaying the Slurm job queue

You can display the state of your job queue with

$ squeue -u <username>

Or, if you want more information and more space to see the full job name:

$ squeue -u <username> -o "%.18i %.9P %40j %.8u %.2t %.10M %.6D %R %C %e %E %X"
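
If you want to inspect the queue from a script, one possible approach is to ask squeue for a delimiter-separated format and parse it. This is only a sketch: `parse_squeue` and `my_jobs` are hypothetical helpers, and the choice of fields (job id, name, state, elapsed time) is an assumption about what is useful.

```python
import getpass
import subprocess

FIELDS = ("job_id", "name", "state", "elapsed")
SQUEUE_FORMAT = "%i|%j|%T|%M"  # one '|'-separated line per job


def parse_squeue(text):
    """Turn '|'-separated squeue output into a list of dicts, one per job.

    Assumes job names do not themselves contain '|'.
    """
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        jobs.append(dict(zip(FIELDS, line.split("|"))))
    return jobs


def my_jobs(user=None):
    """Query squeue for one user's jobs (defaults to the current user)."""
    user = user or getpass.getuser()
    result = subprocess.run(
        ["squeue", "-h", "-u", user, "-o", SQUEUE_FORMAT],  # -h: no header
        check=True, capture_output=True, text=True,
    )
    return parse_squeue(result.stdout)
```

Calling `my_jobs()` on a machine with Slurm installed returns one dictionary per queued or running job.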