Running THOR on slurm using the slurm batch script
To run THOR on slurm without writing a batch file for every run, you can use a Python script that submits multiple consecutive jobs.
First, create its config file at ~/.config/THOR-gcm/slurm.cfg with the following contents:
# [DEFAULTS]
# # working directory where slurm is run from
# working_dir = /path/to/thor/directory/
# # email address to send slurm report to
# user_email = [email protected]
# # where to log data in
# log_dir = /path/to/log/dir
# # slurm resource request
# gpu_key = gpu:gtx1080ti:1
# # slurm partition
# partition = gpu
Replace the values with what you need.
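To check the file before submitting anything, a short Python sketch like the one below may help. It rests on assumptions not stated on this page: that the file is a standard INI file readable with Python's `configparser`, and that the `[DEFAULTS]` header and keys are written without the leading `# ` shown in the listing above. The helper name `load_slurm_defaults` and the list of required keys are hypothetical and are not part of THOR.

```python
import configparser

# Example contents mirroring the keys listed above, with the leading "# "
# markers removed (an assumption about the intended file layout).
EXAMPLE_CFG = """
[DEFAULTS]
# working directory where slurm is run from
working_dir = /path/to/thor/directory/
# where to log data in
log_dir = /path/to/log/dir
# slurm resource request
gpu_key = gpu:gtx1080ti:1
# slurm partition
partition = gpu
"""

# Hypothetical list of keys this sketch insists on; adjust as needed.
REQUIRED_KEYS = ("working_dir", "log_dir", "gpu_key", "partition")


def load_slurm_defaults(text):
    """Parse INI text and return the [DEFAULTS] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["DEFAULTS"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError(f"missing keys in [DEFAULTS]: {missing}")
    return dict(section)


if __name__ == "__main__":
    for key, value in load_slurm_defaults(EXAMPLE_CFG).items():
        print(f"{key} = {value}")
```

To check a real file, read it from disk and pass its text to `load_slurm_defaults`.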
Then, you can launch a job with:
$ python tools/slurm_batch_run.py -jn <job name> -n <number of jobs> -o <output directory> <config file>
<job name>: the name shown in the emails and in the job queue.
<number of jobs>: the number of consecutive jobs to launch. Slurm stops a job once it reaches the time limit (23:00, in our case), so longer runs need several consecutive jobs, each letting THOR resume its work where the previous one stopped.
<output directory>: where the data gets written.
<config file>: the configuration file for the THOR simulation.
For example:
$ python tools/slurm_batch_run.py -jn alf_wasp -n 3 -o ../thor-data/alf_wasp ifile/alf_wasp_alf.thr
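One common way to make jobs run one after another on slurm is to submit each with a dependency on the previous job's ID. The sketch below only builds the `sbatch` command lines and does not submit anything. It is not taken from `tools/slurm_batch_run.py`, which may work differently; the function name `build_sbatch_command`, the batch script name `run_thor.sh`, and the job IDs are hypothetical. The `sbatch` options used (`--parsable`, `--job-name`, `--partition`, `--gres`, `--dependency=afterany:<id>`) are standard ones.

```python
def build_sbatch_command(job_name, partition, gpu_key, script, previous_job_id=None):
    """Return the sbatch argument list for one job in a chain.

    If previous_job_id is given, the job waits until that job has ended
    (whatever its exit status) before it starts.
    """
    cmd = [
        "sbatch",
        "--parsable",  # makes sbatch print the job ID in a machine-readable form
        f"--job-name={job_name}",
        f"--partition={partition}",
        f"--gres={gpu_key}",
    ]
    if previous_job_id is not None:
        cmd.append(f"--dependency=afterany:{previous_job_id}")
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Made-up job IDs stand in for what sbatch would return on a real cluster.
    previous = None
    for fake_job_id in (1001, 1002, 1003):
        cmd = build_sbatch_command(
            "alf_wasp", "gpu", "gpu:gtx1080ti:1", "run_thor.sh", previous
        )
        print(" ".join(cmd))
        previous = fake_job_id
```

On a real cluster, each command would be run (for example with `subprocess.run`) and the job ID read from its output before building the next one.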
You can display the state of your job queue with:
$ squeue -u <username>
Or, for more detail and a column wide enough to show the full job name:
$ squeue -u <username> -o "%.18i %.9P %40j %.8u %.2t %.10M %.6D %R %C %e %E %X"