SEML - Slurm Experiment Management Library.
Usage:
$ seml [OPTIONS] COLLECTION COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Arguments:
COLLECTION
: The name of the database collection to use. [required]
Options:
--migration-skip
: Skip the migration of the database collection.
--migration-backup
: Backup the database collection before migration.
-v, --verbose
: Whether to print debug messages.
-V, --version
: Print the version number.
--install-completion
: Install completion for the current shell.
--show-completion
: Show completion for the current shell, to copy it or customize the installation.
--help
: Show this message and exit.
Commands:
add
: Add experiments to the database as defined...
cancel
: Cancel the Slurm job/job step...
claim-experiment
: Claim an experiment from the database.
clean-db
: Remove orphaned artifacts in the DB from...
clean-jobs
: Cancel empty pending jobs.
configure
: Configure SEML (database, argument...
delete
: Delete experiments by ID or state (cancels...
description
: Manage descriptions of the experiments in...
detect-duplicates
: Prints duplicate experiment configurations.
detect-killed
: Detect experiments where the corresponding...
download-sources
: Download source files from the database to...
drop
: Drop collections from the database.
hold
: Hold queued experiments via SLURM.
launch-worker
: Launch a local worker that runs PENDING jobs.
list
: Lists all collections in the database.
prepare-experiment
: Fetch experiment from database, prepare it...
print-command
: Print the commands that would be executed...
print-experiment
: Print the experiment document.
print-fail-trace
: Prints fail traces of all failed experiments.
print-output
: Print the output of experiments.
project
: Setting up new projects.
queue
: Prints the collections of the given job IDs.
release
: Release held experiments via SLURM.
reload-sources
: Reload stashed source files.
reset
: Reset the state of experiments by setting...
start
: Fetch staged experiments from the database...
start-jupyter
: Start a Jupyter slurm job.
status
: Report status of experiments in the...
update-working-dir
: Change the working directory of...
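The usage line above allows several commands to be chained after the collection name. As a minimal sketch, the following Python snippet assembles such an invocation as an argument list and prints the equivalent shell command without running it. The collection name `my_collection` and the file name `config.yaml` are placeholders chosen for illustration.

```python
import shlex

# Placeholder values for illustration only.
collection = "my_collection"
config_file = "config.yaml"

# COLLECTION comes first, then COMMAND1 with its arguments, then COMMAND2.
argv = ["seml", collection, "add", config_file, "start"]

# Print the shell form; to execute it, pass argv to subprocess.run(argv, check=True).
print(shlex.join(argv))  # seml my_collection add config.yaml start
```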
Add experiments to the database as defined in the configuration.
Usage:
$ seml add [OPTIONS] CONFIG_FILES...
Arguments:
CONFIG_FILES...
: Path to the YAML configuration file for the experiment. [required]
Options:
-nh, --no-hash
: By default, we use the hash of the config dictionary to filter out duplicates (instead of comparing all dictionary values individually). Only disable hashing if you have a good reason, as hashing is faster.
-ncs, --no-sanity-check
: Disable this if the check fails unexpectedly when using advanced Sacred features or to accelerate adding.
-ncc, --no-code-checkpoint
: Disable this if you want your experiments to use the current code instead of the code at the time of adding.
-f, --force
: Force adding the experiment even if it already exists in the database.
-o, --overwrite-params DICT
: Dictionary (passed as a string, e.g. '{"epochs": 100}') to overwrite parameters in the config.
-d, --description TEXT
: A description for the experiment.
--no-resolve-descriptions
: Whether to prevent using omegaconf to resolve experiment descriptions.
--help
: Show this message and exit.
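Options of type DICT take the dictionary as a single string, which is easy to get wrong by hand because of nested quotes. The sketch below builds the `--overwrite-params` value with `json.dumps` and lets `shlex` handle the shell quoting. It assumes the JSON form shown in the option help is accepted. The collection name, config file, parameter, and description text are illustrative placeholders.

```python
import json
import shlex

# Illustrative override; use parameter names from your own configuration.
overrides = {"epochs": 100}

argv = [
    "seml", "my_collection", "add", "config.yaml",
    "--overwrite-params", json.dumps(overrides),  # -> {"epochs": 100}
    "--description", "baseline with more epochs",
]

# shlex.join wraps arguments containing spaces or quotes in single quotes.
print(shlex.join(argv))
```

The printed line can be pasted into a shell as-is. Passing `argv` to `subprocess.run` avoids shell quoting altogether.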
Cancel the Slurm job/job step corresponding to experiments, filtered by ID or state.
Usage:
$ seml cancel [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: PENDING, RUNNING]
-w, --wait
: Wait until all jobs are properly cancelled.
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
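The filter options above select which experiments a command acts on. As a hedged sketch, this snippet cancels only the experiments whose configuration matches a filter dictionary, leaving the state filter at its documented default (PENDING, RUNNING). The collection name and the filter value are placeholders; the dotted key follows the example in the option help.

```python
import json
import shlex

# The dotted key addresses a nested field of the experiment document,
# as in the option help's example.
filter_dict = {"config.dataset": "cora_ml"}

argv = [
    "seml", "my_collection", "cancel",
    "--filter-dict", json.dumps(filter_dict),
    "--wait",  # block until the jobs are properly cancelled
]

print(shlex.join(argv))
```

Because `--sacred-id` takes precedence over other filters, it is left out here so that the dictionary filter is the one that applies.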
Claim an experiment from the database.
Usage:
$ seml claim-experiment [OPTIONS] SACRED_IDS...
Arguments:
SACRED_IDS...
: Sacred IDs (_id in the database collection) of the experiments to claim. [required]
Options:
--help
: Show this message and exit.
Remove orphaned artifacts in the DB from runs which have been deleted.
Usage:
$ seml clean-db [OPTIONS]
Options:
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
Cancel empty pending jobs.
Usage:
$ seml clean-jobs [OPTIONS] SACRED_IDS...
Arguments:
SACRED_IDS...
: Sacred IDs (_id in the database collection) of the experiments to claim. [required]
Options:
--help
: Show this message and exit.
Configure SEML (database, argument completion, ...).
Usage:
$ seml configure [OPTIONS]
Options:
--host TEXT
: The host of the MongoDB server.
--port INTEGER
: The port of the MongoDB server.
--database TEXT
: The name of the MongoDB database to use.
--username TEXT
: The username for the MongoDB server.
--password TEXT
: The password for the MongoDB server.
-sf, --ssh-forward
: Configure SSH forwarding settings for MongoDB.
--help
: Show this message and exit.
Delete experiments by ID or state (cancels Slurm jobs first if not --no-cancel).
Usage:
$ seml delete [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: STAGED, QUEUED, FAILED, KILLED, INTERRUPTED]
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-nc, --no-cancel
: Do not cancel the experiments before deleting them.
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
Manage descriptions of the experiments in a collection.
Usage:
$ seml description [OPTIONS] COMMAND [ARGS]...
Options:
--help
: Show this message and exit.
Commands:
delete
: Deletes the description of experiment(s).
list
: Lists the descriptions of all experiments.
set
: Sets the description of experiment(s).
Deletes the description of experiment(s).
Usage:
$ seml description delete [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
Lists the descriptions of all experiments.
Usage:
$ seml description list [OPTIONS]
Options:
-u, --update-status
: Whether to update the status of experiments in the database. This can take a while for large collections. Use only if necessary.
--help
: Show this message and exit.
Sets the description of experiment(s).
Usage:
$ seml description set [OPTIONS] DESCRIPTION
Arguments:
DESCRIPTION
: The description to set. [required]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes
: Automatically confirm all dialogues with yes.
--no-resolve-descriptions
: Whether to prevent using omegaconf to resolve experiment descriptions.
--help
: Show this message and exit.
Prints duplicate experiment configurations.
Usage:
$ seml detect-duplicates [OPTIONS]
Options:
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: STAGED, QUEUED, FAILED, KILLED, INTERRUPTED]
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help
: Show this message and exit.
Detect experiments where the corresponding Slurm jobs were killed externally.
Usage:
$ seml detect-killed [OPTIONS]
Options:
--help
: Show this message and exit.
Download source files from the database to the provided path.
Usage:
$ seml download-sources [OPTIONS] TARGET_DIRECTORY
Arguments:
TARGET_DIRECTORY
: The directory where the source files should be restored. [required]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help
: Show this message and exit.
Drop collections from the database.
Note: This is a dangerous operation and should only be used if you know what you are doing.
Usage:
$ seml drop [OPTIONS] [PATTERN]
Arguments:
[PATTERN]
: A regex that must match the collections to print. [default: .*]
Options:
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
Hold queued experiments via SLURM.
Usage:
$ seml hold [OPTIONS]
Options:
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help
: Show this message and exit.
Launch a local worker that runs PENDING jobs.
Usage:
$ seml launch-worker [OPTIONS]
Options:
-n, --num-experiments INTEGER
: Number of experiments to start. 0: all (staged) experiments [default: 0]
-nf, --no-file-output
: Do not write the experiment's output to a file.
-ss, --steal-slurm
: Local jobs 'steal' from the Slurm queue, i.e. also execute experiments waiting for execution via Slurm.
-pm, --post-mortem
: Activate post-mortem debugging with pdb.
-d, --debug
: Run a single interactive experiment without Sacred observers and with post-mortem debugging. Implies --verbose --num-exps 1 --post-mortem --output-to-console.
-ds, --debug-server
: Run the experiment with a debug server, to which you can remotely connect with e.g. VS Code. Implies --debug.
-o, --output-to-console
: Write the experiment's output to the console.
-wg, --worker-gpus TEXT
: The IDs of the GPUs used by the local worker. Will be directly passed to CUDA_VISIBLE_DEVICES.
-wc, --worker-cpus INTEGER
: The number of CPUs used by the local worker. Will be directly passed to OMP_NUM_THREADS.
-we, --worker-env DICT
: Further environment variables to be set for the local worker.
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help
: Show this message and exit.
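The worker options above map onto environment variables of the local worker process. The following sketch shows one way to combine them; the GPU IDs, CPU count, and the variable `MY_FLAG` are made-up values, and the JSON form for `--worker-env` is an assumption based on the DICT examples given for other options.

```python
import json
import shlex

# Hypothetical extra environment variable for the worker.
worker_env = {"MY_FLAG": "1"}

argv = [
    "seml", "my_collection", "launch-worker",
    "--worker-gpus", "0,1",   # passed on to CUDA_VISIBLE_DEVICES
    "--worker-cpus", "4",     # passed on to OMP_NUM_THREADS
    "--worker-env", json.dumps(worker_env),
]

print(shlex.join(argv))
```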
Lists all collections in the database.
Usage:
$ seml list [OPTIONS] [PATTERN]
Arguments:
[PATTERN]
: A regex that must match the collections to print. [default: .*]
Options:
-p, --progress
: Whether to print a progress bar for iterating over collections.
-u, --update-status
: Whether to update the status of experiments in the database. This can take a while for large collections. Use only if necessary.
-fd, --full-descriptions
: Whether to print full descriptions (possibly with line breaks).
--help
: Show this message and exit.
Fetch experiment from database, prepare it and print the command to execute it.
Usage:
$ seml prepare-experiment [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters. [required]
-v, --verbose
: Whether to print debug messages.
-u, --unobserved
: Run the experiments without Sacred observers.
-pm, --post-mortem
: Activate post-mortem debugging with pdb.
-ssd, --stored-sources-dir TEXT
: Load source files into this directory before starting.
-ds, --debug-server
: Run the experiment with a debug server, to which you can remotely connect with e.g. VS Code. Implies --debug.
--help
: Show this message and exit.
Print the commands that would be executed by start.
Usage:
$ seml print-command [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: STAGED, QUEUED]
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-n, --num-experiments INTEGER
: Number of experiments to start. 0: all (staged) experiments [default: 0]
-wg, --worker-gpus TEXT
: The IDs of the GPUs used by the local worker. Will be directly passed to CUDA_VISIBLE_DEVICES.
-wc, --worker-cpus INTEGER
: The number of CPUs used by the local worker. Will be directly passed to OMP_NUM_THREADS.
-we, --worker-env DICT
: Further environment variables to be set for the local worker.
--unresolved
: Whether to print the unresolved command.
--no-interpolation
: Whether to disable variable interpolation. Only compatible with --unresolved.
--help
: Show this message and exit.
Print the experiment document.
Usage:
$ seml print-experiment [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: PENDING, STAGED, QUEUED, RUNNING, FAILED, KILLED, INTERRUPTED, COMPLETED]
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-p, --projection KEY
: List of configuration keys, e.g., config.model, to additionally print.
-F, --format TEXT
: The format in which to print the experiment document. [default: yaml]
--help
: Show this message and exit.
Prints fail traces of all failed experiments.
Usage:
$ seml print-fail-trace [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: FAILED, KILLED, INTERRUPTED]
-p, --projection KEY
: List of configuration keys, e.g., config.model, to additionally print.
--help
: Show this message and exit.
Print the output of experiments.
Usage:
$ seml print-output [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: RUNNING, FAILED, KILLED, INTERRUPTED, COMPLETED]
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-sl, --slurm
: Whether to print the Slurm output instead of the experiment output.
-h, --head INTEGER
: Print the first n lines of the output.
-t, --tail INTEGER
: Print the last n lines of the output.
--help
: Show this message and exit.
Setting up new projects.
Usage:
$ seml project [OPTIONS] COMMAND [ARGS]...
Options:
--help
: Show this message and exit.
Commands:
init
: Initialize a new project in the given...
list-templates
: List available project templates.
Initialize a new project in the given directory.
Usage:
$ seml project init [OPTIONS] [DIRECTORY]
Arguments:
[DIRECTORY]
: The directory in which to initialize the project. [default: .]
Options:
-t, --template TEXT
: The template to use for the project. To view available templates use seml project list-templates. [default: default]
-n, --name TEXT
: The name of the project. (By default inferred from the directory name.)
-u, --username TEXT
: The author name to use for the project. (By default inferred from $USER)
-m, --usermail TEXT
: The author email to use for the project. (By default empty.)
-r, --git-remote TEXT
: The git remote to use for the project. (By default SETTINGS.TEMPLATE_REMOTE.)
-c, --git-commit TEXT
: The exact git commit to use. May also be a tag or branch (By default latest)
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
List available project templates.
Usage:
$ seml project list-templates [OPTIONS]
Options:
-r, --git-remote TEXT
: The git remote to use for the project. (By default SETTINGS.TEMPLATE_REMOTE.)
-c, --git-commit TEXT
: The exact git commit to use. May also be a tag or branch (By default latest)
--help
: Show this message and exit.
Prints the collections of the given job IDs. If none is specified, all jobs are considered.
Usage:
$ seml queue [OPTIONS] [JOB_IDS]...
Arguments:
[JOB_IDS]...
: The job IDs of the experiments to get the collection for.
Options:
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: PENDING, RUNNING]
-a, --all
: Whether to attempt finding the collection of the jobs of all users.
-w, --watch
: Whether to watch the queue.
--help
: Show this message and exit.
Release held experiments via SLURM.
Usage:
$ seml release [OPTIONS]
Options:
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help
: Show this message and exit.
Reload stashed source files.
Usage:
$ seml reload-sources [OPTIONS]
Options:
-k, -keep-old
: Keep the old source files in the database.
-b, --batch-ids INTEGER
: Batch IDs (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
Reset the state of experiments by setting their state to STAGED and cleaning their database entry. Does not cancel Slurm jobs.
Usage:
$ seml reset [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-s, --filter-states [STAGED|QUEUED|PENDING|RUNNING|FAILED|KILLED|INTERRUPTED|COMPLETED]
: List of states to filter the experiments by. If empty (""), all states are considered. [default: FAILED, KILLED, INTERRUPTED]
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-y, --yes
: Automatically confirm all dialogues with yes.
--help
: Show this message and exit.
Fetch staged experiments from the database and run them (by default via Slurm).
Usage:
$ seml start [OPTIONS]
Options:
-id, --sacred-id INTEGER
: Sacred ID (_id in the database collection) of the experiment. Takes precedence over other filters.
-fd, --filter-dict DICT
: Dictionary (passed as a string, e.g. '{"config.dataset": "cora_ml"}') to filter the experiments by.
-b, --batch-id INTEGER
: Batch ID (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
-d, --debug
: Run a single interactive experiment without Sacred observers and with post-mortem debugging. Implies --verbose --num-exps 1 --post-mortem --output-to-console.
-ds, --debug-server
: Run the experiment with a debug server, to which you can remotely connect with e.g. VS Code. Implies --debug.
-l, --local
: Run the experiment locally instead of on a Slurm cluster.
-nw, --no-worker
: Do not launch a local worker after setting experiments' state to PENDING.
-n, --num-experiments INTEGER
: Number of experiments to start. 0: all (staged) experiments [default: 0]
-nf, --no-file-output
: Do not write the experiment's output to a file.
-ss, --steal-slurm
: Local jobs 'steal' from the Slurm queue, i.e. also execute experiments waiting for execution via Slurm.
-pm, --post-mortem
: Activate post-mortem debugging with pdb.
-o, --output-to-console
: Write the experiment's output to the console.
-wg, --worker-gpus TEXT
: The IDs of the GPUs used by the local worker. Will be directly passed to CUDA_VISIBLE_DEVICES.
-wc, --worker-cpus INTEGER
: The number of CPUs used by the local worker. Will be directly passed to OMP_NUM_THREADS.
-we, --worker-env DICT
: Further environment variables to be set for the local worker.
--help
: Show this message and exit.
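By default `start` submits experiments via Slurm; the `--local` flag runs them on the current machine instead. As a rough sketch, the snippet below assembles a local run limited to two staged experiments with output written to the console. The collection name and the numeric values are placeholders.

```python
import shlex

argv = [
    "seml", "my_collection", "start",
    "--local",                  # run on this machine rather than via Slurm
    "--num-experiments", "2",   # 0 would mean all staged experiments
    "--output-to-console",
]

print(shlex.join(argv))
# seml my_collection start --local --num-experiments 2 --output-to-console
```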
Start a Jupyter slurm job. Uses SBATCH options defined in settings.py under SBATCH_OPTIONS_TEMPLATES.JUPYTER
Usage:
$ seml start-jupyter [OPTIONS]
Options:
-l, --lab
: Start a jupyter-lab instance instead of jupyter notebook.
-c, --conda-env TEXT
: Start the Jupyter instance in a Conda environment.
-sb, --sbatch-options DICT
: Dictionary (passed as a string, e.g. '{"gres": "gpu:2"}') to request two GPUs.
--help
: Show this message and exit.
Report status of experiments in the database collection.
Usage:
$ seml status [OPTIONS]
Options:
-u, --update-status
: Whether to update the status of experiments in the database. This can take a while for large collections. Use only if necessary. [default: True]
-p, --projection KEY
: List of configuration keys, e.g., config.model, to additionally print.
--help
: Show this message and exit.
Change the working directory of experiments in case you moved the source code to a different location.
Usage:
$ seml update-working-dir [OPTIONS] WORKING_DIR
Arguments:
WORKING_DIR
: The new working directory for the experiments. [required]
Options:
-b, --batch-ids INTEGER
: Batch IDs (batch_id in the database collection) of the experiments. Experiments that were staged together have the same batch_id.
--help
: Show this message and exit.