Commands and Usage

connecting to the cluster

ssh slurm-login

srun

Use srun to test a job from your shell.

example:

srun -p <partition> -q <qos> -t <time> --gpus-per-node <num> python my_job.py

Common options used with srun, sbatch, or salloc are explained below.

Jobs submitted with srun are tied to your login shell. For long-running jobs, or to ensure your job will not exit when your connection drops, use sbatch instead.

sbatch

Write a script with the commands that should be run on a compute node, and use sbatch to submit the script. It will be run when the requested resources are allocated.

example usage:

sbatch my_batch_script.sbatch

Common options used with srun, sbatch, or salloc are explained further below.

example sbatch script:

Save this as a file (typically with the .sbatch extension) and run it with sbatch:

#!/bin/bash
#SBATCH --job-name=my_very_own_job 
#SBATCH --partition=<partition>
#SBATCH --qos=<qos>
#SBATCH --time=05:00:00 # 5 hours
# %u and %j expand to your username and the job ID ($USER is not expanded on #SBATCH lines)
#SBATCH --output=/data/scratch/%u/job_output_%j.log
#SBATCH --error=/data/scratch/%u/job_error_%j.log
#SBATCH --gpus=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4000M

# load any environment or dependencies your job needs (placeholder script)
python load_my_dependencies.py

date
echo "Running my job on $HOSTNAME"

# put the job and data on the local node for better performance
sbcast /data/scratch/$USER/my_training_data /tmp/my_training_data
sbcast /data/scratch/$USER/my_training_job.py /tmp/my_training_job.py

python /tmp/my_training_job.py

sbcast

Use sbcast to send a file to the node(s) in your allocation.

example:

sbcast <local_filename> <remote_filename>

The default destination directory for sbcast is /tmp.

If you are using fast shared storage, you may have better performance using cp within your sbatch script, rather than relaying the file through the submit node with sbcast.
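
For example, a copy inside the sbatch script (a sketch reusing the shared-storage paths from the example script above) might look like:

# copy data from shared storage to node-local /tmp on the allocated node
cp /data/scratch/$USER/my_training_data /tmp/my_training_data
cp /data/scratch/$USER/my_training_job.py /tmp/my_training_job.py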

salloc

Use salloc to obtain an allocation, then run jobs interactively.

example salloc session:

salloc -p tig -q tig-main -t 10
# wait for the allocation to be granted
srun hostname
srun date
echo "sending a file"
sbcast my_job.py my_job.py
srun python /tmp/my_job.py
exit

Common options used with srun, sbatch, or salloc are explained below.

The shell obtained by salloc does not change your prompt, so the difference between it and your normal shell is not visually obvious. When you are done, or when your allocation period ends, type exit to return to your regular Linux shell.

sinfo

Use sinfo to see the status of available systems and partitions.

examples:

sinfo

sinfo -p <partition>

squeue

Use squeue to list waiting or running jobs.

example:

squeue
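
For example, to list only your own jobs, or a single job, use the standard squeue filters:

squeue -u $USER
squeue -j <job_id>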

scontrol

Use scontrol to see detailed information about a node or job.

examples:

scontrol show node

scontrol show job
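
To show a specific node or job, pass its name or ID:

scontrol show node <node_name>
scontrol show job <job_id>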

scancel

Use scancel to end your job early.

example:

scancel <job_id>
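
scancel also accepts a user filter, so you can cancel all of your own jobs at once:

scancel -u $USER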

options used with sbatch, srun, or salloc

required options

--partition / -p <partition> Required.

All lab members have access to partition tig. You may have access to one or more additional partitions depending on your group.

--qos / -q <qos> Required.

The QoS you specify must be allowed for the selected partition. The partition tig allows tig-main or tig-free-cycles.

Your choice of QoS determines your job’s preemption level, as well as the quantity of resources you are allowed to request at one time. QoS tig-main has lower resource limits, but can preempt jobs in tig-free-cycles.

This allows you to freely use any resources that are available, while allowing smaller jobs to run more reliably.

If you use a free-cycles QoS, consider specifying a shorter --time to improve your chances of the job completing.
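
For example, a short test run on the free-cycles QoS (a sketch using the tig partition from the examples above) might look like:

srun -p tig -q tig-free-cycles -t 30 --gpus-per-node 1 python my_job.py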

--time / -t <time> Required.

Formats accepted: D-HH:MM, D-HH:MM:SS, D-HH, HH:MM:SS, MM:SS, or MM
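
For example, the following are all valid time specifications:

--time 90 (90 minutes)
--time 01:30:00 (1 hour, 30 minutes)
--time 2-12:00 (2 days, 12 hours)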

general options

--output / -o <filepath> Filepath where the job’s output (STDOUT) will be written.

--error / -e <filepath> Filepath where the job’s errors (STDERR) will be written.

--output and --error should be set to a shared location (e.g. /data); otherwise the file will only be present on the node where the job ran.
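
For example, using sbatch’s filename patterns (%u is your username, %j is the job ID):

#SBATCH --output=/data/scratch/%u/job_output_%j.log
#SBATCH --error=/data/scratch/%u/job_error_%j.log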

--requeue

Used when submitting a job on a preemptible QoS (e.g. tig-free-cycles). If preempted, a job will be killed. If submitted with --requeue, it will then be requeued and eventually run again from the start. (sbatch only)
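
For example, in an sbatch script submitted to the preemptible QoS:

#SBATCH --qos=tig-free-cycles
#SBATCH --requeue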

--pty

Obtain a pseudo-terminal. Use with srun to get a login shell on the remote host, as in srun <options> --pty /bin/bash. (srun only)
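
For example, to get an interactive shell for one hour on the tig partition:

srun -p tig -q tig-main -t 60 --pty /bin/bash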

options for requesting resources

--cpus-per-task / -c <count> Number of CPUs to be allocated per task. Default is 1. Not compatible with --cpus-per-gpu.

--cpus-per-gpu <count> Number of CPUs to be allocated per GPU. Not compatible with --cpus-per-task.

--mem <size><unit> Amount of memory to be allocated per node. Default is 1000M, which is deliberately low. Users are encouraged to estimate and request the amount they need. To request all memory on a node, set --mem to 0.

--gpus-per-node <count> Number of GPUs to be assigned on each allocated node. Default is 0.

--nodes / -N <count> Run a job on a specified number of nodes. Default is 1.

--nodelist / -w <list of nodes> A comma-separated list of specific nodes where the job will run.

--exclude / -x <list of nodes> A comma-separated list of specific nodes where the job will not run.
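
For example, a request combining several of these options (a sketch on the tig partition) might look like:

srun -p tig -q tig-main -t 02:00:00 -c 4 --mem 8000M --gpus-per-node 1 python my_job.py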

options for multiple tasks

--ntasks / -n <count> Number of tasks that may be created within this allocation. Default is 1.

To fork a task, use srun <command> & within an sbatch script or salloc shell. This use of srun will inherit (some) options from the surrounding sbatch or salloc. The forked task will receive an ID like <job_id>.<task_id>.
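
For example, a sketch of an sbatch script that forks two tasks (task_one.py and task_two.py are hypothetical placeholders):

#!/bin/bash
#SBATCH --partition=<partition>
#SBATCH --qos=<qos>
#SBATCH --time=01:00:00
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=1

# each srun inherits the partition, QoS, and time limit from the surrounding allocation
srun --ntasks=1 python task_one.py &
srun --ntasks=1 python task_two.py &

# wait for both forked tasks to finish before the job exits
wait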

--cpus-per-task / -c <count> Number of CPUs to be allocated per task. Default is 1.

--gpus-per-task <count> Number of GPUs to be allocated per task. Cannot be used with --gpus-per-node.

--array <indices> Submit a number of related tasks. Indices may be a range like --array 1-5, or a comma-separated list like --array 2,3,5,7. Each index in the array runs as an individual task, and the task can read its own index via the environment variable SLURM_ARRAY_TASK_ID. (sbatch only)
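
For example, a sketch of a job array script; it assumes your script accepts the array index as a command-line argument, and uses the filename patterns %A (array job ID) and %a (array index):

#!/bin/bash
#SBATCH --partition=<partition>
#SBATCH --qos=<qos>
#SBATCH --time=00:30:00
#SBATCH --array=1-5
#SBATCH --output=/data/scratch/%u/array_%A_%a.log

# each array task sees its own index in SLURM_ARRAY_TASK_ID
echo "Running array index $SLURM_ARRAY_TASK_ID"
python my_job.py $SLURM_ARRAY_TASK_ID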