Slurm Resources

Basic Resources

You will need to specify the following for every job:

Time

Specify your job’s maximum run time with the --time= option. Available formats include D-HH:MM:SS, MM:SS, D-HH, and HH:MM:SS. The maximum allowed time varies by partition.

To have your job start sooner, consider requesting a lower time limit.
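As a minimal sketch, a batch script might set its time limit as shown below. The 2.5-hour value and the echo payload are placeholders, not recommendations.

```shell
#!/bin/bash
# Request at most 2 hours 30 minutes of run time (HH:MM:SS).
# The same limit in D-HH:MM:SS form would be --time=0-02:30:00.
#SBATCH --time=02:30:00

# Placeholder payload. SLURM_JOB_ID is set by Slurm inside a job;
# the fallback only matters when the script is run outside Slurm.
job_id="${SLURM_JOB_ID:-none}"
echo "Job ${job_id} started at $(date)"
```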

Memory

Your job will be limited to the allocated amount of memory. In the simplest case, use --mem to request memory per node, either as a plain number in MB (the default unit) or with an explicit unit, as in --mem=8G. You can also request memory relative to other resources with --mem-per-cpu or --mem-per-gpu.
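To illustrate how a relative memory request scales, here is a sketch that assumes 4 cores at 2G each. The values are illustrative, and the script assumes SLURM_MEM_PER_CPU is reported in MB.

```shell
#!/bin/bash
# Illustrative values: 4 cores with 2G each, 8G in total for the task.
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2G

# Inside a job Slurm exports these variables (SLURM_MEM_PER_CPU is
# assumed to be in MB). The fallbacks mirror the request above so the
# arithmetic also works when the script is run outside Slurm.
cpus="${SLURM_CPUS_PER_TASK:-4}"
mem_per_cpu_mb="${SLURM_MEM_PER_CPU:-2048}"
total_mb=$((cpus * mem_per_cpu_mb))
echo "Memory available to this task: ${total_mb} MB"
```

By contrast, --mem=8G would request the same total directly, independent of the core count.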

CPU

CPU requirements are expressed as a number of cores. In the simplest case, use --cpus-per-task. The default number of tasks is 1, but you can request more with --ntasks.
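A common pattern, sketched here with illustrative values, is to request cores for a single task and pass the allocated count on to a multithreaded program. OMP_NUM_THREADS only matters for OpenMP programs, which is an assumption about your workload.

```shell
#!/bin/bash
# One task (the default) with 4 cores; the values are illustrative.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Match the thread count to the cores Slurm allocated. Falling back
# to 1 keeps the script usable outside a Slurm job.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with ${OMP_NUM_THREADS} thread(s)"
```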

Additional Resource Options

GPU

Request a number of GPUs for your job with --gpus=.

GPU memory / VRAM is not currently managed by Slurm. To guarantee that a minimum amount of GPU memory is available, choose a GPU type with the --constraint= option (see below).
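A minimal sketch of a single-GPU request follows. It assumes the cluster exposes allocated GPUs through CUDA_VISIBLE_DEVICES, which is common but not confirmed here.

```shell
#!/bin/bash
# Request a single GPU; the count is illustrative.
#SBATCH --gpus=1

# On many clusters Slurm restricts the job to its allocated GPUs via
# CUDA_VISIBLE_DEVICES (an assumption about this cluster's setup).
visible_gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "GPUs visible to this job: ${visible_gpus}"
```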

Node Features

A “feature” in Slurm is an arbitrary label that we use to specify characteristics of a node. To submit a job with a feature requirement, you can use the --constraint flag. A comma-separated list requires all listed features, a list delimited by | requires any of the listed features, and a ! before a feature requires that feature to be absent.
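The three forms described above can be sketched as follows. The names feat_a and feat_b are hypothetical features and job.sh is a placeholder script, so the commands are only printed, not submitted.

```shell
# feat_a and feat_b are hypothetical feature names.
all_of='feat_a,feat_b'     # node must have both features
any_of='feat_a|feat_b'     # node must have at least one of them
none_of='!feat_a'          # node must not have feat_a

# Single quotes keep the shell from treating | as a pipe or ! as
# history expansion. job.sh is a placeholder script name.
for expr in "$all_of" "$any_of" "$none_of"; do
    echo "sbatch --constraint='${expr}' job.sh"
done
```

Quoting matters mainly on the command line, where an unquoted | would start a shell pipeline.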

Datacenter

All nodes have either the ‘stata’ or the ‘holyoke’ feature.

Tip: if you are submitting to a shared partition, use --constraint stata or --constraint holyoke to guarantee that your node is in the same location as your data, for best performance.

GPU Type

All GPU nodes have a feature corresponding to the GPU type. The currently available list of GPU features is:

geforce_gtx_980
nvidia_a100_80gb_pcie
nvidia_a100-sxm4-80gb
nvidia_geforce_gtx_1080_ti
nvidia_geforce_gtx_780
nvidia_geforce_rtx_2080_ti
nvidia_geforce_rtx_3090
nvidia_geforce_rtx_4090
nvidia_h100_80gb_hbm3
nvidia_h100_nvl
nvidia_rtx_a6000
nvidia_titan_rtx
nvidia_titan_xp
quadro_k620
tesla_v100-sxm2-32gb
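As a sketch of choosing a GPU type to secure a minimum amount of GPU memory, the script below accepts either of two 80 GB feature labels from the list above. The pairing is illustrative, and whether a matching node exists in your partition is not checked here.

```shell
#!/bin/bash
# Ask for one GPU on a node carrying either of two 80 GB feature labels.
#SBATCH --gpus=1
#SBATCH --constraint=nvidia_a100_80gb_pcie|nvidia_h100_80gb_hbm3

# SLURMD_NODENAME is set by Slurm inside a job; the fallback only
# matters when the script is run outside Slurm.
node_name="${SLURMD_NODENAME:-unknown}"
echo "Running on node: ${node_name}"
```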

Viewing Available Resources

sinfo

Displays available partitions and shows which nodes are idle, in use, or down.

squeue

Displays jobs in the queue, both pending and running.

scontrol show node $NODE_NAME

Displays information about a node, including total resources, allocated resources, and status.

scontrol show partition $PARTITION_NAME

Displays information about a partition, including eligible QoS levels and maximum allowed runtime.

scontrol show job $JOB_ID

Displays detailed information about a pending or running job.
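The commands above accept options that narrow their output. The helper functions below are a hypothetical sketch (the function names are not part of Slurm): one lists node names with their feature labels, the other lists only your own jobs.

```shell
# List node names together with their feature labels.
# %N (node names) and %f (features) are sinfo format fields.
list_node_features() {
    if ! command -v sinfo >/dev/null 2>&1; then
        echo "sinfo not found; run this on a Slurm login node" >&2
        return 1
    fi
    sinfo --format="%N %f"
}

# List the calling user's own jobs (squeue -u filters by user).
list_my_jobs() {
    if ! command -v squeue >/dev/null 2>&1; then
        echo "squeue not found; run this on a Slurm login node" >&2
        return 1
    fi
    squeue -u "$(id -un)"
}
```

On a login node, calling list_node_features is one way to check which feature labels are currently attached to which nodes before writing a --constraint expression.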