More on Slurm Jobs & Migration

Job Runtime Environment in Slurm

When a job is submitted, Slurm captures the environment variables set in the submitting shell at submission time and recreates that environment on the first allocated node, where the batch script actually runs.
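As a minimal sketch of this behaviour, the job script below prints a variable that was exported in the login shell before submission. It assumes Slurm's default export behaviour has not been changed on the system. The variable name MY_INPUT_DIR, the script name envdemo.sh, and the helper function show_inherited are hypothetical and used only for illustration.

```shell
#!/bin/bash
#SBATCH --job-name=envdemo
#SBATCH -n 1
#SBATCH -t 0-00:05

# MY_INPUT_DIR is a hypothetical variable, exported in the login shell
# before the job is submitted, for example:
#   export MY_INPUT_DIR=$HOME/data
#   sbatch envdemo.sh

show_inherited() {
    # Print the inherited value, or a marker if it was never exported.
    echo "MY_INPUT_DIR=${MY_INPUT_DIR:-<not set>}"
}

show_inherited
```

If the variable was exported before running sbatch, the job's output file should show the same value that was set on the login node.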

Additionally, at run time, Slurm sets a number of shell environment variables that describe the job itself and can be used within the job script. The sbatch man page in the Slurm documentation lists them exhaustively; we highlight some useful ones here.

#SBATCH Directives

In line with most batch schedulers, Slurm uses directives in submission scripts, the #SBATCH directives, to specify a job's requirements and parameters. For an MPI job we might typically have:

#SBATCH -p compute
#SBATCH -o runout.%J
#SBATCH -e runerr.%J
#SBATCH --job-name=mpijob
#SBATCH -n 80
#SBATCH --tasks-per-node=40
#SBATCH --exclusive
#SBATCH -t 0-12:00
#SBATCH --mem-per-cpu=4000

Walking through these:

Slurm #SBATCH directive	Description
#SBATCH --partition=compute or #SBATCH -p compute	Partition to submit the job to. Slurm calls its job queues 'partitions'; despite the naming difference, the concept is the same as a queue in other schedulers.
#SBATCH --output=runout.%J or #SBATCH -o runout.%J	File in which STDOUT from the job run is stored. Slurm replaces '%J' with the job number.
#SBATCH --error=runerr.%J or #SBATCH -e runerr.%J	File in which STDERR from the job run is stored. Slurm replaces '%J' with the job number.
#SBATCH --job-name=mpijob	Job name, useful for monitoring and setting up inter-job dependency.
#SBATCH --ntasks=80 or #SBATCH -n 80	Number of tasks (processes) required for the job.
#SBATCH --tasks-per-node=40	Number of tasks (processes) to run on each node. The Slurm documentation spells this option --ntasks-per-node.
#SBATCH --exclusive	Exclusive job allocation - i.e. no other users on allocated nodes.
#SBATCH --time=0-12:00 or #SBATCH -t 0-12:00	Maximum runtime of job. Note that it is beneficial to specify this and not leave it at the maximum as it will improve the chances of the scheduler 'back-filling' the job and running it earlier.
#SBATCH --mem-per-cpu=4000	Memory required per allocated CPU, in megabytes unless a unit suffix is given. Slurm's memory-based scheduling is more powerful than that of many schedulers.
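To make the relationship between these directives concrete, the following sketch works out how many nodes the example job needs and how much memory it requests on each node. The numbers are copied from the example directives above. It assumes one CPU per task, which is Slurm's default when --cpus-per-task is not given. The variable names are illustrative only, and the result is a rough estimate rather than a statement of what the scheduler will allocate.

```shell
#!/bin/bash
# Values copied from the example #SBATCH directives.
ntasks=80            # -n 80
tasks_per_node=40    # --tasks-per-node=40
mem_per_cpu_mb=4000  # --mem-per-cpu=4000 (megabytes by default)

# Nodes needed: ntasks divided by tasks_per_node, rounded up.
nodes=$(( (ntasks + tasks_per_node - 1) / tasks_per_node ))

# Memory requested on each node, assuming one CPU per task.
mem_per_node_mb=$(( tasks_per_node * mem_per_cpu_mb ))

echo "nodes=${nodes} mem_per_node_mb=${mem_per_node_mb}"
# prints: nodes=2 mem_per_node_mb=160000
```

A quick check like this helps confirm that the per-node memory request fits within the memory physically available on the nodes of the chosen partition.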

Environment Variables

Once an allocation has been scheduled and a job script is started (on the first node of the allocation), Slurm sets a number of shell environment variables that can be used in the script at runtime. Below is a summary of some of the most useful:

Slurm	Description
$SLURM_JOBID	Job ID.
$SLURM_JOB_NODELIST	Nodes allocated to the job, i.e. those with at least one task on them.
$SLURM_ARRAY_TASK_ID	If an array job, then the task index.
$SLURM_JOB_NAME	Job name.
$SLURM_JOB_PARTITION	Partition that the job was submitted to.
$SLURM_JOB_NUM_NODES	Number of nodes allocated to this job.
$SLURM_NTASKS	Number of tasks (processes) allocated to this job.
$SLURM_NTASKS_PER_NODE	Number of tasks (processes) per node. Only set if the --ntasks-per-node option is specified.
$SLURM_SUBMIT_DIR	Directory from which the job was submitted.
$SLURM_SUBMIT_HOST	Host on which job was submitted.
$SLURM_PROCID	The process (task) ID within the job. This starts from zero and goes up to $SLURM_NTASKS-1.
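The sketch below shows one way these variables might be used: logging a short job summary at the top of a job script. The helper function job_summary is hypothetical. The fallback values ("unknown", "?") are only there so that the script also runs outside a Slurm allocation, for example when testing it on a login node.

```shell
#!/bin/bash
#SBATCH -p compute
#SBATCH --job-name=mpijob
#SBATCH -n 80
#SBATCH -t 0-00:10

job_summary() {
    # Each variable is set by Slurm inside a job; the defaults after ':-'
    # are used only when the script runs outside Slurm.
    echo "Job ${SLURM_JOBID:-unknown} (${SLURM_JOB_NAME:-unknown}) in partition ${SLURM_JOB_PARTITION:-unknown}"
    echo "${SLURM_NTASKS:-?} task(s) on ${SLURM_JOB_NUM_NODES:-?} node(s): ${SLURM_JOB_NODELIST:-unknown}"
}

job_summary
```

Writing this summary at the start of the STDOUT file makes it easier to match an output file to the job and nodes that produced it.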

System Queues & Partitions in Slurm

Please use the sinfo command to see the names of the partitions (queues) to use in your job scripts; sinfo -s gives a more succinct partition list. If no partition is specified, the job is submitted to the default partition. Please see the Hawk and Sunbird pages for a list of partitions and their descriptions on each system.
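As a small sketch of how to find out which partition is the default, the snippet below relies on sinfo marking the default partition with a trailing '*' when partition names are printed with the %P format field. The helper function default_partition is hypothetical. The guard around sinfo is there only so the snippet does nothing harmful on a machine without Slurm installed.

```shell
#!/bin/bash
default_partition() {
    # Reads partition names from standard input, one per line, and prints
    # the first one marked with a trailing '*', with the marker removed.
    sed -n 's/\*$//p' | head -n 1
}

if command -v sinfo >/dev/null 2>&1; then
    # -h suppresses the header line; -o '%P' prints only partition names.
    sinfo -h -o '%P' | default_partition
else
    echo "sinfo not found; run this on a login node" >&2
fi
```

To submit to a partition other than the default, name it explicitly with #SBATCH -p in the job script.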