Converting Moab/Torque scripts to Slurm

On May 1st, 2019 the Quest scheduler will change from Moab to Slurm. Researchers using Quest will need to update their job submission scripts and use different commands to run jobs under the new scheduler.

A schedule of Slurm workshops offered during the transition period is available; for more information, please see the project page.

Below are Moab/Torque commands and flags and their Slurm equivalents.

Job Submission Options

Option | Moab/Torque (msub) | Slurm (sbatch)
Script directive | #MSUB | #SBATCH
Job name | -N <name> | --job-name=<name> or -J <name>
Account | -A <account> | --account=<account> or -A <account>
Queue | -q <queue> | --partition=<queue>
Wall time limit | -l walltime=<hh:mm:ss> | --time=<hh:mm:ss> or -t <hh:mm:ss>
Node count | -l nodes=<count> | --nodes=<count> or -N <count>
Core count | -l procs=<count> | -n <count>
Process count per node | -l ppn=<count> | --ntasks-per-node=<count>
Core count (per process) | | --cpus-per-task=<cores>
Memory limit | -l mem=<limit> | --mem=<limit> (memory per node, in MB)
Minimum memory per processor | -l pmem=<limit> | --mem-per-cpu=<memory>
Request GPUs | -l gpus=<count> | --gres=gpu:<count>
Request specific nodes | -l nodes=<node>[,node2[,...]] | -w, --nodelist=<node>[,node2[,...]] or -F, --nodefile=<node file>
Job array | -t <array indices> | -a <array indices>
Standard output file | -o <file path> | --output=<file path> (path must exist)
Standard error file | -e <file path> | --error=<file path> (path must exist)
Combine stdout/stderr to stdout | -j oe | --output=<combined out and err file path>
Architecture constraint | -l partition=<architecture> | --constraint=<architecture> or -C <architecture>
Copy environment | -V | --export=ALL (default); use --export=NONE to not export the environment
Copy environment variable | -v <variable[=value][,variable2=value2[,...]]> | --export=<variable[=value][,variable2=value2[,...]]>
Job dependency (after) | -W depend=after:jobID[:jobID...] | --dependency=after:jobID[:jobID...]
Job dependency (afterok) | -W depend=afterok:jobID[:jobID...] | --dependency=afterok:jobID[:jobID...]
Job dependency (afternotok) | -W depend=afternotok:jobID[:jobID...] | --dependency=afternotok:jobID[:jobID...]
Job dependency (afterany) | -W depend=afterany:jobID[:jobID...] | --dependency=afterany:jobID[:jobID...]
Request event notification | -m <events> | --mail-type=<events> (multiple event types may be given as a comma-separated list, e.g. --mail-type=BEGIN,END,NONE,FAIL,REQUEUE)
Email address | -M <email address> | --mail-user=<email address>
Defer job until the specified time | -a <date/time> | --begin=<date/time>
Node exclusive job | -l naccesspolicy=singlejob | --exclusive
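
To show how these directives fit together, below is a minimal before-and-after sketch of a simple job script converted from Moab/Torque to Slurm. The account, queue/partition, resource values, email address, and program name are placeholders for illustration, not Quest-specific recommendations; substitute your own values.

A Moab/Torque script (before):

    #!/bin/bash
    #MSUB -A p12345                          # allocation/account (placeholder)
    #MSUB -q normal                          # queue (placeholder)
    #MSUB -N my_job
    #MSUB -l walltime=01:00:00
    #MSUB -l nodes=1:ppn=4
    #MSUB -l mem=8gb
    #MSUB -m abe
    #MSUB -M <netID>@northwestern.edu

    cd $PBS_O_WORKDIR
    ./my_program                             # placeholder executable

The equivalent Slurm script (after):

    #!/bin/bash
    #SBATCH --account=p12345                 # allocation/account (placeholder)
    #SBATCH --partition=normal               # partition (placeholder)
    #SBATCH --job-name=my_job
    #SBATCH --time=01:00:00
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=4
    #SBATCH --mem=8G                         # memory per node
    #SBATCH --mail-type=BEGIN,END,FAIL
    #SBATCH --mail-user=<netID>@northwestern.edu

    # Slurm starts the job in the submit directory by default,
    # so the explicit cd $PBS_O_WORKDIR is no longer needed.
    ./my_program                             # placeholder executable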

Common Job Commands

Task | Moab/Torque | Slurm
Submit a job | msub <job script> | sbatch <job script>
Delete a job | qdel <job ID> or canceljob <job ID> | scancel <job ID>
Job status (all) | qstat or showq | squeue
Job status (by job) | qstat <job ID> | squeue -j <job ID>
Job status (by user) | qstat -u <netID> or showq -u <netID> | squeue -u <netID>
Job status (detailed) | qstat -f <job ID> or checkjob <job ID> | scontrol show job -dd <job ID>
Show expected start time | showstart <job ID> | squeue -j <job ID> --start
Queue list / info | qstat -q [queue] | scontrol show partition [queue]
Hold a job | qhold <job ID> | scontrol hold <job ID>
Release a job | qrls <job ID> | scontrol release <job ID>
Start an interactive job | msub -I <args> | salloc <args> or srun --pty <args>
X forwarding | msub -I -X <args> | srun --pty <args> --x11
Monitor or review a job's resource usage | | sacct -j <job_num> --format JobID,jobname,NTasks,nodelist,CPUTime,ReqMem,Elapsed (see sacct for all format options)
View job batch script | | scontrol write batch_script <jobID> [filename]
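
As a quick illustration of the Slurm commands above, a typical submit-and-monitor session might look like the following. The job ID 123456, the script name my_job.sh, and the account/partition values are placeholders.

    sbatch my_job.sh                 # submit; prints "Submitted batch job 123456"
    squeue -u <netID>                # list your pending and running jobs
    squeue -j 123456 --start         # expected start time of a pending job
    scontrol show job -dd 123456     # detailed information for one job
    scontrol hold 123456             # hold a pending job
    scontrol release 123456          # release a held job
    scancel 123456                   # cancel the job

    # review resource usage after the job finishes (see sacct for more format fields)
    sacct -j 123456 --format JobID,JobName,NTasks,NodeList,CPUTime,ReqMem,Elapsed

    # one way to start an interactive shell (account/partition are placeholders)
    srun --pty --account=p12345 --partition=normal -N 1 -n 1 --time=01:00:00 bash -l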

Script Variables

Info | Torque | Slurm | Notes
Version | $PBS_VERSION | | Can be extracted from sbatch --version
Job name | $PBS_JOBNAME | $SLURM_JOB_NAME |
Job ID | $PBS_JOBID | $SLURM_JOB_ID |
Batch or interactive | $PBS_ENVIRONMENT | |
Submit directory | $PBS_O_WORKDIR | $SLURM_SUBMIT_DIR | Slurm jobs start from the submit directory by default
Node file | $PBS_NODEFILE | | A filename and path that lists the nodes allocated to the job
Node list | cat $PBS_NODEFILE | $SLURM_JOB_NODELIST | The Slurm variable has a different format from the PBS one; to get a list of node names, use: scontrol show hostnames $SLURM_JOB_NODELIST
Job array index | $PBS_ARRAY_INDEX | $SLURM_ARRAY_TASK_ID |
Queue name | $PBS_QUEUE | $SLURM_JOB_PARTITION |
Number of nodes allocated | $PBS_NUM_NODES | $SLURM_JOB_NUM_NODES or $SLURM_NNODES |
Number of processes | $PBS_NP | $SLURM_NTASKS |
Number of processes per node | $PBS_NUM_PPN | $SLURM_TASKS_PER_NODE |
Requested tasks per node | | $SLURM_NTASKS_PER_NODE |
Requested CPUs per task | | $SLURM_CPUS_PER_TASK |
Scheduling priority | | $SLURM_PRIO_PROCESS |
Job user | | $SLURM_JOB_USER |
Hostname | $HOSTNAME | $HOSTNAME == $SLURM_SUBMIT_HOST | Unless a shell is invoked on an allocated resource, the $HOSTNAME variable is propagated (copied) from the submit machine environment to all allocated nodes.
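
The sketch below shows one way these variables might be used inside a Slurm batch script, for example to log basic job information and to expand the compact node list into a per-line node file similar to the contents of $PBS_NODEFILE. The job name and resource requests are placeholders.

    #!/bin/bash
    #SBATCH --job-name=env_demo
    #SBATCH --nodes=2                        # placeholder resource request
    #SBATCH --ntasks-per-node=4
    #SBATCH --time=00:10:00

    echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) submitted from $SLURM_SUBMIT_DIR"
    echo "Partition: $SLURM_JOB_PARTITION, nodes: $SLURM_JOB_NUM_NODES, tasks: $SLURM_NTASKS"

    # $SLURM_JOB_NODELIST is in compact form (e.g. node[001-002]);
    # expand it to one hostname per line, like a Torque node file.
    scontrol show hostnames $SLURM_JOB_NODELIST > nodefile.$SLURM_JOB_ID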

Adapted with permission from https://arc-ts.umich.edu/migrating-from-torque-to-slurm/
