Submission Script Basics
What is a submission script
A submission script describes the resources you need the resource manager (SLURM) to allocate to your job. It also contains the commands needed to execute the application(s) you wish to run, including any set-up the application(s) may require.
The script should be created with a Linux text editor such as nano or vi. We recommend creating and editing submission scripts on the cluster rather than on a Windows machine, because Windows editors can save files with Windows-style (CRLF) line endings, which can stop the script from running correctly.
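One common cause of trouble with files edited on Windows is line endings: such files often end each line with a carriage return plus newline (CRLF), which Linux shells do not expect. The sketch below assumes this is the problem you have hit. It builds a throwaway file with CRLF endings and strips them with tr; the file names crlf_demo.sh and unix_demo.sh are hypothetical and used only for illustration.

```shell
# Build a demo file with Windows-style CRLF line endings.
# crlf_demo.sh is a hypothetical file name used only for this illustration.
printf '#!/bin/bash\r\necho hello\r\n' > crlf_demo.sh

# A literal carriage return, used to search for lines ending in CR.
cr=$(printf '\r')

# Count the lines that end in a carriage return before conversion.
crlf_before=$(grep -c "${cr}\$" crlf_demo.sh || true)

# Delete every carriage return, writing the result to a new file.
tr -d '\r' < crlf_demo.sh > unix_demo.sh

# Count again after conversion.
crlf_after=$(grep -c "${cr}\$" unix_demo.sh || true)

echo "CRLF lines before: $crlf_before, after: $crlf_after"
```

Writing to a second file rather than editing in place keeps the original available for comparison.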
Simple submission script example
In this example we are going to create a submission script to run a test application on the cluster.
Note
If you are not familiar with typing commands into a Linux shell (or terminal window), we suggest you follow an online Linux tutorial, such as the one at Ryans Tutorials.
To begin, change directory to your $DATA area and make a new directory named example to work in:
cd $DATA
mkdir example
cd example
The next step is to create the submission script to describe your job to SLURM.
First we use the nano editor to create the file:
nano submit.sh
This command will start the Linux nano editor. You can use nano to add the following lines:
#! /bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00
#SBATCH --partition=devel
module load mpitest
mpirun mpihello
Note
Ensure the #! /bin/bash line is the first line of the script and the #SBATCH lines start at the beginning of the line.
To exit nano, press CTRL-X, answer Y to the question Save Modified buffer?, then press Enter when asked File Name to Write: submit.sh
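If you prefer not to use an interactive editor, the same file can be written from the command line. This is a minimal sketch using a shell here-document. It writes the script shown above, confirms the first line, and checks the shell syntax with bash -n, which parses the file without running it.

```shell
# Write the submission script with a here-document; the quoted 'EOF'
# stops the shell from expanding anything inside the script text.
cat > submit.sh <<'EOF'
#! /bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00
#SBATCH --partition=devel
module load mpitest
mpirun mpihello
EOF

# Confirm the interpreter line really is the first line of the file.
first_line=$(head -n 1 submit.sh)
echo "first line: $first_line"

# Parse the script without executing it; a zero status means no syntax errors.
if bash -n submit.sh; then syntax_ok=yes; else syntax_ok=no; fi
echo "syntax ok: $syntax_ok"
```

Note that bash -n only checks shell syntax. The #SBATCH lines are comments as far as bash is concerned, so a mistyped option is not detected until the job is submitted.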
The anatomy of the script
The first line #! /bin/bash tells Linux that this file is a script which can be run by the BASH shell interpreter.
The following #SBATCH lines request specific cluster resources:
--nodes=2 requests two ARC nodes
--ntasks-per-node=4 requests 4 tasks (processes) per node, each using one core by default, for a total of 8 across the two nodes
--time=00:10:00 requests a run time of 10 minutes (the maximum for the devel partition)
--partition=devel requests that this job runs on the devel partition, which is reserved for testing
module load mpitest The module load command makes an application environment available to your job, in this case the mpitest application.
mpirun mpihello This line runs the mpihello command using the special mpirun wrapper.
Note
MPI is required for multi-process operation across nodes, and may not be appropriate for all applications.
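The resource arithmetic described above (nodes multiplied by tasks per node) can be sketched in a few lines of shell. Inside a running job SLURM sets the environment variables SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE; the fallback values 2 and 4 are assumptions that mirror the #SBATCH lines in this example, so that the snippet also runs outside a job.

```shell
# Inside a job these variables are set by SLURM; outside one we fall back
# to the values requested in the example script (an assumption for this sketch).
nodes=${SLURM_JOB_NUM_NODES:-2}
tasks_per_node=${SLURM_NTASKS_PER_NODE:-4}

# Total number of MPI processes that mpirun will be able to start.
total_tasks=$((nodes * tasks_per_node))
echo "nodes: $nodes, tasks per node: $tasks_per_node, total tasks: $total_tasks"
```

Run outside a job, this reports a total of 8 tasks, matching the request in the example script.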
Submitting the job
Now that you have a submission script, you can submit it to the SLURM resource manager. To do this type the following at the command line:
sbatch submit.sh
SLURM will respond with:
sbatch: CPU resource required, checking settings/requirements...
Submitted batch job nnnnnnn
Where nnnnnnn is your job ID number.
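If you want to reuse the job ID in later commands, it can be pulled out of the response text. The sketch below works on a sample response copied from above, with 2227191 standing in for the job ID; the variable names are hypothetical.

```shell
# Sample sbatch response; 2227191 is a stand-in for a real job ID.
response='sbatch: CPU resource required, checking settings/requirements...
Submitted batch job 2227191'

# Keep only the last field of the "Submitted batch job" line.
job_id=$(printf '%s\n' "$response" | awk '/^Submitted batch job/ { print $NF }')
echo "job ID: $job_id"
```

On the cluster, the sample text would be replaced by the real output, for example response=$(sbatch submit.sh). sbatch also has a --parsable option that prints just the job ID, which avoids the need to parse the message.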
This job should run very quickly, but you may be able to find it in the job queue by typing:
squeue -u $USER
If it is running, you will see something like:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
nnnnnnn devel submit.s ouit0554 R 0:07 2 arc-c[302-303]
If the job is waiting to run (for example, because another user is using the devel nodes) you will see:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
nnnnnnn devel submit.s ouit0554 PD 0:00 2 (None)
The difference is that in the first case the job state (ST) is R for RUNNING, while in the second it is PD for PENDING and the job has not yet been allocated any nodes, so no node names appear in the NODELIST(REASON) column.
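Reading the ST column can also be done in a script. The following is a minimal sketch that assumes the default squeue column layout shown above and a job name without spaces; describe_state is a hypothetical helper, not a SLURM command.

```shell
# describe_state reads squeue-style output on standard input and reports
# the state of the first job listed. It is a hypothetical helper.
describe_state() {
  # Skip the header line, then take column 5 (ST) of the first job line.
  state=$(awk 'NR > 1 { print $5; exit }')
  case "$state" in
    R)  echo "running" ;;
    PD) echo "pending" ;;
    "") echo "not in queue" ;;
    *)  echo "other state: $state" ;;
  esac
}

# Sample listings copied from above, with 2227191 standing in for the job ID.
running='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2227191 devel submit.s ouit0554 R 0:07 2 arc-c[302-303]'
pending='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2227191 devel submit.s ouit0554 PD 0:00 2 (None)'
header_only='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)'

running_msg=$(printf '%s\n' "$running" | describe_state)
pending_msg=$(printf '%s\n' "$pending" | describe_state)
empty_msg=$(printf '%s\n' "$header_only" | describe_state)
echo "$running_msg / $pending_msg / $empty_msg"
```

On the cluster you would pipe the real command into the helper, for example squeue -u $USER | describe_state.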
Job output
When your job completes, i.e. it is no longer showing in the job queue, you should find the SLURM output in a file named slurm-nnnnnnn.out, where nnnnnnn is the job ID of your completed job.
To view this output you can use the Linux cat command, so if our job ID was 2227191, we would use the command:
cat slurm-2227191.out
This would give the output:
Hello world from processor arc-c302, rank 0 out of 8 processors
Hello world from processor arc-c302, rank 1 out of 8 processors
Hello world from processor arc-c302, rank 2 out of 8 processors
Hello world from processor arc-c302, rank 3 out of 8 processors
Hello world from processor arc-c303, rank 4 out of 8 processors
Hello world from processor arc-c303, rank 5 out of 8 processors
Hello world from processor arc-c303, rank 6 out of 8 processors
Hello world from processor arc-c303, rank 7 out of 8 processors
This is the output from running the mpihello application on the 8 cores that we requested. You can see that it ran 4 processes on arc-c302 and 4 on arc-c303.
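For larger jobs, counting processes per node by eye becomes impractical. The sketch below tallies them with awk, assuming the output lines have exactly the format shown above; sample-slurm.out is a hypothetical file name holding a copy of that output.

```shell
# Recreate the job output shown above in a sample file
# (sample-slurm.out is a hypothetical name for this illustration).
cat > sample-slurm.out <<'EOF'
Hello world from processor arc-c302, rank 0 out of 8 processors
Hello world from processor arc-c302, rank 1 out of 8 processors
Hello world from processor arc-c302, rank 2 out of 8 processors
Hello world from processor arc-c302, rank 3 out of 8 processors
Hello world from processor arc-c303, rank 4 out of 8 processors
Hello world from processor arc-c303, rank 5 out of 8 processors
Hello world from processor arc-c303, rank 6 out of 8 processors
Hello world from processor arc-c303, rank 7 out of 8 processors
EOF

# Field 5 is the node name followed by a comma; strip the comma,
# count one per line, then print "node count" pairs in sorted order.
counts=$(awk '{ sub(",", "", $5); n[$5]++ }
              END { for (host in n) print host, n[host] }' sample-slurm.out | sort)
echo "$counts"
```

With a real job you would point awk at the actual output file, for example slurm-2227191.out.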