Submission Script Basics
What is a submission script
A submission script describes the resources you need the resource manager (SLURM) to allocate to your job. It also contains the commands needed to execute the application(s) you wish to run, including any set-up the application(s) may require.
The script needs to be created using a Linux text editor such as nano or vi. We recommend creating and editing submission scripts on the cluster; editing them on a Windows machine can cause problems.
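The text does not say which problems Windows editing causes. One common cause, assumed here, is Windows-style CRLF line endings, which can stop the shell from reading the script correctly. The sketch below shows one way to detect and strip them. The function names has_crlf and strip_crlf and the demo file are hypothetical, not part of any cluster tooling.

```shell
#!/bin/bash
# Sketch: detect and remove Windows (CRLF) line endings in a script.
# Assumption: CRLF endings are the problem; the text does not name the cause.

has_crlf() {
    # Succeeds if any line in the file ends with a carriage return.
    grep -q $'\r$' "$1"
}

strip_crlf() {
    # Remove every carriage return, writing the result back to the same file.
    local tmp
    tmp=$(mktemp) && tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}

# Demo on a throwaway file written with CRLF line endings.
demo=$(mktemp)
printf '#!/bin/bash\r\necho hello\r\n' > "$demo"

if has_crlf "$demo"; then
    echo "CRLF line endings found, converting"
    strip_crlf "$demo"
fi
```

To use this on a real script, call the two functions with the script name, for example has_crlf submit.sh.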
Simple submission script example
In this example we are going to create a submission script to run a test application on the cluster.
Note
If you are not familiar with typing commands into a Linux shell (or terminal window), we suggest you follow an online Linux tutorial such as the one at Ryans Tutorials.
To begin, change directory to your $DATA area and make a new directory named example to work in:
cd $DATA
mkdir example
cd example
The next step is to create the submission script to describe your job to SLURM.
First we use the nano editor to create the file:
nano submit.sh
This command will start the Linux nano editor. You can use nano to add the following lines:
#! /bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00
#SBATCH --partition=devel
module load mpitest
mpirun mpihello
Note
Ensure the #! /bin/bash line is the first line of the script and that every #SBATCH directive starts at the beginning of its line, with no leading spaces.
To exit nano, press CTRL-X, answer Y to the question Save modified buffer?, then press Enter when asked File Name to Write: submit.sh
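Once the file is saved, the two layout rules from the note above (the #! line comes first, and #SBATCH lines start in column one) can be checked mechanically. The following is a minimal sketch; check_submit_script is a hypothetical helper name, not a SLURM or cluster command.

```shell
#!/bin/bash
# Sketch: check the two layout rules for a submission script.
# check_submit_script is a hypothetical helper, not part of SLURM.

check_submit_script() {
    local script=$1 status=0

    # Rule 1: the very first line must be the #! interpreter line.
    if ! head -n 1 "$script" | grep -q '^#!'; then
        echo "$script: first line is not a #! line"
        status=1
    fi

    # Rule 2: no #SBATCH directive may be preceded by spaces or tabs.
    if grep -qE '^[[:space:]]+#SBATCH' "$script"; then
        echo "$script: indented #SBATCH line found"
        status=1
    fi

    return $status
}

# Demo: a copy of the example script from the text passes both checks.
example=$(mktemp)
cat > "$example" <<'EOF'
#! /bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00
#SBATCH --partition=devel
module load mpitest
mpirun mpihello
EOF

if check_submit_script "$example"; then
    echo "layout looks correct"
fi
```

In the working directory, the equivalent call would be check_submit_script submit.sh.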
The anatomy of the script
The first line, #! /bin/bash, tells Linux that this file is a script which can be run by the Bash shell interpreter.
The following #SBATCH lines request specific cluster resources:
--nodes=2
requests two ARC nodes
--ntasks-per-node=4
requests 4 tasks (processes) per node, each using one core by default, for a total of 8 cores
--time=00:10:00
requests a run time of 10 minutes (the maximum for the devel partition)
--partition=devel
requests that this job runs on the devel partition, which is reserved for testing
module load mpitest
The module load command is used to make an application environment available to use in your job, in this case the mpitest application.
mpirun mpihello
This line runs the mpihello command using the special mpirun wrapper.
Note
MPI is required for multi-process operation across nodes, and may not be appropriate for all applications.
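The resource arithmetic described above (2 nodes with 4 tasks per node gives 8 processes) can be reproduced by reading the #SBATCH lines back out of a script. This is only an illustrative sketch: sbatch_value is a hypothetical helper, and it assumes each directive is written in the --option=value form used in the example.

```shell
#!/bin/bash
# Sketch: read #SBATCH values from a script and compute the total task count.
# sbatch_value is a hypothetical helper; it only understands "--option=value".

sbatch_value() {
    # Print the value of the first "#SBATCH --<option>=<value>" line.
    local option=$1 script=$2
    sed -n "s/^#SBATCH --${option}=//p" "$script" | head -n 1
}

# A copy of the example script from the text.
script=$(mktemp)
cat > "$script" <<'EOF'
#! /bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00
#SBATCH --partition=devel
module load mpitest
mpirun mpihello
EOF

nodes=$(sbatch_value nodes "$script")
tasks_per_node=$(sbatch_value ntasks-per-node "$script")
total_tasks=$((nodes * tasks_per_node))

echo "$nodes nodes x $tasks_per_node tasks per node = $total_tasks MPI processes"
```

The total matches the number of "Hello world" lines the job is expected to print, one per MPI process.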
Submitting the job
Now that you have a submission script, you can submit it to the SLURM resource manager. To do this type the following at the command line:
sbatch submit.sh
SLURM will respond with:
sbatch: CPU resource required, checking settings/requirements...
Submitted batch job nnnnnnn
where nnnnnnn is your job ID number.
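If you want to reuse the job ID in later commands, it can be picked out of the sbatch response. The sketch below parses the response text shown above; job_id_from_sbatch is a hypothetical helper name, and the ID 2227191 is only a sample value. On a live system the function would be fed by a pipe from sbatch submit.sh. sbatch also has a --parsable option that prints the job ID without the surrounding text, which may be simpler where it suits.

```shell
#!/bin/bash
# Sketch: extract the job ID from the text that sbatch prints.
# job_id_from_sbatch is a hypothetical helper, not a SLURM command.

job_id_from_sbatch() {
    # Read sbatch's response on stdin and print the last field of the
    # "Submitted batch job" line, which is the job ID.
    awk '/^Submitted batch job/ { print $NF }'
}

# Sample response in the form shown in the text, with a sample job ID.
sample='sbatch: CPU resource required, checking settings/requirements...
Submitted batch job 2227191'

job_id=$(printf '%s\n' "$sample" | job_id_from_sbatch)

echo "Job ID: $job_id"
echo "Output will be written to: slurm-${job_id}.out"
```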
This job should run very quickly, but you may be able to find it in the job queue by typing:
squeue -u $USER
If it is running, you will see something like:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
nnnnnnn devel submit.s ouit0554 R 0:07 2 arc-c[302-303]
If the job is waiting to run (for example, because another user is using the devel nodes) you will see:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
nnnnnnn devel submit.s ouit0554 PD 0:00 2 (None)
The difference is in the ST (state) column. In the first case the job state is R for RUNNING. In the second it is PD for PENDING, and because the job has not yet been allocated any nodes, the NODELIST(REASON) column shows no node names.
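The ST column can also be read by a script, for instance to report whether a job has started. The following sketch works on the default squeue layout shown above and translates only the two state codes described in the text. describe_job_states is a hypothetical helper, the job ID is a sample value, and the field position assumes the job name contains no spaces.

```shell
#!/bin/bash
# Sketch: turn the ST column of squeue's default output into a readable state.
# describe_job_states is a hypothetical helper; it assumes the default column
# layout shown in the text and job names without spaces.

describe_job_states() {
    # Skip the header line, then print "<job id> <state>" for each job.
    awk 'NR > 1 {
        state = $5                           # ST is the fifth column
        if (state == "R")  state = "RUNNING"
        if (state == "PD") state = "PENDING"
        print $1, state
    }'
}

# Sample squeue output in the form shown in the text, with a sample job ID.
running='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2227191 devel submit.s ouit0554 R 0:07 2 arc-c[302-303]'

pending='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2227191 devel submit.s ouit0554 PD 0:00 2 (None)'

printf '%s\n' "$running" | describe_job_states
printf '%s\n' "$pending" | describe_job_states
```

On the cluster, the input would come from squeue -u $USER piped into the function. Any other state code is printed unchanged.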
Job output
When your job completes, i.e. it is no longer showing in the job queue, you should find the SLURM output in a file named slurm-nnnnnnn.out, where nnnnnnn is the job ID of your completed job.
To view this output you can use the Linux cat command. For example, if our job ID was 2227191, we would use the command:
cat slurm-2227191.out
This would give the output:
Hello world from processor arc-c302, rank 0 out of 8 processors
Hello world from processor arc-c302, rank 1 out of 8 processors
Hello world from processor arc-c302, rank 2 out of 8 processors
Hello world from processor arc-c302, rank 3 out of 8 processors
Hello world from processor arc-c303, rank 4 out of 8 processors
Hello world from processor arc-c303, rank 5 out of 8 processors
Hello world from processor arc-c303, rank 6 out of 8 processors
Hello world from processor arc-c303, rank 7 out of 8 processors
This is the output from running the mpihello application on the 8 CPUs that we requested. You can see that it ran with 4 processes on arc-c302 and 4 on arc-c303.
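For larger jobs, counting the processes per node by eye becomes impractical. As a rough sketch, the per-node counts can be tallied from the output file with awk. count_ranks_per_node is a hypothetical helper, and it assumes every line has exactly the "Hello world from processor ..." form shown above.

```shell
#!/bin/bash
# Sketch: count how many MPI processes reported from each node.
# count_ranks_per_node is a hypothetical helper; it assumes the exact
# "Hello world from processor <node>, rank ..." line format shown in the text.

count_ranks_per_node() {
    # Field 5 is the node name followed by a comma; strip the comma and tally.
    awk '/^Hello world from processor/ {
        node = $5
        sub(/,$/, "", node)
        count[node]++
    }
    END { for (node in count) print node, count[node] }' "$1" | sort
}

# Demo: rebuild the sample output from the text in a temporary file.
out=$(mktemp)
for rank in 0 1 2 3 4 5 6 7; do
    if [ "$rank" -lt 4 ]; then node=arc-c302; else node=arc-c303; fi
    echo "Hello world from processor $node, rank $rank out of 8 processors" >> "$out"
done

count_ranks_per_node "$out"
```

With a real job, the argument would be the output file itself, for example count_ranks_per_node slurm-2227191.out.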