Batch System
The batch system of the hydra cluster is provided by SLURM.
Using a batch system means that you submit a job to the batch system which requests certain resources of the compute servers, e.g., how many processors or how much memory should be used. If the requested resources are available, the job is started; otherwise it waits until the resources become available.
All batch commands have to be executed on the front end of the batch system, i.e., hydra. Hence, you first have to log into hydra to access the compute nodes of the batch system. Direct login to the batch nodes is provided for job maintenance only and limited to 10 min.
The batch servers comprise different computer systems, e.g., different CPUs, different amounts of main memory, and some with accelerator cards. This leads to different partitions for running jobs (see below).
Submitting a Job
sbatch
Executes all commands contained in a batch script on the compute servers based on the requested resources.
A minimal batch script looks like:
#!/bin/bash
mycommand arg1 arg2
After saving the file, e.g., as mybatchjob, you can submit it to the batch system with:
sbatch mybatchjob
The default resources for the job are one CPU core on one compute server with 8 GB of memory and up to 48 hours runtime.
Resources are requested by one of the following options:
| Option | Resource Request |
|---|---|
| `--nodes=m` | request m nodes (compute servers) |
| `--ntasks-per-node=m` | request m tasks per node |
| `--cpus-per-task=m` | request m CPU cores per task |
| `--hint=nomultithread` | disable hyperthreading |
| `--exclusive` | get exclusive access to compute nodes |
| `--mem=m` | request m MB of memory per node (m=0 requests all memory of the node) |
| `--mem-per-cpu=m` | request m MB of memory per CPU |
| `--partition=name` | request job execution on partition name |
To start a job with one task on one node with 4 CPU cores and up to 64 GB of memory, add the following arguments when submitting the job:
sbatch --nodes=1 --ntasks-per-node=1 --cpus-per-task=4 --mem=65536 mybatchjob
Parameters describing job resources can also be placed in the script by using the prefix `#SBATCH`:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=65536
#SBATCH --partition=bdw
mycommand arg1 arg2
This again starts a job with one task on one node with 4 CPU cores and up to 64 GB of memory. In addition, the job is placed in the bdw partition (see below).
You may provide additional options for your job, e.g., a job name or email notification:

| Option | Description |
|---|---|
| `--job-name=name` | set job name |
| `--mail-type=type` | send email at the given job events, e.g., BEGIN, END, FAIL, or ALL |
| `--mail-user=address` | set email address for notifications |
Please make sure that you specify a full email address, e.g., user@mis.mpg.de. Otherwise this will not work.
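As a short sketch, a batch script using these options might look like the following (job name, email address, and notification events are just placeholders):
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=user@mis.mpg.de
mycommand arg1 arg2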
Partitions
The batch system puts the nodes into several partitions:
| Partition | Nodes | Hardware |
|---|---|---|
| bdw | bdw01..08 | 2x20 cores Intel Broadwell, 512 GB RAM |
| epyc | epyc01..02 | 2x64 cores AMD Epyc Rome, 512 GB RAM |
| cuda | cuda01..04 | one or two NVidia accelerator cards |
Please note that the default partition is bdw!
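For example, to run the minimal batch script from above on one of the Epyc nodes instead of the default partition:
sbatch --partition=epyc mybatchjob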
Interactive Jobs
You may also request an interactive job, i.e., a command line on one of the batch nodes via:
srun --pty -u bash -i
All of the above parameters for `sbatch` are also available (and should be used!) for interactive sessions.
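As a sketch, an interactive session with 4 CPU cores and 16 GB of memory on the epyc partition (the resource values are just an example) could be requested as:
srun --partition=epyc --cpus-per-task=4 --mem=16384 --pty -u bash -i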
Please note that graphical applications are currently not supported in interactive sessions. For this, please use the interactive compute servers.
Array jobs
Jobs with identical parameters, so-called array jobs, may also be submitted. For this, the `--array` parameter of the `sbatch` command is available. It expects an array specification, which is either a list of array indices:
sbatch --array 0,1,2,3,4 ...
or a range specifier:
sbatch --array 0-16:4 ...
The step width (`:4`) is optional and defaults to 1.
All of the above may also be combined:
sbatch --array 0-16:4,32 ...
Within the batch script the individual tasks of the array job may be distinguished by using the SLURM_ARRAY_TASK_ID environment variable.
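For example, a sketch of an array job script in which each task picks its own input file via the task ID (the command and file names are placeholders):
#!/bin/bash
#SBATCH --array=0-15
# each array task processes a different input file
mycommand input-${SLURM_ARRAY_TASK_ID}.dat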
CPU affinity
Affinity of user programs to CPU cores is set by SLURM as requested by the user with the above options. The following table contains some typical configurations:
| Configuration | Arguments |
|---|---|
| 1 task/node, 2 CPUs/task, w/o HT | `--ntasks-per-node=1 --cpus-per-task=40 --hint=nomultithread` |
| 1 task/node, 2 CPUs/task, w/ HT | `--ntasks-per-node=1 --cpus-per-task=80 --hint=multithread` |
| 2 tasks/node, 1 CPU/task, w/ HT | `--ntasks-per-node=2 --cpus-per-task=40 --hint=multithread` |
| 2 tasks/node, 1 CPU/task, w/o HT | `--ntasks-per-node=2 --cpus-per-task=20 --hint=nomultithread` |
The values for `--cpus-per-task` correspond to nodes in the bdw partition. For the epyc partition you should set `--cpus-per-task` to 128.
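As a sketch, a batch script requesting a full bdw node (2x20 cores) for a single task without hyperthreading might contain:
#!/bin/bash
#SBATCH --partition=bdw
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --hint=nomultithread
mycommand arg1 arg2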
Job Control
squeue
To view the currently allocated/running jobs, the `squeue` command is available. By default all jobs will be shown. To limit the output to the jobs of a specific user, use the parameter `-u`:
squeue -u <username>
scancel
The command `scancel` cancels specific jobs identified by their job id, which is either printed while submitting your job or shown in the output of `squeue`:
scancel <jobid>
To cancel all jobs of a user, again use the parameter `-u`:
scancel -u <username>
sinfo
Shows various information about the state of the partitions.
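For example, to show only the state of a single partition (the partition name is just an example):
sinfo --partition=cuda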
MPI
The default MPI implementation on the compute servers is OpenMPI v4.1. All programs installed via default Linux packages will use this.
However, the recommended MPI implementation is the Intel MPI library, available via our module system:
module load impi
This only applies to programs and libraries compiled by yourself, not to the default packaged programs. An example batch script looks like:
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=20
#SBATCH --mem=65536
#SBATCH --partition=bdw
module purge
module load impi
srun my-program arg1 arg2 ...
This launches a job with 4 MPI processes (ranks), each process (rank) using 20 CPU cores, with 64 GB of memory available per node.
CUDA
The cuda nodes have different accelerator cards installed (see Hardware). To run on any of them, just choose the cuda partition:
sbatch -p cuda ...
You can also choose a specific GPU type for your job with the `--gres` parameter, which can be v100, titanv, or a100, together with the number of GPUs requested:
sbatch -p cuda --gres gpu:v100:1 ...
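The same request can also be placed in the batch script itself; as a sketch (the GPU type and count are just an example):
#!/bin/bash
#SBATCH --partition=cuda
#SBATCH --gres=gpu:a100:1
mycommand arg1 arg2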
Tensorflow
Tensorflow requires specific combinations of CUDA and cuDNN versions. It is therefore recommended to use virtual environments for Tensorflow.
For Tensorflow v2.13:
module load cuda/11.8
module load cudnn/8.6
python3 -m venv tf-2.13
tf-2.13/bin/pip install tensorflow==2.13
tf-2.13/bin/pip install nvidia-cudnn-cu11==8.6.0.163
For Tensorflow v2.14:
module load cuda/11.8
module load cudnn/8.7
python3 -m venv tf-2.14
tf-2.14/bin/pip install tensorflow==2.14
tf-2.14/bin/pip install nvidia-cudnn-cu11==8.7.0.84
Note
To list the available versions of nvidia-cudnn-cu11, run:
pip install --use-deprecated=legacy-resolver nvidia-cudnn-cu11==
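A sketch of a batch script that runs a Tensorflow program from such a virtual environment on a GPU node (the script name train.py is a placeholder, and the venv path assumes the environment was created as above in the working directory):
#!/bin/bash
#SBATCH --partition=cuda
#SBATCH --gres=gpu:1
#SBATCH --mem=32768
module load cuda/11.8
module load cudnn/8.6
tf-2.13/bin/python train.py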
Matlab
To run Matlab programs using the batch system, copy the following lines into your batch script:
source /etc/profile.d/modules.sh
source /etc/profile.d/opt_local_modules.sh
module load matlab
matlab -nosplash -nodesktop -nojvm -r "run('myprogram.m');quit"
where you replace myprogram.m with the Matlab file containing your instructions.
Don’t forget to add additional resource requests for the number of tasks, CPU cores or main memory as described above.
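Putting it all together, a complete Matlab batch script might look like the following sketch (the resource values are only an example):
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16384
source /etc/profile.d/modules.sh
source /etc/profile.d/opt_local_modules.sh
module load matlab
matlab -nosplash -nodesktop -nojvm -r "run('myprogram.m');quit"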