ORCA¶
Versions available¶
Supported versions¶
Note
The versions of ORCA installed in the software repository are the binary distributions provided by the developers of the software. They are not compiled by the Discoverer HPC support team!
To check which ORCA versions are currently supported on Discoverer, execute on the login node:
module avail orca/
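Once you know which version you need, load it with module load before use. The version string below is only an example; pick one of the versions reported by module avail:
module load orca/5/latest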
User-supported installations¶
Important
Users are welcome to bring and install ORCA within their scratch folders, but those installations will not be supported by the Discoverer HPC team.
Running simulations¶
Running a simulation means invoking orca to process the instructions and data given in the input file.
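For illustration only, a minimal ORCA input file could look like the sketch below; the method, basis set, and geometry are arbitrary placeholders, not a recommendation:
# water.inp - minimal single-point calculation (illustrative only)
! B3LYP def2-SVP
# Number of MPI processes, see Notes on the parallelization
%PAL NPROCS 16 END
* xyz 0 1
O   0.000000   0.000000   0.000000
H   0.000000   0.000000   0.970000
H   0.940000   0.000000  -0.240000
*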
Warning
You MUST NOT run simulations directly on the login node (login.discoverer.bg). Run your simulations as Slurm jobs only.
Warning
Write your results only to your Personal scratch and storage folder (/discofs/username) and DO NOT, under any circumstances, use your Home folder (/home/username) for that purpose!
Parallel (MPI)¶
To run ORCA in parallel mode, use the following Slurm batch template:
#!/bin/bash
#
#SBATCH --partition=cn # Partition name
## ask Discoverer HPC support team for
## clarification
#SBATCH --job-name=orca # Job Name
#SBATCH --time=01:00:00 # WallTime
#SBATCH --nodes 2 # Number of nodes
#SBATCH --ntasks-per-node 128 # Number of MPI processes per node
#SBATCH --ntasks-per-core 1 # Do not change this!
#SBATCH -o slurm.%j.out # STDOUT
#SBATCH -e slurm.%j.err # STDERR
module purge
module load orca/5/latest
export UCX_NET_DEVICES=mlx5_0:1
cd $SLURM_SUBMIT_DIR
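# ORCA must be invoked via its full path when running in parallel,
# hence the use of which below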
ORCA=`which orca`
$ORCA /path/to/your/input.inp
Specify the parameters and resources required for successfully running and completing the job:
- the Slurm partition of compute nodes, based on your project resource reservation (--partition)
- the job name, under which the job will be seen in the queue (--job-name)
- the wall time for running the job (--time)
- the number of occupied compute nodes (--nodes), see Notes on the parallelization
- the number of MPI processes per node (--ntasks-per-node), see Notes on the parallelization
- you may omit --nodes and --ntasks-per-node and rely entirely on --ntasks to let Slurm distribute the tasks across the nodes (see the sketch after this list)
- the version of ORCA to run, given after module load (see Supported versions)
- do not change the export declarations unless you are told to do so
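For reference, a minimal sketch of the alternative that relies on --ntasks alone; the value below simply mirrors the 2 x 128 tasks from the template above and is only an example:
#SBATCH --ntasks 256            # Total number of MPI processes (Slurm decides the node placement)
#SBATCH --ntasks-per-core 1     # Do not change this!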
Attention
Carefully check the parameters above before submitting the job, in particular that the total number of MPI tasks requested from Slurm matches the NPROCS value declared in your ORCA input file (see Notes on the parallelization).
Save the complete Slurm job description as a file, for example /discofs/$USER/run_orca/run_orca_mpi.batch, and submit it to the queue:
cd /discofs/$USER/run_orca
sbatch run_orca_mpi.batch
Upon successful submission, the standard output will be directed into the file /discofs/$USER/run_orca/slurm.%j.out (where %j stands for the Slurm job ID).
Serial (no MPI)¶
Running ORCA in serial is rather rare. It may be helpful for very short jobs and for calibration. Below is a simple template for a Slurm job that runs a serial ORCA simulation:
#!/bin/bash
#
#SBATCH --partition=cn # Partition name
## ask Discoverer HPC support team for
## clarification
#SBATCH --job-name=orca # Job Name
#SBATCH --time=01:00:00 # WallTime
#SBATCH --nodes 1 # Number of nodes
#SBATCH -o slurm.%j.out # STDOUT
#SBATCH -e slurm.%j.err # STDERR
module purge
module load orca/5/latest
cd $SLURM_SUBMIT_DIR
ORCA=`which orca`
$ORCA /path/to/your/input.inp
Modify the parameters to specify the required resources for running the job:
- the Slurm partition of compute nodes, based on your project resource reservation (--partition)
- the job name, under which the job will be seen in the queue (--job-name)
- the wall time for running the job (--time)
Save the complete Slurm job description as a file, for example /discofs/$USER/run_orca/run_orca_serial.batch, and submit it to the queue:
cd /discofs/$USER/run_orca
sbatch run_orca_serial.batch
Upon successful submission, the standard output will be directed into the file /discofs/$USER/run_orca/slurm.%j.out (where %j stands for the Slurm job ID).
Notes on the parallelization¶
Warning
It is absolutely necessary that the number of parallel MPI processes declared in the ORCA input file (assigned to NPROCS) matches the resources specified in the Slurm batch job description as a combination of nodes and ntasks-per-node. In other words, NPROCS = nodes x ntasks-per-node.
For example, if the ORCA input file contains the declaration:
%PAL NPROCS 48 END
a total of 48 MPI processes has to be requested in the Slurm batch job description. Here comes the complexity: one might allocate those 48 MPI processes on a single compute node:
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 48
or multiple compute nodes:
#SBATCH --nodes 2
#SBATCH --ntasks-per-node 24
Each of the compute nodes operated by Discoverer HPC has 128 CPU cores (see Resource Overview). Unfortunately, running 128 MPI processes on one node might not be as effective as expected, especially if the amount of memory allocated for each MPI process is large.
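Note that the memory available to each MPI process is controlled by ORCA itself through the %maxcore setting in the input file (megabytes per core); the value below is purely illustrative:
# ~3000 MB of memory per MPI process (illustrative value)
%maxcore 3000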
Based on previous experience, most ORCA jobs run efficiently with 16 MPI processes per compute node.
Important
Each particular ORCA job configuration should be tested for productivity through a series of benchmarks. Users are advised to start with 1 node and 16 MPI processes, then run the same job on 2 nodes with 16 MPI processes per node, and so on.
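For instance, keeping NPROCS = nodes x ntasks-per-node consistent, the first two runs of such a benchmark series could be requested like this (illustrative combinations only):
# Run 1: 1 node x 16 MPI processes  -> ORCA input: %PAL NPROCS 16 END
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 16

# Run 2: 2 nodes x 16 MPI processes -> ORCA input: %PAL NPROCS 32 END
#SBATCH --nodes 2
#SBATCH --ntasks-per-node 16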
Getting help¶
See Getting help