WRF-SFIRE (self-support)¶
About¶
WRF-SFIRE is a fire forecast package based on the WRF-ARW modelling system.
Versions supported¶
Note
We offer support for the compilation and execution of all WRF-SFIRE versions based on WRF 4, provided that they are compiled against the bundle of libraries provided by our support team. Users are responsible for compiling the WRF-SFIRE code themselves (we call this “self-support”). If they do not succeed in doing that, they should refer to Getting help.
Supported versions of the bundle¶
We provide support for building and running WRF-SFIRE based on a bundle of external libraries that have been compiled and tested by our support team. Users of Discoverer HPC partitions can compile and run the WRF-SFIRE code by themselves, using that bundle as middleware. Such a method of building and using WRF-SFIRE should be considered self-supporting.
Warning
Here, the use of the term “self-support” implies that the users themselves are accountable for compiling the WRF-SFIRE code using the assortment of external libraries provided and supported by the Discoverer HPC support team. The latter means that the Discoverer HPC support team assumes liability only for problems caused by the libraries included in the bundle.
The bundle for supporting the WRF-SFIRE code compilation and execution contains the following optimized and tested libraries:
- zlib
- libaec (szip)
- lz4
- lzma
- bzip2
- zstd
- hdf5
- netcdf-c
- netcdf-fortran
The library installations have been compiled and optimized to run efficiently on the AMD Zen2 CPU microarchitecture. That largely ensures that WRF-SFIRE code linked against the external libraries in the bundle runs as fast as possible on Discoverer’s compute nodes.
To check which versions of the WRF-SFIRE code are covered by the bundle, execute on the login node:
module avail WRF-SFIRE/
In the list of modules displayed, you can find the module that covers your version precisely. For instance:
- WRF-SFIRE/bundle-support-4.4-S0.1-gcc-base-openmpi4
means that this particular bundle covers WRF-SFIRE 4.4-S0.1 and is built using the baseline GCC against Open MPI v4. It should be noted that this module provides access to the library bundle, which may cover all version 4 lines of the WRF-SFIRE code.
Upon loading, the module, apart from other adjustments to the current shell environment, defines the following variables:
- NETCDF
- HDF5_PLUGIN_PATH
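For instance, after loading the module listed above you can verify that those variables are set and point inside the bundle tree (a quick check; the exact paths depend on the bundle version):
$ module load WRF-SFIRE/bundle-support-4.4-S0.1-gcc-base-openmpi4
$ echo "${NETCDF}"            # should point to the root of the bundle installation tree
$ echo "${HDF5_PLUGIN_PATH}"  # should point to the HDF5 compression plugins shipped with the bundle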
Compiling the WRF-SFIRE code¶
Important
Load the environment module that provides access to the bundle with external libraries (see above).
Use git to clone the WRF-SFIRE code, including the submodules:
$ git clone https://github.com/openwfm/WRF-SFIRE.git
$ cd WRF-SFIRE
$ git submodule update --init --recursive
Afterwards, load the Open MPI 4.1.6 environment module and generate the build configuration file:
$ module load openmpi/4/gcc/4.1.6
$ echo "35" | ./configure
Finally, select the case you want to compile. For instance, if you need to compile em_fire, execute:
$ ./compile -j 4 em_fire
Upon successful compilation, copy the executables and all necessary input files into the folder you selected for running the simulations.
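A minimal sketch of that staging step (the run folder and the case directory below are hypothetical; adjust them to your own layout and to the case you compiled):
$ mkdir -p /discofs/${USER}/wrf-sfire-run                          # run folder in your scratch space
$ cp main/*.exe /discofs/${USER}/wrf-sfire-run/                    # wrf.exe plus ideal.exe/real.exe, depending on the case
$ cp test/em_fire/<your_case>/* /discofs/${USER}/wrf-sfire-run/    # namelists and input files of the selected case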
Running WRF-SFIRE¶
Warning
You MUST NOT execute simulations directly upon the login node (login.discoverer.bg). You have to run your simulations as Slurm jobs only.
Warning
Write the results only inside your Personal scratch and storage folder (/discofs/username) and DO NOT use for that purpose (under any circumstances) your Home folder (/home/username)!
To run real.exe and wrf.exe as a Slurm batch job, you may use the following template (be sure the correct bundle is loaded; the default one is loaded below):
#!/bin/bash
#
#SBATCH --partition=cn # Name of the partition of nodes (as advised by the support team)
#SBATCH --job-name=wrf_sfire
#SBATCH --time=02:50:00 # Requested wall time (this example job completes in ~ 6 min)
#SBATCH --nodes 2 # Two nodes will be used
#SBATCH --ntasks-per-node 128 # Use all 128 CPU cores on each node
#SBATCH --ntasks-per-core 1 # Run only one MPI process per CPU core
#SBATCH --cpus-per-task 2 # Number of OpenMP threads per MPI process
# That means Shared Memory parallelism is involved.
#SBATCH --account=<your_slurm_account_name>
#SBATCH --qos=<the_qos_name_you_want_to_follow>
#SBATCH -o slurm.%j.out # STDOUT
#SBATCH -e slurm.%j.err # STDERR
ulimit -Hs unlimited
ulimit -Ss unlimited
module purge
module load WRF-SFIRE/bundle-support-4.4-S0.1-gcc-base-openmpi4
module load openmpi/4/gcc/4.1.6
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PROC_BIND=false
export OMP_SCHEDULE='STATIC'
export OMP_WAIT_POLICY='ACTIVE'
export UCX_NET_DEVICES=mlx5_0:1
export PATH=/path/to/your/wrf-sfire/folder/with/executables:${PATH}
mpirun -np 4 real.exe # Preprocess the input NC data sets; here you do not
                      # need to request more than 4 MPI tasks.
                      # Run it only in case you need that preprocessing to take place.
mpirun wrf.exe # Run the actual simulation
Specify the parameters and resources required for successfully running and completing the job:
- Slurm partition of compute nodes, based on your project resource reservation (--partition)
- job name, under which the job will be seen in the queue (--job-name)
- wall time for running the job (--time)
- number of occupied compute nodes (--nodes)
- number of MPI processes per node (--ntasks-per-node)
- number of threads (OpenMP threads) per MPI process (--cpus-per-task)
Note
The requested number of MPI processes per node should not be greater than 128 (128 is the number of CPU cores per compute node, see Resource Overview).
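For reference, with the template above Slurm allocates 2 nodes × 128 tasks per node, so mpirun wrf.exe should start 2 × 128 = 256 MPI processes in total, each of them running 2 OpenMP threads (--cpus-per-task, exported as OMP_NUM_THREADS).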
You need to submit the Slurm batch job script to the queue from within the folder where the input NC and namelist.input files reside. Check the provided working example (see below) to find more details about how to create a complete Slurm batch job script for running WRF-SFIRE.
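A minimal submission sketch (the run folder and the script name below are hypothetical):
$ cd /discofs/${USER}/wrf-sfire-run   # the folder holding namelist.input and the input NC files
$ sbatch wrf_sfire.sbatch             # the batch job script based on the template above
$ squeue -u ${USER}                   # check the status of the submitted job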
Why do most users fail to compile WRF-SFIRE against NetCDF multi-package installations?¶
Why do so many HPC users fail to compile the WRF and WRF-SFIRE code on Discoverer, or on any other system that provides separate installations of NetCDF-C and NetCDF-Fortran? It is not exactly their own fault. The problem arises from the implementation of a rather old concept for pointing the configure script of WRF and WRF-SFIRE to the actual NetCDF installation. According to that concept, the NetCDF installation consists of a single tree that hosts both the C and the Fortran NetCDF headers and libraries in one place. On the other hand, the maintainers of the NetCDF package decided long ago to divide NetCDF into packages with distinct language affinity, namely NetCDF-C, NetCDF-Fortran, and NetCDF-C++, each of which is currently maintained separately, with its own version and its own installer. The latter means that the usual way NetCDF-C and NetCDF-Fortran become installed, not only on Discoverer but anywhere an optimized HPC software collection is maintained, is based on two separate trees, one for each package. For instance, if we follow the default installation procedure adopted for NetCDF-C, NetCDF-Fortran, and NetCDF-C++, we get one installation tree per package:
/opt/software/netcdf-c/4/4.9.2-gcc-openmpi
/opt/software/netcdf-fortran/4/4.6.1-gcc-openmpi
/opt/software/netcdf-cxx4/4/4.3.1-gcc-openmpi
But the configure script requires the NETCDF variable to be given a single path, like this:
$ export NETCDF=/opt/software/netcdf
which means we cannot employ the multipackage installation of NetCDF for that purpose.
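To illustrate (using the hypothetical installation paths listed above, and assuming the libraries live under lib/), pointing NETCDF at the NetCDF-C tree alone leaves the Fortran part out of reach of the build:
$ export NETCDF=/opt/software/netcdf-c/4/4.9.2-gcc-openmpi   # only the NetCDF-C tree
$ ls ${NETCDF}/lib | grep libnetcdff                         # libnetcdff (NetCDF-Fortran) is missing here, so the WRF build fails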
One natural workaround here is to install at least NetCDF-C and NetCDF-Fortran in the same destination folder, for instance:
/opt/software/WRF-SFIRE/bundle/4/4.4-S0.1
This is a bundle that can be used to successfully compile and link programming code against that particular mix of different packages. WRF and WRF-SFIRE are examples of this type of code. In fact, if one needs to compile WRF and WRF-SFIRE against NetCDF relying solely on the packages brought to the system by the Linux packaging system, it is usually sufficient to specify:
$ export NETCDF=/usr
Sadly, the HPC software environment is rather complex, and it is rare to rely on the NetCDF packages that come with the Linux distribution. It is not that those packages are wrong; they just might not be in the required versions, or may not be compiled and optimized to match certain requirements important for running applications productively on the HPC compute nodes. And here comes the HPC way of installing packages: they are compiled and optimized to match specific jobs, maintained as separate installations, and accessible as such. Furthermore, they are accessible through environment modules as separate installation trees. In other words, if someone needs access to a highly optimized and modern version of a certain library, they will find it as a separately maintained installation, not as part of the collection in /usr.
Here comes the role of the bundles of libraries. They provide an environment that mixes certain packages to match the requirements of external library support demanded by specific build or run-time environments. Applied to the compilation of WRF and WRF-SFIRE, that implies a bundle that contains both NetCDF-C and NetCDF-Fortran, for instance:
$ export NETCDF=/opt/software/WRF-SFIRE/bundle/4/4.4-S0.1
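A quick way to confirm that such a bundle is complete for the WRF-SFIRE build (a sketch; the exact layout of the tree may differ) is to check that both the C and the Fortran artefacts sit under the same root:
$ ls ${NETCDF}/include | grep -E 'netcdf\.h|netcdf\.mod'    # the C header and the Fortran module file
$ ls ${NETCDF}/lib | grep -E 'libnetcdf\.|libnetcdff\.'     # both libnetcdf (C) and libnetcdff (Fortran)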
Of course, one may patch the configure and compile scripts and introduce two separate environment variables for handling separate installations of NetCDF-C and NetCDF-Fortran. Nonetheless, that dual approach is not included in the official distribution of the WRF and WRF-SFIRE code.
Getting help¶
See Getting help