WRF-SFIRE (self-support)
========================

.. toctree::
   :maxdepth: 1
   :caption: Contents:

About
-----

`WRF-SFIRE`_ is a fire forecast package based on the `WRF-ARW modelling system`_.

Versions supported
------------------

.. note::

   We offer support for the compilation and execution of all WRF-SFIRE versions based on WRF 4, provided that they are compiled against the bundle of libraries provided by our support team. The users are responsible for the compilation of the WRF-SFIRE code (we call this "self-support"). If they fail in doing that, they should ask for :doc:`help`.

Supported versions of the bundle
................................

We provide support for building and running WRF-SFIRE based on a bundle of external libraries that have been compiled and tested by our support team. Users of Discoverer HPC partitions can compile and run the WRF-SFIRE code by themselves, using that bundle as middleware. Such a method of building and using WRF-SFIRE should be considered self-supporting.

.. warning::

   Here, the use of the term "self-support" implies that the users themselves are accountable for compiling the code of WRF-SFIRE using the assortment of external libraries provided and supported by the Discoverer HPC support team. The latter means that the Discoverer HPC support team assumes liability only for problems caused by the libraries included in the bundle.

The bundle for supporting the WRF-SFIRE code compilation and execution contains the following optimized and tested libraries:

* zlib
* libaec (szip)
* lz4
* lzma
* bzip2
* zstd
* hdf5
* netcdf-c
* netcdf-fortran

The library installations have been compiled and optimized to run efficiently on the AMD Zen2 CPU microarchitecture. That largely ensures that WRF-SFIRE code which relies on linking to the external libraries in the bundle runs as fast as possible on Discoverer's compute nodes.

To check which versions of the WRF-SFIRE code are covered by the bundle, execute on the login node:

.. code-block:: bash

   module avail WRF-SFIRE/

In the list of modules provided, you can find which module covers your version precisely. For instance:

* WRF-SFIRE/bundle-support-4.4-S0.1-gcc-base-openmpi4

means that this particular bundle covers WRF-SFIRE 4.4-S0.1 and is built using the baseline GCC against Open MPI v4. It should be noted that this module facilitates access to the library bundle, which may cover the entire version 4 line of the WRF-SFIRE code.

Upon loading, the module, apart from other adjustments to the current shell environment, defines the following variables:

* NETCDF
* HDF5_PLUGIN_PATH

Compiling the WRF-SFIRE code
............................

.. important::

   Load the environment module that provides access to the bundle with external libraries (see above).

Use git to clone the WRF-SFIRE code, including the submodules:

.. code-block:: bash

   $ git clone https://github.com/openwfm/WRF-SFIRE.git
   $ cd WRF-SFIRE
   $ git submodule update --init --recursive

Afterwards, load the Open MPI 4.1.6 environment module and generate the build configuration file:

.. code-block:: bash

   $ module load openmpi/4/gcc/4.1.6
   $ echo "35" | ./configure

Finally, select the case you want to compile. For instance, if you need to compile ``em_fire``, execute:

.. code-block:: bash

   $ ./compile -j 4 em_fire

Upon successful compilation, bring the executables and all necessary input files into a folder you selected for running the simulations.
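As an illustration, below is a minimal sketch of staging such a run folder. It assumes the usual WRF source layout, in which the compiled executables end up under ``main/``, and it uses placeholder paths that you should replace with your own :doc:`scratchfolder` location and case files:

.. code-block:: bash

   $ RUNDIR=/path/to/your/scratch/wrf-sfire-run         # placeholder: your run folder inside the scratch space
   $ mkdir -p "${RUNDIR}"
   $ cp main/*.exe "${RUNDIR}/"                         # wrf.exe plus the preprocessor built for your case (real.exe or ideal.exe)
   $ cp /path/to/your/case/namelist.input "${RUNDIR}/"  # placeholder: the namelist and input NC files of your case
   $ cd "${RUNDIR}"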
Running WRF-SFIRE
.................

.. warning::

   **You MUST NOT execute simulations directly upon the login node (login.discoverer.bg).** You have to run your simulations as Slurm jobs only.

.. warning::

   Write the results only inside your :doc:`scratchfolder` and DO NOT use for that purpose (under any circumstances) your :doc:`homefolder`!

To run ``real.exe`` and ``wrf.exe`` as a Slurm batch job, you may use the following template (be sure the correct bundle is loaded; the default one is loaded below):

.. code:: bash

   #!/bin/bash
   #
   #SBATCH --partition=cn         # Name of the partition of nodes (ask the support team)
   #SBATCH --job-name=wrf_sfire
   #SBATCH --time=02:50:00        # The job completes in ~ 6 min
   #SBATCH --nodes 2              # Two nodes will be used
   #SBATCH --ntasks-per-node 128  # Use all 128 CPU cores on each node
   #SBATCH --ntasks-per-core 1    # Run only one MPI process per CPU core
   #SBATCH --cpus-per-task 2      # Number of OpenMP threads per MPI process
                                  # That means Shared Memory parallelism is involved.
   #SBATCH --account=
   #SBATCH --qos=
   #SBATCH -o slurm.%j.out        # STDOUT
   #SBATCH -e slurm.%j.err        # STDERR

   ulimit -Hs unlimited
   ulimit -Ss unlimited

   module purge
   module load WRF-SFIRE/bundle-support-4.4-S0.1-gcc-base-openmpi4
   module load openmpi/4/gcc/4.1.6

   export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
   export OMP_PROC_BIND=false
   export OMP_SCHEDULE='STATIC'
   export OMP_WAIT_POLICY='ACTIVE'

   export UCX_NET_DEVICES=mlx5_0:1

   export PATH=/path/to/your/wrf-sfire/folder/with/executables:${PATH}

   mpirun -np 4 real.exe  # Process the input NC data sets; here you do not
                          # need to request the use of more than 4 MPI tasks.
                          # Use it in case you need that preprocessing to take place.

   mpirun wrf.exe         # Run the actual simulation

Specify the parameters and resources required for successfully running and completing the job:

- Slurm partition of compute nodes, based on your project resource reservation (``--partition``)
- job name, under which the job will be seen in the queue (``--job-name``)
- wall time for running the job (``--time``)
- number of occupied compute nodes (``--nodes``)
- number of MPI processes per node (``--ntasks-per-node``)
- number of threads (OpenMP threads) per MPI process (``--cpus-per-task``)

.. note::

   The requested number of MPI processes per node should not be greater than 128 (128 is the number of CPU cores per compute node, see :doc:`resource_overview`).

You need to submit the Slurm batch job script to the queue from within the folder where the input NC and ``namelist.input`` files reside.

Check the provided working example (see below) to find more details about how to create a complete Slurm batch job script for running WRF-SFIRE.

Why do most users fail to compile WRF-SFIRE against NetCDF multi-package installations?
-----------------------------------------------------------------------------------------

Why do so many HPC users fail to compile the WRF and WRF-SFIRE code on Discoverer, or on any other system that provides separate installations of NetCDF-C and NetCDF-Fortran? It is not exactly their own fault.

The problem arises from the rather old concept implemented in the ``configure`` script of WRF and WRF-SFIRE for locating the actual NetCDF installation. According to that concept, the NetCDF installation consists of a single tree that hosts both the C and Fortran NetCDF headers and libraries in one place.
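For illustration only, with a hypothetical installation prefix, such a single tree is expected to provide the C and Fortran headers and libraries side by side:

.. code-block:: bash

   # Hypothetical single-tree NetCDF installation, as assumed by the configure script
   /opt/software/netcdf/include   # netcdf.h (C) next to netcdf.inc and netcdf.mod (Fortran)
   /opt/software/netcdf/lib       # libnetcdf (C) next to libnetcdff (Fortran)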
The NetCDF maintainers, on the other hand, decided long ago to split NetCDF into packages with distinct compiler affinity, namely NetCDF-C, NetCDF-Fortran, and NetCDF-C++, each of which is currently maintained separately, with its own version and its own installer. The latter means that the usual way NetCDF-C and NetCDF-Fortran become installed, not only on Discoverer but anywhere an optimized HPC software collection is maintained, is based on two separate trees, one for each package. For instance, if we follow the default installation procedure adopted for NetCDF-C, NetCDF-Fortran, and NetCDF-C++, we get one installation tree per package:

.. code-block:: bash

   /opt/software/netcdf-c/4/4.9.2-gcc-openmpi
   /opt/software/netcdf-fortran/4/4.6.1-gcc-openmpi
   /opt/software/netcdf-cxx4/4/4.3.1-gcc-openmpi

But the ``configure`` script requires the NETCDF variable to be given a single path, like this:

.. code-block:: bash

   $ export NETCDF=/opt/software/netcdf

which means we cannot employ the multi-package installation of NetCDF for that purpose.

One natural workaround here is to install at least NetCDF-C and NetCDF-Fortran in the same destination folder, for instance:

.. code-block:: bash

   /opt/software/WRF-SFIRE/bundle/4/4.4-S0.1

This is a bundle that can be used to successfully compile and link programming code against that particular mix of different packages. WRF and WRF-SFIRE are examples of this type of code.

In fact, if one needs to compile WRF and WRF-SFIRE against NetCDF relying solely on the packages brought to the system by the Linux packaging system, it is usually sufficient to specify:

.. code-block:: bash

   $ export NETCDF=/usr

Sadly, the HPC software environment is a rather complex one, and it is rare to rely there on the NetCDF packages that come with the Linux distribution. It is not that those packages are wrong; they just might not be in the required versions, or may not be compiled and optimized to match certain requirements important for running applications productively on the HPC compute nodes. And here comes the HPC way of installing packages: they are compiled and optimized to match specific jobs, maintained as separate installations, and made accessible as such through environment modules, one installation tree per package. In other words, if someone needs access to a highly optimized and modern version of a certain library, they will find it as a separately maintained installation, not as a part of the collection in ``/usr``.

Here comes the role of the bundles of libraries. They provide an environment that mixes certain packages to match the requirements for external library support demanded by specific build or run-time environments. Applied to the compilation of WRF and WRF-SFIRE, that implies a bundle that contains both NetCDF-C and NetCDF-Fortran, for instance:

.. code-block:: bash

   $ export NETCDF=/opt/software/WRF-SFIRE/bundle/4/4.4-S0.1

Of course, one may patch the code of the ``configure`` and ``compile`` scripts and introduce there two separate environment variables for handling separate installations of NetCDF-C and NetCDF-Fortran. Nonetheless, that dual-variable approach is not included in the official distribution of the WRF and WRF-SFIRE code.

Getting help
------------

See :doc:`help`

.. _`WRF-SFIRE`: https://github.com/openwfm/WRF-SFIRE/
.. _`WRF-ARW modelling system`: https://www2.mmm.ucar.edu/wrf/users/