MPI
===

.. toctree::
   :maxdepth: 1
   :caption: Contents:

Employing the right MPI library for your project
--------------------------------------------------

If you are running a pre-compiled parallel binary code (compiled elsewhere and
brought to Discoverer HPC for execution afterwards) that is linked against a
certain MPI library, you need to know that library type (one of: Open MPI,
Intel MPI, MPICH, MVAPICH2), as well as the target CPU architecture and SIMD
instructions. Your MPI binary code can be successfully executed on the compute
nodes of Discoverer HPC only if it properly interacts with the underlying set
of system and application libraries, and matches the CPU architecture and
supported SIMD instructions (see :doc:`resource_overview`).

.. warning::

   Currently, we do not support MVAPICH2 on Discoverer, nor do we have plans
   to employ that MPI library in the near future.

If you think that the MPI libraries available in the software repository do
not match the requirements of your code, feel free to bring and install, under
your home folder, the "right" MPI library that is expected to work for you. In
that case, you and you alone are responsible for the support of that MPI
library, as well as for adapting your Slurm scripts to load it.

.. important::

   Only the collection of MPI libraries compiled and installed in the public
   software repository on Discoverer HPC is supported by the Discoverer HPC
   support team!

Intel MPI
---------

.. warning::

   Running Intel MPI on the AMD EPYC 7H12 CPUs can cause problems. Our
   recommendation is to avoid the use of Intel MPI on the compute nodes of
   Discoverer until we find a permanent solution to the reported problems.

Open MPI
--------

.. important::

   All `Open MPI`_ versions/builds available in the public repository support
   LustreFS. They also support the `Mellanox UCX Framework through MLX
   provider`_ and interact with Slurm.

To obtain the complete list of installed versions and builds of Open MPI,
execute the following command on the login node:

.. code:: bash

   module avail openmpi

The list will contain the names of the environment modules providing access to
the supported Open MPI versions and builds. Supporting such a variety of
builds is aimed at providing MPI compiler wrappers for all compiler sets in
use on Discoverer: :doc:`gcc`, :doc:`aocc`, :doc:`oneapi` (both LLVM and
classic compilers), :doc:`llvm`, and :doc:`nvidia_hpc_sdk`.

Note that to run an executable linked against a certain version of Open MPI,
there is no need to use the environment module that matches the compiler set
used to build that binary code. In most cases, it is enough to load the GCC
build of Open MPI to run your Open MPI-linked executable, regardless of the
Open MPI wrappers you employed during the compilation process.

To load the environment modules that correspond to the latest version of Open
MPI, follow the examples below:

Intel oneAPI (classic compilers)
..................................

.. code:: bash

   module load openmpi/4/intel/latest

NVIDIA HPC SDK (former PGI)
.............................

.. code:: bash

   module load openmpi/4/nvidia/latest

.. important::

   The NVIDIA HPC SDK installation comes with its own Open MPI distribution.
   To access that distribution you need to load the corresponding modules:

   .. code:: bash

      module load nvidia
      module load nvhpc-hpcx/latest

Our recommendation is to stick to that particular Open MPI distribution
(version 4.x) when employing the NVIDIA HPC SDK compilers. Also note that some
of the NVIDIA HPC SDK versions come with Open MPI 3.x.
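If you are unsure which Open MPI version the bundled distribution provides,
you can query the bundled ``mpirun`` after loading the modules (a minimal
sketch, assuming the modules shown above place ``mpirun`` on your ``PATH``):

.. code:: bash

   # Load the Open MPI distribution bundled with the NVIDIA HPC SDK
   module load nvidia
   module load nvhpc-hpcx/latest

   # Open MPI's mpirun reports its own version, e.g. "mpirun (Open MPI) 4.x.y"
   mpirun --version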
The Open MPI 3.x library and its compiler wrappers can be accessed by loading
the following modules:

.. code:: bash

   module load nvidia
   module load nvhpc-openmpi3

AMD AOCC
..........

.. code:: bash

   module load openmpi/4/aocc/latest

Vanilla LLVM compilers
........................

.. code:: bash

   module load openmpi/4/llvm/latest

GCC
.....

.. code:: bash

   module load openmpi/4/gcc/latest

For any further details about using the MPI library for compiling and running
parallel code, please refer to the `Open MPI Documentation`_.

If you would like to see the build recipe we developed for compiling the Open
MPI code, visit:

https://gitlab.discoverer.bg/vkolev/recipes/-/tree/main/openmpi/

MPICH
-----

.. important::

   All `MPICH`_ versions/builds available in the public repository are built
   using the ROMIO driver layer, which means they support LustreFS. They also
   support the `Mellanox UCX Framework through MLX provider`_ and interact
   with Slurm.

To obtain the complete list of installed versions and builds of MPICH, execute
the following command on the login node:

.. code:: bash

   module avail mpich

The list will contain the names of the environment modules providing access to
the different MPICH versions and builds. Supporting such a variety of builds
is aimed at providing MPI compiler wrappers for all compiler sets in use on
Discoverer: :doc:`gcc`, :doc:`aocc`, :doc:`oneapi` (both LLVM and classic
compilers), :doc:`llvm`, and :doc:`nvidia_hpc_sdk`.

Note that to run an executable linked against a certain version of MPICH, you
do not need to use the environment module that matches the compiler set used
to build the binary code. In most cases, it is enough to load the GCC build to
run your MPICH-linked executable, regardless of the MPICH wrappers you
employed to build it.

To load the environment modules that correspond to the latest version of
MPICH, follow the examples below:

Intel oneAPI (classic compilers)
..................................

.. code:: bash

   module load mpich/4/intel/latest-classic

Intel oneAPI (LLVM compilers)
...............................

.. code:: bash

   module load mpich/4/intel/latest-llvm

or

.. code:: bash

   module load mpich/4/intel/latest

NVIDIA HPC SDK (former PGI)
.............................

.. code:: bash

   module load mpich/4/nvidia/latest

AMD AOCC
..........

.. code:: bash

   module load mpich/4/aocc/latest

Vanilla LLVM compilers
........................

.. code:: bash

   module load mpich/4/llvm/latest

GCC
.....

.. code:: bash

   module load mpich/4/gcc/latest

For any further details about using the MPI library for compiling and running
parallel code, please refer to the `MPICH Documentation`_.

If you would like to see the build recipe we developed for compiling the MPICH
code, visit:

https://gitlab.discoverer.bg/vkolev/recipes/-/blob/main/mpich/

Getting help
------------

See :doc:`help`

.. _`Mellanox UCX Framework through MLX provider`: https://www.intel.com/content/www/us/en/developer/articles/technical/improve-performance-and-stability-with-intel-mpi-library-on-infiniband.html
.. _`Intel MPI Library Developer Guide for Linux OS`: https://www.intel.com/content/dam/develop/external/us/en/documents/mpi-devguide-linux.pdf
.. _`Open MPI`: https://www.open-mpi.org
.. _`Open MPI Documentation`: https://www.open-mpi.org/doc
.. _`MPICH`: https://www.mpich.org
.. _`MPICH Documentation`: https://www.mpich.org/documentation/guides/
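Example: loading an MPI module in a Slurm batch script
--------------------------------------------------------

As a general illustration of how the environment modules described above are
combined with Slurm, below is a minimal sketch of a batch script that loads
the GCC build of Open MPI and starts a pre-compiled executable. The job name,
partition, task counts, and the executable name ``my_mpi_app`` are
placeholders, not values taken from the Discoverer configuration; adjust them
to match your project and workload.

.. code:: bash

   #!/bin/bash
   #SBATCH --job-name=openmpi_test
   #SBATCH --partition=<partition_name>   # replace with a partition available to your project
   #SBATCH --nodes=2
   #SBATCH --ntasks-per-node=4            # illustrative value only
   #SBATCH --time=00:30:00

   module purge
   module load openmpi/4/gcc/latest   # the GCC build is usually sufficient for running

   # The Open MPI builds in the repository interact with Slurm, so mpirun
   # picks up the allocated nodes and task counts from the job environment.
   mpirun ./my_mpi_app                # my_mpi_app stands for your MPI executable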