.. index:: pair: Tutorial; Index

.. _MPI_jobs:

================
MPI Jobs
================

For more information about which MPI libraries are available on Midway1 and
Midway2, see :ref:`MPI_libraries`.

Let's look at an example MPI hello world program and explain the steps needed
to compile and submit it to the queue.

Here is the example MPI hello world program: :download:`hello-mpi.c`

.. literalinclude:: hello-mpi.c
   :language: c

Place ``hello-mpi.c`` in your home directory and compile this program
interactively by entering the following commands into a terminal on either a
Midway1 or Midway2 login node::

   module load openmpi
   mpicc hello-mpi.c -o hello-mpi

If you choose :download:`hello-mpi_midway1.sbatch` to submit your job to
Midway1, run the above commands on one of the Midway1 login nodes.
Alternatively, if you choose :download:`hello-mpi_midway2.sbatch` to submit
your job to Midway2, run the above commands on one of the Midway2 login nodes.
The same commands work on both Midway1 and Midway2 login nodes because we are
using the default version of the OpenMPI module, which defaults to the system
GCC compiler. **Please note** that the default version of a module on Midway1
and Midway2 can differ. For example, the default version of the OpenMPI module
on Midway1 is 1.6, whereas the default version on Midway2 is 2.0.1. It should
be possible to use any available MPI compiler to compile and run this example.

:download:`hello-mpi_midway1.sbatch` is a submission script that can be used
to submit a job to Midway1 to run the ``hello-mpi`` program:

.. literalinclude:: hello-mpi_midway1.sbatch
   :language: bash

:download:`hello-mpi_midway2.sbatch` is a submission script that can be used
to submit a job to Midway2 to run the ``hello-mpi`` program:

.. literalinclude:: hello-mpi_midway2.sbatch
   :language: bash

**Note:** Midway1 and Midway2 have different sets of modules. Please make sure
you use the correct module name and version when submitting your job to each
cluster.

The inline comments describe what each line does, but it is important to
emphasize the following points for MPI jobs (a sketch illustrating these
options follows the list):

* The :option:`--constraint=fdr` and :option:`--constraint=edr` options are
  only available on Midway2, and using them on Midway1 will result in a job
  submission error. MPI jobs submitted without a :option:`--constraint`
  option on Midway2 could run on nodes with FDR, EDR, or a combination of
  both.
* The :option:`--partition` option determines whether your job runs on
  Midway1 or Midway2. If this option is not in your job submission script,
  your job will be submitted to the ``sandyb`` partition on Midway1.
* :command:`mpirun` does not need to be given :option:`-n`. All supported MPI
  environments automatically determine the proper layout based on the Slurm
  options.
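The downloadable scripts above are the authoritative versions; for orientation
only, a minimal Midway2 submission script exercising the options discussed in
the list might look like the following sketch (the job name, output file, task
count, wall time, and the choice of ``--constraint=edr`` are illustrative
assumptions to adapt)::

   #!/bin/bash
   #SBATCH --job-name=hello-mpi       # illustrative job name
   #SBATCH --output=hello-mpi.out     # illustrative file to hold the job output
   #SBATCH --partition=broadwl        # selects Midway2 (see --partition above)
   #SBATCH --constraint=edr           # request EDR nodes; Midway2 only
   #SBATCH --ntasks=16                # total number of MPI processes (illustrative)
   #SBATCH --time=00:05:00            # illustrative wall-time limit

   # Load the same MPI module that was used to compile the program.
   module load openmpi

   # mpirun determines the task count and layout from Slurm; no -n is needed.
   mpirun ./hello-mpi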
You can submit ``hello-mpi_midway1.sbatch`` to Midway1 using the following
command from one of the Midway1 login nodes::

   sbatch hello-mpi_midway1.sbatch

Alternatively, you can submit ``hello-mpi_midway2.sbatch`` to Midway2 using
the following command from one of the Midway2 login nodes::

   sbatch hello-mpi_midway2.sbatch

Here is an example output of this program submitted to the ``sandyb``
partition on Midway1 (please note that MPI processes can run on any node that
has available cores and memory):

.. literalinclude:: hello-mpi.out

It is possible to control the number of tasks run per node with the
:option:`--ntasks-per-node` option. For example, submit the job to Midway1
like this::

   sbatch --ntasks-per-node=1 hello-mpi_midway1.sbatch

The resulting output looks like this (each MPI process from your job will run
on a different node):

.. literalinclude:: hello-mpi1task.out

Advanced Usage
--------------

Both OpenMPI and IntelMPI have the ability to launch MPI programs directly
with the Slurm command :command:`srun`. It is not necessary to use this mode
for most jobs, but it may allow job launch options that would not otherwise be
possible. For example, from a Midway1 login node it is possible to launch the
above ``hello-mpi`` program with OpenMPI, using 16 MPI processes in the
``sandyb`` partition, with this command::

   srun -n16 hello-mpi

For IntelMPI, it is necessary to set an environment variable for this to
work::

   export I_MPI_PMI_LIBRARY=/software/slurm-current-$DISTARCH/lib/libpmi.so
   srun -n16 hello-mpi

If you want to submit an MPI job to Midway2 using OpenMPI with
:command:`srun`, use the following command::

   srun -n16 --partition=broadwl hello-mpi

For IntelMPI on Midway2, you need to set the ``I_MPI_PMI_LIBRARY`` variable
and then run :command:`srun`::

   export I_MPI_PMI_LIBRARY=/software/slurm-current-$DISTARCH/lib/libpmi.so
   srun -n16 --partition=broadwl hello-mpi
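The :command:`srun` launcher can also be used inside a submission script
rather than interactively. A minimal sketch for OpenMPI on Midway2 follows
(the job name, output file, and task count are illustrative assumptions to
adapt)::

   #!/bin/bash
   #SBATCH --job-name=hello-mpi-srun    # illustrative job name
   #SBATCH --output=hello-mpi-srun.out  # illustrative file to hold the job output
   #SBATCH --partition=broadwl          # selects Midway2
   #SBATCH --ntasks=16                  # total number of MPI processes (illustrative)

   module load openmpi

   # srun launches one MPI process per Slurm task; no -n is needed here
   # because the task count comes from the --ntasks option above.
   # For IntelMPI, export I_MPI_PMI_LIBRARY as shown earlier before calling srun.
   srun ./hello-mpi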