Hybrid MPI/OpenMP Jobs
MPI and OpenMP can be used at the same time to create a hybrid MPI/OpenMP program. Let's look at an example hybrid MPI/OpenMP hello world program, hello-hybrid.c, and walk through the steps needed to compile it and submit it to the queue:
#include <stdio.h>
#include "mpi.h"
#include <omp.h>

int main(int argc, char *argv[]) {
    int numprocs, rank, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int iam = 0, np = 1;

    /* initialize MPI and query the process count, rank, and host name */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(processor_name, &namelen);

    /* each MPI process spawns a team of OpenMP threads */
    #pragma omp parallel default(shared) private(iam, np)
    {
        np = omp_get_num_threads();
        iam = omp_get_thread_num();
        printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
               iam, np, rank, numprocs, processor_name);
    }

    MPI_Finalize();
    return 0;
}
Place hello-hybrid.c in your home directory and compile it interactively by entering the following commands into a terminal on either a Midway1 or a Midway2 login node:
module load openmpi
mpicc -fopenmp hello-hybrid.c -o hello-hybrid
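To verify that the resulting binary is linked against both an MPI library and GCC's OpenMP runtime, you can optionally inspect its shared-library dependencies. A minimal sketch, assuming the GCC/OpenMPI toolchain loaded above (library names may differ with other toolchains):
# the hybrid binary should depend on an MPI library (libmpi) and on
# GCC's OpenMP runtime (libgomp) when built with gcc and OpenMPI
ldd ./hello-hybrid | grep -E 'libmpi|libgomp'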
If you choose hello-hybrid_midway1.sbatch to submit your job to Midway1, run the above commands on one of the Midway1 login nodes. Likewise, if you choose hello-hybrid_midway2.sbatch to submit your job to Midway2, run the above commands on one of the Midway2 login nodes.
We can run the same commands on both Midway1 and Midway2 login nodes because we are using the default version of the OpenMPI module, which defaults to the system GCC compiler. Please note that the default version of a module may differ between Midway1 and Midway2: for example, the default version of the OpenMPI module on Midway1 is 1.6, whereas on Midway2 it is 2.0.1. It should be possible to compile and run this example with any available MPI compiler.
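To check which OpenMPI version is the default on the login node you are using, you can query the module system before and after loading it. A minimal sketch (the exact listing depends on the cluster):
# list the available OpenMPI builds; the default version is usually
# marked as such in the listing
module avail openmpi
# load the default version and confirm what was actually loaded
module load openmpi
module list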
An additional option, -fopenmp, must be given to compile a program with OpenMP directives (-openmp for the Intel compiler and -mp for the PGI compiler).
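For reference, the corresponding compile lines for the Intel and PGI toolchains might look like the sketch below. The module names are assumptions, so check module avail for the actual names on Midway; mpiicc is Intel MPI's C compiler wrapper.
# Intel toolchain (module name is an assumption)
module load intelmpi
mpiicc -openmp hello-hybrid.c -o hello-hybrid

# PGI toolchain (module name is an assumption; here mpicc wraps the PGI compiler)
module load openmpi/pgi
mpicc -mp hello-hybrid.c -o hello-hybrid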
hello-hybrid_midway1.sbatch
is a submission script that can be used to submit a job to Midway1 to run the hello-hybrid
program.
#!/bin/bash
# a sample job submission script to submit a hybrid MPI/OpenMP job to the sandyb
# partition on Midway1. Please change the --partition option if you want to use
# another partition on Midway1
# set the job name to hello-hybrid
#SBATCH --job-name=hello-hybrid
# send output to hello-hybrid.out
#SBATCH --output=hello-hybrid.out
# this job requests 4 MPI processes
#SBATCH --ntasks=4
# and requests 8 cpus per task for OpenMP threads
#SBATCH --cpus-per-task=8
# this job will run in the sandyb partition on Midway1
#SBATCH --partition=sandyb
# load the openmpi default module
module load openmpi
# set OMP_NUM_THREADS to the number of --cpus-per-task we asked for
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Run the process with mpirun. Notice -n is not required. mpirun will
# automatically figure out how many processes to run from the slurm options
mpirun ./hello-hybrid
hello-hybrid_midway2.sbatch
is a submission script that can be used to submit a job to Midway2 to run the hello-hybrid
program.
#!/bin/bash
# a sample job submission script to submit a hybrid MPI/OpenMP job to Midway2
# set the job name to hello-hybrid
#SBATCH --job-name=hello-hybrid
# send output to hello-hybrid.out
#SBATCH --output=hello-hybrid.out
# this job requests 4 MPI processes
#SBATCH --ntasks=4
# and requests 8 cpus per task for OpenMP threads. On Midway2, you can ask
# for up to 28 cpus per task.
#SBATCH --cpus-per-task=8
# this job will run on Midway2
#SBATCH --partition=broadwl
# this job will run on nodes connected with the EDR interconnect. Comment out or
# delete the following line if the type of interconnect is not important to you
#SBATCH --constraint=edr
# load the openmpi default module
module load openmpi
# set OMP_NUM_THREADS to the number of --cpus-per-task we asked for
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Run the process with mpirun. Notice -n is not required. mpirun will
# automatically figure out how many processes to run from the slurm options
mpirun ./hello-hybrid
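Depending on the OpenMPI version, you may also want to pin each MPI process and its OpenMP threads to a fixed set of cores so the threads do not migrate. A minimal sketch using OpenMPI's binding options (available in OpenMPI 1.8 and later); this would replace the plain mpirun line above:
# give each MPI process $SLURM_CPUS_PER_TASK cores and bind it to them,
# so its OpenMP threads stay on those cores
mpirun --map-by socket:PE=$SLURM_CPUS_PER_TASK --bind-to core ./hello-hybrid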
Note: Midway1 and Midway2 have different sets of modules. Please make sure you use the correct module name and version when submitting your job to each cluster.
The options are similar to those for an MPI job, but with notable additions:
- --ntasks=4 specifies the number of MPI processes.
- --cpus-per-task=8 allocates 8 cpus for each task, one per OpenMP thread. This number cannot be greater than the number of cores on a single node.
- export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK sets the number of OpenMP threads to the number of cores requested with --cpus-per-task (see the snippet after this list).
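To make the relationship between these options concrete, here is a minimal sketch of how the requested geometry could be inspected from inside a job script; SLURM exports it through standard environment variables:
# SLURM exposes the requested layout as environment variables inside the job
echo "MPI processes:    $SLURM_NTASKS"           # 4
echo "threads per task: $SLURM_CPUS_PER_TASK"    # 8
echo "total cores:      $(( SLURM_NTASKS * SLURM_CPUS_PER_TASK ))"   # 32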
You can submit hello-hybrid_midway1.sbatch to Midway1 using the following command from one of the Midway1 login nodes:
sbatch hello-hybrid_midway1.sbatch
Alternatively, you can submit hello-hybrid_midway2.sbatch to Midway2 using the following command from one of the Midway2 login nodes:
sbatch hello-hybrid_midway2.sbatch
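After submitting, you can check the job's status in the queue and, once it finishes, read the output file named by the --output option. A minimal sketch:
# show your jobs in the queue
squeue -u $USER
# once the job completes, the program's output is in hello-hybrid.out
cat hello-hybrid.out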
Here is example output from this program when submitted to the broadwl partition on Midway2:
Hello from thread 0 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 2 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 3 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 6 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 7 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 4 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 5 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 1 out of 8 from process 1 out of 4 on midway2-0087.rcc.local
Hello from thread 0 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 6 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 1 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 4 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 7 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 5 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 3 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 2 out of 8 from process 3 out of 4 on midway2-0088.rcc.local
Hello from thread 0 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 2 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 5 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 1 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 4 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 6 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 3 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 7 out of 8 from process 0 out of 4 on midway2-0087.rcc.local
Hello from thread 0 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 6 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 1 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 5 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 3 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 2 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 4 out of 8 from process 2 out of 4 on midway2-0087.rcc.local
Hello from thread 7 out of 8 from process 2 out of 4 on midway2-0087.rcc.local