About this Tutorial

This tutorial shows users how to compile and run a parallel Message Passing Interface (MPI) job on the grid. If you are new to the gLite middleware, please see the other tutorials on this Wiki: RmkiGrid

MPI and its implementations

The Message Passing Interface (MPI) is commonly used to handle the communications between tasks in parallel applications. There are two versions of MPI, MPI-1 and MPI-2. Two implementations of MPI-1 (LAM and MPICH) and two implementations of MPI-2 (OPENMPI and MPICH2) are supported. The RMKIGrid site currently supports OPENMPI, version 1.4.0. The development environment for this version is installed on the User Interface (UI) machines.

In the past, running MPI applications on the EGEE infrastructure required significant hand-tuning for each site. This was needed to compensate for differences between sites, such as whether a shared file system was available and where the default scratch space was located. The current configuration makes jobs more portable and gives the user more flexibility.

The increased portability and flexibility are achieved by working around hard-coded constraints from the RB and by off-loading much of the initialisation work to the mpi-start scripts. The mpi-start scripts are developed by the int.eu.grid project and are based on the work of the MPI working group, which contains members from both int.eu.grid and EGEE.

Using the mpi-start system requires the user to define a wrapper script and a set of hooks. The mpi-start system then handles most of the low-level details of running the MPI job on a particular site.

Wrapper script for mpi-start

Users typically use a script that sets up paths and other internal settings to initiate the mpi-start processing. The following script (named "mpi-test.sh") is generic and should not need to have significant modifications made to it.


#!/bin/bash

# Pull in the arguments.
MY_EXECUTABLE=`pwd`/$1
MPI_FLAVOR=$2

# Convert flavor to lowercase for passing to mpi-start.
MPI_FLAVOR_LOWER=`echo $MPI_FLAVOR | tr '[:upper:]' '[:lower:]'`

# Pull out the correct paths for the requested flavor.
eval MPI_PATH=`printenv MPI_${MPI_FLAVOR}_PATH`

# Ensure the prefix is correctly set.  Don't rely on the defaults.
eval I2G_${MPI_FLAVOR}_PREFIX=$MPI_PATH
export I2G_${MPI_FLAVOR}_PREFIX

# Touch the executable.  It must exist for the shared file system check.
# If it does not, then mpi-start may try to distribute the executable
# when it shouldn't.
touch $MY_EXECUTABLE

# Setup for mpi-start.
export I2G_MPI_APPLICATION=$MY_EXECUTABLE
export I2G_MPI_APPLICATION_ARGS=
export I2G_MPI_TYPE=$MPI_FLAVOR_LOWER
export I2G_MPI_PRE_RUN_HOOK=mpi-hooks.sh
export I2G_MPI_POST_RUN_HOOK=mpi-hooks.sh

# If these are set then you will get more debugging information.
#export I2G_MPI_START_VERBOSE=1
#export I2G_MPI_START_DEBUG=1

# Invoke mpi-start.
$I2G_MPI_START
The script first sets up the environment for the chosen flavor of MPI using environment variables supplied by the system administrator. It then defines the executable, arguments, MPI flavor, and location of the hook scripts for mpi-start. The user may optionally ask for more logging information with the verbose and debug environment variables. Lastly, the wrapper invokes mpi-start itself.

Hooks for mpi-start

The user may write a script that is called before and after the MPI executable is run. The pre-hook can be used, for example, to compile the executable itself or to download data. The post-hook can be used to analyze the results or to save them on the grid. Please note that since the UI machines provide the same environment as the Worker Nodes (SLC5), you can compile your job on the UI machine with mpicc and submit the binary to the grid; it should work without problems.
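As a sketch of the compile-on-the-UI approach (using the file names from this tutorial's example):

    # On the UI machine: compile the example program with the MPI wrapper compiler.
    mpicc -o hello hello.c

    # Then list the binary instead of the source in the JDL, e.g.
    #   InputSandbox = {"mpi-test.sh","mpi-hooks.sh","hello"};
    # and drop the compile step from pre_run_hook in mpi-hooks.sh.
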

The following example (named "mpi-hooks.sh") compiles the executable before running it; the post-hook only writes a message to the standard output. A real-world job would likely save the results of the job somewhere on the grid for user retrieval.


#!/bin/sh

# This function will be called before the MPI executable is started.
# You can, for example, compile the executable itself.
pre_run_hook () {

  # Compile the program.
  echo "Compiling ${I2G_MPI_APPLICATION}"

  # Actually compile the program.
  cmd="mpicc ${MPI_MPICC_OPTS} -o ${I2G_MPI_APPLICATION} ${I2G_MPI_APPLICATION}.c"
  echo $cmd
  $cmd
  if [ ! $? -eq 0 ]; then
    echo "Error compiling program.  Exiting..."
    exit 1
  fi

  # Everything's OK.
  echo "Successfully compiled ${I2G_MPI_APPLICATION}"

  return 0
}

# This function will be called after the MPI executable has finished.
# A typical use case is to upload the results to a storage element.
post_run_hook () {

  echo "Executing post hook."
  echo "Finished the post hook."

  return 0
}
The pre- and post-hooks may be defined in separate files, but the functions must be named exactly "pre_run_hook" and "post_run_hook".

Defining the job and the executable

Running the MPI job itself is not significantly different from running a standard grid job. The user must define a JDL file describing the requirements for the job. An example is:

# mpi-test.jdl
JobType        = "Normal";
NodeNumber     = 16;
Executable     = "mpi-test.sh";
Arguments      = "hello OPENMPI";
StdOutput      = "mpi-test.out";
StdError       = "mpi-test.err";
InputSandbox   = {"mpi-test.sh","mpi-hooks.sh","hello.c"};
OutputSandbox  = {"mpi-test.err","mpi-test.out"};
Requirements   =
  Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment)
  && Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment)
  && RegExp("kfki.hu", other.GlueCEUniqueID);

The JobType must be "Normal" and the attribute NodeNumber must be defined (16 in this example). Despite its name, this attribute defines the number of CPUs (slots) required by the job; it is not possible to request more complicated topologies based on nodes and CPUs. Please note that the maximum requestable NodeNumber in the RMKIGrid is 64. If you need more than 64 slots for a short period of time, please contact gridadm@rmki.kfki.hu by e-mail.

This example uses the OpenMPI implementation of the MPI-2 standard. This is the only MPI implementation which is supported on the RMKIGrid site. The JobType attribute must be "Normal" in all cases; it selects for an MPI job in general and not the specific implementation.

All of the files referenced by the above JDL file have now been shown except for the actual MPI program, a simple "Hello World" example written in C. The code is:

/*  hello.c
 *  Simple "Hello World" program in MPI.
 */
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[]) {

  int numprocs;  /* Number of processors */
  int procnum;   /* Processor number */

  /* Initialize MPI */
  MPI_Init(&argc, &argv);

  /* Find this processor number */
  MPI_Comm_rank(MPI_COMM_WORLD, &procnum);

  /* Find the number of processors */
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

  printf("Hello world! from processor %d out of %d\n", procnum, numprocs);

  /* Shut down MPI */
  MPI_Finalize();

  return 0;
}
Running the MPI job

Running the MPI job is no different from any other grid job. Use the commands glite-wms-job-submit, glite-wms-job-status, and glite-wms-job-output to submit the job, check its status, and retrieve its output.
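A typical session might look like the following sketch (the job identifier file name is illustrative; -a delegates the proxy automatically, and -o/-i write and read the job identifier):

    # Submit the job; the returned job identifier is stored in jobid.txt.
    glite-wms-job-submit -a -o jobid.txt mpi-test.jdl

    # Check the status of the job until it reaches "Done".
    glite-wms-job-status -i jobid.txt

    # Retrieve the output sandbox (mpi-test.out and mpi-test.err).
    glite-wms-job-output -i jobid.txt
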

If the job ran correctly, then the standard output should contain something like the following:

UID     =  hungrid013
HOST    =  grid58
DATE    =  Thu May 13 20:03:18 CEST 2010
VERSION =  0.0.65
mpi-start [INFO   ]: search for scheduler
mpi-start [INFO   ]: activate support for sge
mpi-start [INFO   ]: activate support for openmpi
mpi-start [INFO   ]: call backend MPI implementation
mpi-start [INFO   ]: start program with mpirun
-<START PRE-RUN HOOK>---------------------------------------------------
Compiling /home/hungrid013/globus-tmp.grid58.21047.0/https_3a_2f_2fgrid150.kfki.hu_3a9000_2ftZM_5fSpr9G4oF1xp83sF2ZQ/hello
mpicc -o /home/hungrid013/globus-tmp.grid58.21047.0/https_3a_2f_2fgrid150.kfki.hu_3a9000_2ftZM_5fSpr9G4oF1xp83sF2ZQ/hello /home/hungrid013/globus-tmp.grid58.21047.0/https_3a_2f_2fgrid150.kfki.hu_3a9000_2ftZM_5fSpr9G4oF1xp83sF2ZQ/hello.c
Successfully compiled /home/hungrid013/globus-tmp.grid58.21047.0/https_3a_2f_2fgrid150.kfki.hu_3a9000_2ftZM_5fSpr9G4oF1xp83sF2ZQ/hello
-<STOP PRE-RUN HOOK>----------------------------------------------------
Hello world! from processor 3 out of 16
Hello world! from processor 9 out of 16
Hello world! from processor 7 out of 16
Hello world! from processor 15 out of 16
Hello world! from processor 11 out of 16
Hello world! from processor 5 out of 16
Hello world! from processor 14 out of 16
Hello world! from processor 6 out of 16
Hello world! from processor 12 out of 16
Hello world! from processor 13 out of 16
Hello world! from processor 8 out of 16
Hello world! from processor 4 out of 16
Hello world! from processor 10 out of 16
Hello world! from processor 1 out of 16
Hello world! from processor 0 out of 16
Hello world! from processor 2 out of 16
-<START POST-RUN HOOK>---------------------------------------------------
Executing post hook.
Finished the post hook.
-<STOP POST-RUN HOOK>----------------------------------------------------

-- BenceSomhegyi - 2010-05-14
