
EPSRC CDT in Next Generation Computational Modelling

Message-Passing with MPI

The Archer Supercomputer managed by the EPCC at the University of Edinburgh

Dr David Henty, a member of the Edinburgh Parallel Computing Centre (EPCC) at the University of Edinburgh and programme director of the MSc in High Performance Computing and the MSc in High Performance Computing with Data Science, ran a three-day training course at the University of Southampton on writing parallel programs with the Message Passing Interface (MPI). The course had 31 participants, who came from the Universities of Oxford, Belfast, York and many more, as well as from the UK Defence Academy. The material covered during the course is available here. The course covered how to use MPI library calls to create a parallel program and how to analyse the performance of that program. It was of particular interest to anyone wanting to use the power of HPC systems such as Archer, the UK's national supercomputer, which participants used in the practical exercises, or the University of Southampton's own supercomputer, Iridis.

The Message Passing Interface (MPI) is a portable message-passing system for writing programs that run on multiple processors in parallel, and it is commonly used on High Performance Computing (HPC) systems. It provides a number of standardised library routines that can be used to write message-passing programs in C, C++ and Fortran. In the message-passing model, the tasks are separate processes that communicate and synchronise by explicitly sending each other messages. All of these parallel operations are performed via calls to a message-passing interface that is entirely responsible for interfacing with the physical communication network linking the processors together.

Some C code using MPI

The idea behind MPI is to have several processes running independently of one another. Each runs a copy of the same program and uses its unique process ID to decide which part of the work to do. The processes communicate via message-passing in order to share information. To give a basic idea of how you would use message-passing parallel programming to speed up a computation, I'll discuss the simple example of computing the sum of a very large array. You could do this in serial on one processor, or you could use 4 processors by splitting the array into quarters and giving a quarter to each processor. With MPI, you could read in the array on one processor, split it into quarters and send 3 of the quarters to the other processors in messages. Each processor then sums its own quarter and sends its partial sum back to a single processor, which adds the partial sums together and outputs the total.

An interesting thing to consider is whether doing this would speed up your computation, and by how much. There is some overhead in initialising MPI and in passing messages between processors: each message takes some amount of time to send, and this time depends on the specific system you are using. If your array had 4 elements, the overhead of sending a single number to each process and then collecting the results would mean that the computation actually takes quite a bit longer. If your array has 1 billion elements, the time associated with sending the messages is likely to be insignificant.
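One way to make this trade-off concrete is a rough cost model. The symbols here are assumptions introduced for illustration, not measured quantities: $t_c$ is the time to add one element, $T_0$ is the one-off cost of starting MPI, and $T_{\mathrm{comm}}(n,p)$ is the total time spent sending and receiving messages for an array of $n$ elements on $p$ processors.

$$
T_1(n) = n\,t_c, \qquad
T_p(n) \approx T_0 + \frac{n}{p}\,t_c + T_{\mathrm{comm}}(n,p), \qquad
S_p(n) = \frac{T_1(n)}{T_p(n)}
$$

Under this model the parallel version is faster ($S_p > 1$) only when the overhead is smaller than the computation time it saves:

$$
T_0 + T_{\mathrm{comm}}(n,p) < n\,t_c\left(1 - \frac{1}{p}\right)
$$

For $n = 4$ the right-hand side is only a few additions' worth of time, so the overhead dominates. As $n$ grows the right-hand side grows with it, and whether the overhead stays small in comparison depends on how $T_{\mathrm{comm}}$ behaves on the system in question.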

The interesting point lies between these 2 extremes: for an array with several thousand elements, will the overhead of initialising MPI and sending the messages outweigh the speed-up from parallelising the computation? This is not trivial to answer. It depends on how long the computation takes on each processor and how long the processors take to communicate. Both of these depend on the system being used and can vary significantly between systems. For this reason it is important to test the scalability and speed-up of your code on the system of interest, to find the point where the speed-up you gain from adding another processor becomes negligible.

If you are interested in attending this course or similar ones, the EPCC runs it multiple times a year. You can click here to browse upcoming courses provided by the EPCC team that manages Archer.