News and Blog

EPSRC CDT in Next Generation Computational Modelling

Lightweight Verification For Computational Science Models

Testing of code can take lot of effort.

Testing of code can take lot of effort.

In late November 2016, the NGCM was pleased to host Dr Dominic Orchard from the University of Kent, to present his views on the scientific software verification.

As in any other activity involving human factors, also in programming we are likely to make mistakes. “The average bug-rate in industry software is 15-50 errors per 1000 lines”, Dr Orchard states, and adds a horror story about the unsuccessful Mars Climate Orbiter spacecraft, which ended up disintegrating (burning in flames and falling to the Mars surface). A mismatch in the physical units used in calculations in two pieces of its navigation software resulted in over 300 million dollars worth of losses, and the work of many researchers came to nothing. Similar kinds of bugs, with larger or smaller impact on our budget, reputation or health, force the important question: “How can we ensure that our code is correct?”

The answer to this question is not trivial, and Dr Orchard helps us understand it by classifying the following two approaches to software testing.

  • The first approach used by software developers we can call simply as Unit or Integration Testing, where the code is tested on a set of artificial problems with known solutions, asserting the correctness of the outputs.
  • The second approach is the so called Formal Verification, where a set of specific rules is defined, and the conformity of the code to this so called specifications is checked by the Formal Verification tool, in a similar way as the conformity of our emails to the grammar rules is checked by the spellchecker. A typical example of such Formal Verification is specifying the data types expected on the inputs to our functions, raising an error if the received data types don’t conform.

The main issue with both of these approaches is that, as indicated on the title image, it often takes much more time to construct a proper set of tests and specifications than the time we need to write the code itself. This is what makes researchers and programmers avoid this tedious but important task, and even Dr Orchard is ashamed to admit: “I don’t always do it too, it’s just too much effort.”

One way which can help us to deal with this inevitable task of asserting the code correctness, and to improve our chance to write a reliable code, is the so called Lightweight Verification, which is the particular interest of Dr Orchard. The idea behind the Lightweight Verification is to write up the specifications only for one “very specific” aspect that is critical to the calculations we perform in our code, and is known to cause lot of bugs in scientific software.

Dr Orchard with a team of researchers from the Cambridge University are developing a Lightweight Verification tool for the Fortran programming language called CamFort, freely available on GitHub. In the current version, CamFort supports Units-of-measure typing and Stencil specification as the specifications it can check the conformity to. To demonstrate a stencil, where “it is too easy to mess up an index”, Dr Orchard provides the following code snippet for computing fluid flow simulation.

Stencil of the discretized Navier-Stokes equations.

Stencil of the discretized Navier-Stokes equations.

He suggests that using the CamFort stencil specification, we can easily set up the bounds for maximum and minimum indexes in our stencil, and also specify rules for illegal repetition of index calls in the stencil. He also notes that in the future releases of CamFort, a possibility to represent the stencil graphically might be of a great benefit for the users.

The idea behind the Unit-of-measure typing comes from the dimensional analysis known from physics and engineering. If together with definition of variables in our code we define their physical units, CamFort is able to verify whether our calculations “make sense”, and that we are not just adding 'apples and oranges'.

Both of these Lightweight specifications take relatively little extra effort to write up, and CamFort performs the 'spellchecking' throughout the whole code automatically for you. They seem to have a great potential to enjoy popularity among a wide range of scientists writing code to perform calculations in their research, as they sparked a considerable interest among the NGCM students attending the seminar.

In the conclusion, Dr Orchard expresses a wish for the software verification tools like CamFort to “bridge the chasm” between the computer scientist focusing only on the code development and the natural scientists using these codes, by applying the physical aspects of the problem into the computational software, resulting in more correct and reliable research.

For the future development of CamFort, Dr Orchard and his team plan to extend its functionality to other types of specifications, which will prove to have a significant impact on the bug rate in software. To help him in search for these, or to bring any new ideas to the project, he invited the students to contact him, offering also the possibility of arranging a summer project for the more interested ones.