The art of software recycling
Posted on 13 March 2013
By Vanesa Magar, Fellows Alumni and Lecturer in Coastal Engineering, University of Plymouth.
It is not uncommon for a researcher to need to modify a piece of software. This can be fairly straightforward, especially when the problem that needs to be solved is similar to the problem that the software was designed for. Often, however, the two problems differ substantially, and a lot of work (mostly additions to the code) is needed to modify the software. In this post, I'm going to discuss two examples of code modifications that have recently been performed in my group, and the lessons I learned during this work.
Say, for example, that you have a multigrid code for brine transport problems (VLUGR3), and you want to adapt it for nutrient uptake by swimming microorganisms (such as in Magar and Pedley, Journal of Fluid Mechanics, 2005). How easy is it to do this?
First, note that both problems are about transport, so in each case you are solving the time-dependent convection-diffusion equation, and that the boundary conditions are very similar. This means that the main changes are to substitute the appropriate velocity field and to change the coordinate system in which the equations are expressed. Next, check the platform, language and compiler with which the original code was developed. If you are familiar with all three, and can work with all three (because you have set up your computer for the task, you are proficient in that programming language, and you have the right compiler), then you are very likely to be able to implement your modifications with relative ease.
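As a rough illustration of why this first case is comparatively easy, here is a minimal Python sketch of a one-dimensional convection-diffusion solver in which the velocity field is passed in as a function. It is not VLUGR3 code and does not reflect how VLUGR3 is organised; the names `step`, `uniform_flow` and `decaying_flow`, the periodic grid and the first-order upwind scheme are all assumptions chosen to keep the example short.

```python
import math
from typing import Callable, List


def step(c: List[float], velocity: Callable[[float], float],
         diffusivity: float, dx: float, dt: float) -> List[float]:
    """One explicit time step of dc/dt + u dc/dx = D d2c/dx2 on a periodic grid."""
    n = len(c)
    new = [0.0] * n
    for i in range(n):
        u = velocity(i * dx)
        left, right = c[i - 1], c[(i + 1) % n]  # periodic neighbours
        # First-order upwind difference for the convective term.
        if u >= 0.0:
            convection = u * (c[i] - left) / dx
        else:
            convection = u * (right - c[i]) / dx
        diffusion = diffusivity * (right - 2.0 * c[i] + left) / dx ** 2
        new[i] = c[i] + dt * (diffusion - convection)
    return new


def uniform_flow(x: float) -> float:
    """Hypothetical velocity field for the original problem."""
    return 1.0


def decaying_flow(x: float) -> float:
    """Hypothetical velocity field for the new problem."""
    return math.exp(-x)


# The solver is untouched; only the velocity field changes.
c = [0.0] * 50
c[10] = 1.0
for _ in range(100):
    c = step(c, uniform_flow, diffusivity=0.01, dx=0.1, dt=0.01)
```

Swapping `uniform_flow` for `decaying_flow` changes the physics without editing `step`. A real code will rarely be this cleanly separated, which is why it pays to find out where the velocity field and the coordinate system enter before changing anything.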
Consider now a different problem. Suppose you have a code that has been developed to analyse sand dune erosion due to storms (XBeach), but you want to use it for gravel beaches and you want to add the effect of tides. This is a lot more complex.
Does gravel respond to storms in the same way as sand? The answer is no, so the physics of XBeach will be wrong unless a number of additional coastal processes are considered and the relevant terms added to the equations. How do you add the impact of tides? In principle, this translates into a periodically varying water level, instead of a fixed water level, and this new feature needs to be implemented in the code. If you add more complex physics, the run time of the code will increase unless the original code and the modifications are carefully optimised. As in the previous example, considerations about platform, programming language and compiler have to be taken into account. And last, but not least, if the code is significantly modified, a number of new test cases will probably have to be run to validate the model and to assess the accuracy and the performance of the new code.
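To make the tidal part concrete, the following Python sketch contrasts a fixed water level with a periodically varying one. It is only an illustration under simple assumptions: the tide is represented by a single sinusoid with the period of the principal lunar semidiurnal (M2) constituent, and the function names and parameters are hypothetical, not part of XBeach.

```python
import math

# Period of the M2 tidal constituent, roughly 12.42 hours, in seconds.
M2_PERIOD_S = 12.42 * 3600.0


def fixed_level(t: float, mean_level: float = 0.0) -> float:
    """Water level that does not change with time t (seconds)."""
    return mean_level


def tidal_level(t: float, mean_level: float = 0.0, amplitude: float = 1.0,
                period: float = M2_PERIOD_S, phase: float = 0.0) -> float:
    """Water level (metres) oscillating sinusoidally about mean_level."""
    return mean_level + amplitude * math.sin(2.0 * math.pi * t / period + phase)


def reduces_to_fixed_level(times, mean_level: float = 0.0) -> bool:
    """Check that a zero-amplitude tide reproduces the fixed water level."""
    return all(
        abs(tidal_level(t, mean_level, amplitude=0.0) - fixed_level(t, mean_level)) < 1e-12
        for t in times
    )


hourly = [h * 3600.0 for h in range(25)]
levels = [tidal_level(t, mean_level=0.5, amplitude=2.0) for t in hourly]
```

The `reduces_to_fixed_level` check hints at one useful kind of validation case: with the new feature switched off (zero tidal amplitude), the modified code should reproduce the behaviour of the original one.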
Code recycling may save a lot of time, because it means we are not constantly reinventing the wheel. However, we need to tread carefully! This may sound obvious to a software engineer, but it might not be obvious to a research scientist. The most important question in code recycling is whether you are familiar with the programming language, especially if it is a low-level language. You also need to be able to run the code with the same compiler and on the same platform on which the code was developed. If this is not the case, then before you attempt anything you need to learn the basics of the language, development platform and compiler. You will also have to become acquainted with the test cases and the applications for which the code has already been used, and make sure that you understand their limits.
I strongly advise against trying to change a code before you have become well aware of the code's quirks and limitations. Once this has been done, you can start to mould the code so it can provide answers to your research questions. Finally, if the practices described in the Best Practices for Scientific Computing paper are followed, software recycling will be much easier, more rewarding and, most importantly, correct.