Save scientific legacy code!

Author(s)
Aleksandra Pawlik

SSI fellow

Posted on 13 October 2011

Posted by s.hettrick on 13 October 2011 - 2:58pm

By Aleksandra Pawlik, one of the institute's Agents.

The maintenance of scientific legacy code gives many scientists (and software engineers) a major headache. Supporting users, adding new functionality and fixing bugs cause problems to accumulate, until it seems easier to abandon the software and develop it again from scratch. But freezing the legacy code for what may be a few years of rewriting means that new contributions have to wait until the rewritten software is released. Is there a solution that enables continuity of legacy software development, yet makes it possible to keep the software up to date and user friendly?

Scientists studying molecular physics found themselves facing this question several years ago. They work on two projects, UKRmol-in and UKRmol-out. Their software, which could be traced as far back as the 1970s, proved to be a useful and sometimes crucial tool. Unfortunately, it had become difficult to use and maintain. It started to look like the software was heading down a blind alley, but a few scientists managed to find a way forward.

The code had a number of contributors from different research institutions, each with different experience, skills and programming practices. There was no comprehensive methodology or set of guidelines for those who, over the years, had developed the software. In fact, the use of FORTRAN was probably the only shared feature of the contributors’ work. There was also no continuity in the work of the scientist-developers. Contributors were typically PhD students or post-docs, whose main goal was to progress their careers, rather than provide sustainable scientific software.

Documentation was scarce; rich and meaningful comments in the source code were rare, and some parts of the software were only documented by hand on a few sheets of paper (these sheets were carefully photocopied for each newbie to the project). Essentially, it was difficult for new users to get the software working without direct support from someone who already had significant experience with the code.

The scientists applied for an EPSRC grant from the HPC pool. Their ultimate goal was to add substantial new physics to the software. But first, they argued, it would be essential to turn the software into a modern, efficient and sustainable package. This sustainability argument convinced the EPSRC, which allocated the funding.

Solid re-engineering and a professional approach to code development were the first steps to take. The Principal Investigators employed a full-time postdoc, whose main task was to clean up and refactor the legacy code, rather than focus on increasing the project’s number of publications. A coding standards document was prepared, which meant that anyone who contributed to the code had to adhere to prescribed standards. The SPAG tool for FORTRAN Code Restructuring was used to address the issues with the spaghetti code written in early FORTRAN versions. The project was moved to CCPForge, which enabled the team to take control over the source code via the incorporated SVN repository. Moving to CCPForge also improved communication between members of the development team and the software’s users, by providing access to a forum and mailing lists. It also allowed the documentation, which is still being developed and updated, to be stored in a single location. The scientists who develop the software are not co-located, so a regular team meeting was introduced, during which the team discusses plans for development and reports on progress.

The re-engineering of this software is still in progress. The benefits include not only the updated software, but also a set of best practices in scientific software development, which can now become a benchmark for the wider scientific community.
 
