Is the work of scientific software engineers recognised in academia?

Posted by s.hettrick on 23 April 2012 - 3:47pm

By Ilian Todorov, Advanced Research Computing Group, STFC.

This article represents my personal point of view. It is related to Dirk Gorissen’s blog post “The researcher programmer, a new species?” and discussions from the “Scientific Software Development and Management” group page of LinkedIn, which started after the Software Sustainability Institute’s Collaborations Workshop 2012 (CW12). These discussions pertain to why the software engineer in academia needs recognition.

Software has become a technique of choice for many scientists. It is often considered to be free, but this usually means “free to academia”. Somewhere down the line, someone has paid for it. Someone has invested their labour in writing code instructions to implement a scientific methodology of some kind. In most cases, this was a postgraduate (PG) student and/or post-doctoral (PD) researcher attempting to automate and simplify the workflow of their research routines.

Times have changed enormously in the last 20 years. However, for the researcher in academia, one thing has remained constant: their career progression is based on their research performance, as measured by the impact of their research papers in peer-reviewed journals. More papers in high-impact journals lead to more success and recognition, and better chances when applying for funding or academic jobs. In contrast, software development has diversified. It has become a well-defined profession with many sub-fields and computer languages. This is not surprising for an industry that governs our lives at home – PCs, games, smart devices, apps – at work and anywhere we go – databases, financial transactions, GPS, industry. It has also become a discipline in its own right in higher education as “informatics” or “computer science”.

Software developers are employed to create code that facilitates scientific research – they contribute to research, but they do not write papers. This means that their position in academia has often been seen in the same light as that of a support scientist or lab technician. Is that a fair comparison? Like scientists, software engineers test their hypotheses and solutions for correctness by subjecting them to well-defined test cases. Like researchers, they keep abreast of the cutting edge of scientific methodologies in order to develop and prove them worthwhile. Like lecturers, they develop pedagogic skills to teach and train PG/PD researchers. Like academics, they can write papers and grant proposals, as well as technical reports and manuals. Unlike most academics, they keep abreast of relevant IT trends – numerical algorithms, computer languages, analysis tools and hardware.

I argue that every scientific software developer is an academic researcher at heart, but not every academic researcher is a scientific software developer. Even though scientific software developers deliver mainly code, it does not mean that they do not contribute to research. Research and scientific software development are neither mutually exclusive, nor fully inclusive activities - they are complementary.

It is not the naming of the new species that defines the problem, but the lack of recognition and appreciation of the work, role and skill of the scientific software engineer in both academia and industry. There is no clear career progression path for scientists who devote themselves to research software development, and commercial software houses do not recognise the skills that developers gain in academia. The lack of career security often makes talented software developers move to industry. The impact of losing a scientific software developer is usually much greater than that of losing a researcher because, I believe, it is much harder to cultivate the unique combination of skills and knowledge required by a developer.

Until recently, software development in academia has been viewed as an uninteresting means for achieving interesting research. Research Councils have had no funding policy for software development and sustainability, which has led to the need to disguise such development in grant proposals. As a consequence, software development has been carried out in a cash-starved environment, where software developers migrate from one project to another. And when a research project ends, so does the work of maintaining the software that it used. In many cases, development focuses on meeting the minimum requirements of a specific scientific research case, rather than investing in software that is more generic and can be re-used by other research projects. Without other research projects as future stakeholders, Principal Investigators can only extend the life of their software by extending the life of their project through grant renewals. All of this means that it is challenging to produce reusable software that can last beyond the scope of any one project. Some software projects die (GAMESS-UK), some freeze in time (SHELL), and others move overseas (GULP), leading to what I consider a loss of irreplaceable assets: scientific software engineers and their expertise.

Of course, my views have been exaggerated in order to expose the problems with software development and sustainability in academia, and the limited awareness (and even more limited action) of the universities and research councils in the UK. The reluctance to act comes from the high price of proper software development and support (let alone sustainability), the lack of clarity over IP and ownership of software, and the limited understanding of how to benchmark software developers’ skills. Solutions may come from industry (NAG Ltd., Wolfram Research Ltd., Scenomics Ltd., DEShow Ltd.), where I am convinced that software development is funded by at least an order of magnitude more than in academia. It is worth pointing out the obvious fact that the post-doc salary is much less than that of the professional software developer.

In this article, I have only criticised, and have not talked about the changes in UK academia and the Research Council environment which, although they do not solve the problems outlined here, are an attempt to address them. The Research Councils have supported some software development and services, such as the CCPs, e-Minerals and programmes such as e-Science. EPSRC has provided funding for application support via the CCPs; for software development support via HEA (Daresbury Laboratory); and for software HPC optimisation via distributed computational and software engineering (dCSE) projects serviced by NAG Ltd. Over the last decade, EPSRC has issued just one software engineering call, in 2010 (which was very weakly defined). However, this year EPSRC announced software development fellowships, which are a step in the right direction. On the university side, there has been some positive activity: computational science and engineering has been established as a course by the EPCC and recognised as a different subject from computer science. HPC and parallel programming training courses are now widespread and are regular events for PhD students in natural sciences.

The situation of scientific software developers in academia still remains pretty dire. The demand for software development has risen, but the availability of scientific software developers has not, due to poor long-term planning and succession policies. Millions of pounds are spent annually on commodity clusters in UK academia and still it is the computers that are considered to be the commodity rather than the software and the developers...