Bridging the gap: Convincing researchers with different backgrounds to adopt good (enough) software development practices

Posted by s.aragon on 9 February 2018 - 8:24am

By Stuart Grieve, Research Software Developer, University College London, Eike Mueller, Lecturer in Scientific Computing, University of Bath, Alexander Morley, DPhil in Neuroscience, University of Oxford, Matt Upson, Data Scientist, Government Digital Service, Richard Adams (Chair), Reader, Cranfield University, Michael Clerx, Post-doctoral researcher in Computational Cardiac Electrophysiology, University of Oxford.

This blog post was motivated by a discussion amongst academics and research software engineers from different disciplines on the challenge of writing good, sustainable software in teams with different backgrounds. Specifically, how can a mixed team of, say, scientists, librarians, engineers and project managers be encouraged to write good software together?

Our discussions led us to two broad recommendations: first, to ensure that research software engineers (RSEs) are firmly embedded in their research groups to improve the flow of information; and second, to highlight the role of champions and ambassadors in spreading good practice. This requires the development of project infrastructure to facilitate and recognise these roles.

So what are the issues?

Multi-disciplinary teams present unique opportunities for science and industry, but also come with their own peculiar difficulties. There can be communication barriers between different disciplines and specialisms, and this can be exacerbated when parties have conflicting, and often unspoken, expectations for the project. For example, research scientists may regard high-quality journal papers as the primary outputs of the project, and see software as ‘merely’ a discardable tool to help achieve these objectives. In this view the quality and sustainability of software is unimportant, as long as the software meets its goal, and any time spent ‘polishing’ it is wasted. Conversely, amongst RSEs and industrial collaborators software assumes a much greater importance and might even be regarded as the primary output of the project (the ‘finished product’). In this view, questions about software sustainability naturally arise: is it well documented? Easy to maintain? Will it scale to bigger problems? With a focus on these questions, the functionality and fitness-for-purpose for the current project can sometimes seem to take second place, which can puzzle (or even frustrate) researchers focussed on more immediate goals. On the other hand, it can be challenging for RSEs to contribute productively to a project in this environment.

While there are careful trade-offs to be made in these situations, as RSEs we care strongly about code that not only works but is good code: extensively tested software which has a well-documented development trail and is written in a transparent, sustainable way. We are convinced that good code leads to better research, as measured by quantity and quality of papers. In addition, making code (typically created using public money) publicly available enhances the impact of the research and greatly increases its value to society.

Frequently, however, in project design, the quality of the resulting software appears not to be regarded as an important consideration. One consequence of this can be insufficient resource allocation to this component of a project which, when combined with the other challenges of working in multidisciplinary teams, can cause confusion, delay and a lack of consensus. This then leads to extra time spent on testing, refactoring, continuous integration and rewriting existing (now outdated) documentation, if good software engineering is to be preserved. This time may then be perceived as ‘wasted’ by the other stakeholders in the project, which can reinforce the view of good software practices as costly and unnecessary.

Looking sideways: lessons from other sectors

Mixed teams offer unique opportunities for the acquisition of good (enough) software development practices, such as through cross-disciplinary learning. This was felt particularly to be the case where a project involves experts from industry or government, who often have a more systematic approach to software development and see code as an asset in its own right. This raises the question of how lessons from these domains can be brought into the realm of academia.

Take the case of an industrial software project, for example, in which the customer is often involved in planning throughout the project, and has a say on software-related issues. We had the impression that industrial projects tend to adopt more formal (often agile) project management practices, resulting in better scoping and resource use. Industrial projects were considered to adopt better practice in motivating, recognising and rewarding good software development. In contrast, a different model was noted in academia where junior researchers and PhD students are used as a cheap resource to produce software “that’ll do the job”, with little regard to subsequent use. Of course, this assessment requires further investigation but, if supported with evidence, it raises the question of how academic practice can learn lessons from industrial and governmental practice.

Ideas for a solution: good communication practices, starting small, and embedded champions

Having identified conflicting expectations and differences in communication preferences as challenges to adopting good software development practices, our conclusions emphasised the importance of good communications practices, and identified a range of good and exciting examples of successful knowledge transfer between RSEs and scientists.

We felt that it is particularly important to adopt mechanisms that get RSEs more involved in the domain of the research projects of which they are a part. A good example was one RSE who found themselves learning an ancient Sumerian language for one project. As a result, they became more engaged in the research and were better able to appreciate the needs of the team.

Clearly, there is a reciprocal to this example: researchers may benefit from developing their own understanding of the ways in which testing and good software development practices will help them. This is often best motivated by starting with practices that have immediate benefits and a low cost of entry, such as testing and version control. However, exemplar cases of such practice are difficult to find, as the world of software development remains a black box for most academics. Outreach to these communities was seen as critical to addressing the challenge. While it may take time, we believe it will ultimately lead to improved take-up of good software development practice and better science.
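As a rough illustration of how low the cost of entry for testing can be, the sketch below pairs a small analysis function with a single automated check. The function `normalise`, its test and the numbers used are hypothetical, invented purely for this example; they are not taken from any of the projects discussed here.

```python
def normalise(values):
    """Scale a list of numbers so that they sum to 1 (hypothetical example)."""
    total = sum(values)
    if total == 0:
        # Fail loudly rather than silently dividing by zero.
        raise ValueError("cannot normalise values that sum to zero")
    return [v / total for v in values]


def test_normalise_sums_to_one():
    # The defining property: the rescaled values should add up to 1.
    result = normalise([2.0, 3.0, 5.0])
    assert abs(sum(result) - 1.0) < 1e-12
    # The relative proportions of the inputs should be preserved.
    assert abs(result[0] - 0.2) < 1e-12
    assert abs(result[2] - 0.5) < 1e-12
```

A check like this can be run with a test runner such as pytest, or simply by calling the test function directly. Committing the function and its test to version control together is one way a researcher might begin with both practices at once.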

People often resist being told how to do things by an outsider, and innovation adoption theory warns of the dangers of ‘Not-Invented-Here-Syndrome’, a reluctance to take on ideas developed outside one’s own specialist domain [1,2]. It is therefore very important that RSEs are firmly embedded in the team and involved in all stages of the project. Getting on board in the early phases of a project (or even before the start) is crucial to build trust and to set up pathways for adopting good software development techniques. We discovered that lots of people have already taken on informal roles as champions or ambassadors for good software in their own institutions. Surprisingly, even very simple forms of recognition such as the use of stickers can help to raise a champion’s profile (though they don’t usually appear on CVs). We agreed that this role needs to become much more formalised to get proper recognition. The exact nature of the reward/motivation for an RSE in an academic setting (which measures success by research output, i.e. papers) also remains unclear. Recently, journals have emerged which are explicitly trying to fill this gap by covering research software development; see e.g. [3,4]. The establishment of central RSE groups at several universities [5] is a big step in the direction of formalising the role of those champions inside research groups and recognising their contributions.

We also agreed that it is particularly important to get senior academics involved, since they can have a direct impact through shaping curricula and motivating their colleagues and students to write better code. While we are cautious of suggesting solutions that add to an already stretched workload, one potential model is to identify current researchers who are already well-versed in research software practices and support them via a formal “Research Software Champions” network within/between institutions. This closely tracks what is currently being piloted at the University of Cambridge with regard to best practices in data management [6]. In terms of practical advice for introducing these practices to academics, many of us have found that introducing small, modular “good-enough” practices is more effective than trying to instill a full stack of changes into an individual’s workflow at once. In the same breath, we note the importance of recognising that not all academics want to be experts in software engineering, but many are happy to be introduced to tools that will make their lives easier.

Finally, funding can be an issue, since traditionally the only way of adding an RSE to a research grant was as a Co-PI, which can be challenging as RSEs need to be shoehorned into this role as postdocs. With the establishment of central RSE groups, however, this problem seems to be becoming less pronounced, since RSEs can now be added to the project in their formally defined role. Combining this with a well-written software management plan can strengthen a proposal significantly.

References

[1] Katz, R. and Allen, T. J. (1982). ‘Investigating the not invented here (NIH) syndrome: a look at the performance, tenure and communication patterns of 50 R&D project groups’. R&D Management, 12, 7–19.

[2] Rogers, E. (2003). Diffusion of Innovations, New York, Free Press.

[3] Geoscientific Model Development www.geoscientific-model-development.net/

[4] Archive of Numerical Software http://journals.ub.uni-heidelberg.de/index.php/ans/

[5] RSE groups at various universities, e.g. www.ucl.ac.uk/research-it-services/research-software-development (UCL), www.itservices.manchester.ac.uk/research/services/software/ (Manchester)

[6] Data Champions at the University of Cambridge. https://www.data.cam.ac.uk/intro-data-champions