Overcoming barriers to adopting software best practices in research

Posted by s.aragon on 7 December 2017 - 9:33am

By Alys Brett, UK Atomic Energy Authority, Sam Cox, University of Leicester, Carina Haupt, German Aerospace Center (DLR), and Jason Maassen, Netherlands eScience Center.

When we talk in general terms about software development practices, most people will nod along, maybe slightly nervously. From experience, we know that it can be hard for some commonly accepted good practices to gain traction, or to be sustained once the initial enthusiasm has faded. However, this can often be overcome if standard approaches are adapted to better fit a research context.

What are the barriers?

The research sphere is very varied, but there are a number of recurring barriers to adoption of best practice in the research context.

The career histories of researchers create a wide range of skill levels - while some researchers are already used to best practice in a number of areas, others have a very basic level of experience. While in a corporate context most team members may be at a similar level due to shared or similar backgrounds, providing training in research is often more difficult due to this lack of common ground. The varied requirements of the final project can also make it difficult to state clear guidelines.

Personal motivation can be an issue, as adopting best practice often requires a non-trivial upfront cost in terms of time or effort. In particular, researchers may not see a direct link between this effort and future papers, funding opportunities, or increased status within their field. Personal perception of the “status” of various roles, such as software engineering, may reduce motivation to invest in these practices.

The lack of a software engineering peer group can be a further barrier. When working in a large team, each individual member can benefit from the expertise of the other team members, and the entire team learns from the mistakes of the individual members. Researchers, however, often develop software on their own and need to acquire all expertise alone. For an isolated researcher, learning first hand from only their own mistakes can take many years.

Researchers may not receive sufficient resources in terms of support, time, or institutional permission (either actual or perceived) to dedicate to adopting new best practices that they do not already use. Institutions often judge researcher performance on numbers of publications or grants, not software quality.

It is important to remember that many experts may interact with the software world only by necessity. By moving outside of their chosen field the researcher is immediately in an unfamiliar scenario in which they are no longer the expert, and this can be uncomfortable or embarrassing. The use of tech jargon alone can be off-putting to researchers.

Finally, high levels of expectation can be off-putting and industry standards can actually act as a deterrent to starting out. Industry standards sometimes go beyond what is necessary in a research context, and this can prevent initial progress. The definition of “good enough” varies with the context.

There are some cases where the good practices themselves should translate directly into a research context, for example the use of version control, but the way they are introduced and supported needs to be adapted. There are other cases where something commonly regarded as an “industry standard” does not work well for the typical research software developer.

SCRUM is a good example of a best practice that many domain researchers know of and regard as a good way to improve their software development. In reality, however, the SCRUM process rarely fits the research environment.

The most basic requirements of SCRUM are that you should have at least five team members to cover all roles and that developers should be interchangeable.

The first requirement is met by only a few research software projects, especially if you exclude PhD students. The second requirement is problematic in two ways. Firstly, PhD students are generally assigned a fixed task in a project and are not available to work on the next open task. Secondly, and most importantly, researchers are experts in their own highly individual fields, so they are most likely unable to pick up the next open task of a research project, since that task probably requires highly specialised knowledge.

Instead of reaching for the newest development trend, the project environment has to be analysed and an adapted process introduced. The basic ideas of agile development are compatible with research projects and should be used here, including allowing changes to requirements, always having working code, and using short release cycles. Choosing the right process is no easy task and requires some level of experience, but it can be achieved by asking for help. It is a step-by-step process of learning and steadily adapting to the ever-changing world of research environments.

Documentation is one area where the barrier does not lie in the practice itself but in the way it is introduced or adopted in various research software contexts. Many of the barriers discussed earlier are present here: industry standard expectations and jargon can scare people away from taking sensible first steps. This is a shame, because the benefits of going from no documentation to capturing some basics are immense. Researchers asked to document their software often fear that it will take up huge amounts of their precious time and that complex tools and diagrams will be required.

One solution is to use a “maturity model” approach where the people writing the software are encouraged to identify where they are at each stage and to take appropriate next steps if there is a need to increase the level. For documentation this could start with some notes in a readme file to make sure others know how to build and run the software. A further level might be some structured documentation for different audiences in a form that is easily accessible and convenient to keep up to date, perhaps including some basic diagrams or flowcharts. More advanced levels would involve introducing tools for generating documentation, UML diagrams etc.
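As a rough illustration of how such a model could be turned into a lightweight self-check, the sketch below inspects a project folder and reports a descriptive documentation level together with a possible next step. The level labels, the suggested next steps, and the file conventions it looks for (a README file, a docs folder, and a configuration file for Sphinx, MkDocs, or Doxygen) are assumptions made for this example and are not part of any established model.

```python
from pathlib import Path

# Hypothetical labels and next steps, chosen for illustration only.
NEXT_STEP = {
    "no documentation yet": "add a readme explaining how to build and run the software",
    "readme notes": "add structured documentation for different audiences",
    "structured documentation": "consider a documentation generator if the project needs it",
    "generated documentation": "keep the documentation up to date",
}


def documentation_level(project_dir):
    """Return a descriptive label for the documentation found in project_dir."""
    root = Path(project_dir)
    docs = root / "docs"

    # Assumed conventions: any README* file counts as readme notes.
    has_readme = any(root.glob("README*"))
    # Assumed convention: a non-empty docs/ folder counts as structured documentation.
    has_docs = docs.is_dir() and any(docs.iterdir())
    # Common configuration files for Sphinx, MkDocs, and Doxygen respectively.
    has_generator = (
        (docs / "conf.py").is_file()
        or (root / "mkdocs.yml").is_file()
        or (root / "Doxyfile").is_file()
    )

    if has_generator:
        return "generated documentation"
    if has_docs:
        return "structured documentation"
    if has_readme:
        return "readme notes"
    return "no documentation yet"


def suggest_next_step(project_dir):
    """Pair the current level with one possible next step."""
    level = documentation_level(project_dir)
    return level, NEXT_STEP[level]
```

Calling `suggest_next_step(".")` in a project folder returns the current label and one suggestion. Whether that next step is actually worth taking still depends on what the project needs, so a check like this is a conversation starter rather than a target to maximise.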

As a general concept: start small and then go as far as necessary. Reaching for the perfect software development approach is intimidating and overwhelming, and it is not the task of a researcher nor necessary for most research projects. A maturity model can help researchers identify where they are and where they should be, but the model alone is not enough. Support elements like a simple flowchart for maturity level selection, or checklists for analysing the status of a project, can help to lower the entry barrier. Restricting the use of tech jargon to a minimum and offering explanations where necessary can help, too.

The mentality of the research field also has to be taken into account. For example, if you number the maturity levels, those numbers suggest a level of quality, and people who are used to reaching for excellence will try to reach the highest level instead of the appropriate one. Labels for the maturity levels should therefore be carefully chosen. There are many reasons why researchers do not adopt software best practices, and while some may seem to be personal issues they are ultimately caused by the environment researchers work in.

The truth is that overcoming these barriers is hard and researchers need help, not only to overcome the barriers but also to figure out what their goal is. Often, the answer is not to adopt best practices wholesale, but to find out which practices work best for them.