Identifying and overcoming obstacles to adopting code review

Posted by j.laird on 17 August 2022 - 1:32pm

Photo by Interactive Sports on Unsplash

By Hannah Williams, Matthew Bluteau, Catherine Smith, Nadine Spychala and Katy Brown.

This blog post is part of our Collaborations Workshop 2022 speed blog series.

Code review can provide useful feedback for a software developer at any stage of their career and in any working context, be it industry or academia. However, routine code review is not typically part of research projects in academic contexts. While the reasons for this may be manifold, the focus of this blog post is on four of the barriers to be considered when introducing code review into a research software context: lack of willingness to let one’s own code be reviewed; lack of code reviewers to do the job; lack of concrete goals for code review; and lack of time. To improve the adoption of code review in research, we need a cultural change, both from the top down and the bottom up, and to identify and work on the obstacles that currently make code review hard.

Willingness

One of the first barriers to code review is willingness to participate. Code review will only become the norm if people:

are happy to subject their code (and that of other members of their group) to review,
review code for others,
and, as Principal Investigators (PIs) or group leaders, encourage code review as part of the normal day-to-day job.

This requires researchers at all levels, from senior PIs to students, to be on board.

For many early career researchers, a reluctant PI can be a particularly difficult barrier to overcome. To a considerable extent, the change needed to combat this has to come from the top down, with policies and incentives from funders and institutions pushing senior staff to encourage practices such as code review within their team. Providing resources and examples that explain the value of code review in terms of accuracy of results, usability and maintainability of code could help incentivise PIs to support it. Grassroots-level work from junior researchers also plays a huge role in normalising this practice and bringing change from the bottom up. We’re hopeful that we’re in a transitional phase in which today’s early career researchers, as they progress through the system, will encourage good practices such as thorough code review.

Willingness to participate in code review also requires researchers to be okay with others looking critically at their code - which can be daunting. Normalising showing code to peers, such as comparing approaches to short coding challenges, can be a good step towards feeling comfortable with code review. Including inexperienced coders as reviewers, as well as reviewees, can also help code review become standard practice within a group - any reviewer at any level can provide valuable feedback and can learn from conducting a review. To find out more about how to do a review of someone else’s code, this blog post from Ariel Rokem is a good first introduction.

Finding a Reviewer

Code review is inherently a multi-party activity, so it requires researchers to find someone else willing to volunteer their own time. This can be a difficult step in the process for some of the reasons just discussed and the fact that code review is not part of the research process as it currently exists. Therefore, requesting code review is tantamount to asking someone to do something above their “expected” obligations as a researcher, and it is an activity that is not directly recognised or rewarded. Acknowledging that you are asking a potential reviewer for a favour will set the correct tone for all future interactions.

The obvious place to look for a code reviewer is within your own research group, but this can be fraught with hurdles. If you are the only code developer in your group, then no one else will be able to give the feedback you require. Alternatively, your PI may be hostile to any activities that do not directly produce research outputs, creating a climate of hesitancy to engage in code review amongst your colleagues. Furthermore, there is the risk that only having reviewers from your group will produce a “groupthink” scenario in which better or cleaner solutions are overlooked. Nevertheless, researchers from your own group are a great potential pool of code reviewers that should be explored.

Looking outside of your research group to find a reviewer may be your best option. Again, there are some hurdles you may need to overcome. The first is knowing where to look for someone suited to commenting on your code. Central Research Software Engineering (RSE) groups can be a great resource, but they are still comparatively rare, and, anecdotally, they might not have the structures in place to handle the short-term work that code review requires. You can check out a more extensive list of places to find reviewers on the Research Code Review Community website, under Step 1. One of the authors (MB) was involved in a working group to craft these guidelines and they are a good resource for further reading after this blog post. The online sustainability evaluation or Research Software Health Check provided by the Software Sustainability Institute may also be helpful, particularly for issues that affect the sustainability of your software.

In the end, it is worthwhile spending time to find a suitable reviewer, but don’t spend too much time trying to find the perfect one. Someone who is proficient in your code’s programming language will almost certainly have something valuable to contribute in a code review.

Clarifying goals of the review

One can benefit most from code review if clear objectives are established beforehand: do I need a major refactoring? Do I want to enhance readability or include additional features? Do I need a review of one section or of the code in its entirety? Before asking for a code review, it’s good for the reviewee to communicate what they think is needed. If this is difficult to determine, it can also be part of the reviewer’s job to clarify those goals together with the reviewee.

While review of the entire code architecture can be very useful for one’s own learning and for improving code quality, it is also very time-consuming for both the reviewer and the reviewee. One way to be more time-efficient is to seek RSE input on architecture design at an early stage, rather than leaving the design to researchers alone.

Apart from the technical aspects of writing good code, one also has to make sure it does what it is supposed to do from a content point of view. This requires deep knowledge of both the code and the domain. Including proper testing in the code, as well as peer review to obtain feedback from domain experts, can help increase certainty that the code meets its objectives.
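As a rough illustration of what "proper testing" might look like alongside review, here is a minimal sketch in Python. The function `normalise` and its tests are hypothetical and not taken from any project mentioned in this post; they only show how a domain expectation (proportions sum to 1) and a hand-calculated answer can be written down as checks that a reviewer or domain expert can read and challenge.

```python
import math


def normalise(values):
    """Scale values so they sum to 1 (a hypothetical analysis step)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    return [v / total for v in values]


def test_normalise_sums_to_one():
    # Domain expectation: normalised proportions always sum to 1.
    result = normalise([2.0, 3.0, 5.0])
    assert math.isclose(sum(result), 1.0)


def test_normalise_matches_hand_calculation():
    # Known answer worked out by hand: 2 / (2 + 3 + 5) = 0.2
    result = normalise([2.0, 3.0, 5.0])
    assert math.isclose(result[0], 0.2)


def test_normalise_rejects_zero_total():
    # An all-zero input has no meaningful proportions, so it should fail loudly.
    try:
        normalise([0.0, 0.0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an all-zero input")
```

Tests written this way can be run with a test runner such as pytest, and they give a reviewer without deep domain knowledge something concrete to check the code against.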

Conclusion

Looking for and receiving code review is a time-consuming activity - it requires personal resources to overcome the lack of external incentives, find a suitable reviewer, formulate a clear request, and incorporate the feedback. Offering code review to someone else requires additional time.

But the benefits will pay off. Making errors when writing code for scientific data analysis is normal, and while some of those errors might be small and won’t substantially affect the reported results, others will have much bigger consequences that one would like to avoid. Code review can therefore help increase the accuracy of results and improve the usability and maintainability of code, as well as providing a great opportunity to learn. This is especially helpful for those who have not received formal training in software engineering, but for whom writing a significant amount of software is a normal part of their work.