Research Software Group update: software testing and a published paper

Posted by n.chuehong on 18 August 2015 - 5:38pm

By Steve Crouch, Research Software Group lead.

This is the first in a series of blog posts by the Institute's Team Leads to provide an insight into the day-to-day work of the Institute.

The Institute is once again holding its Open Call for Projects, and we're starting to see applications rolling in. So if your project develops research software and you'd like some free expert help, why not consider submitting an application? We work with projects from any discipline, and in the last two months we've helped two groups in the fields of biomolecular simulation and data provenance develop the means to test their software, and had a paper published with one of our projects in the area of biological data visualisation.

The Institute's Research Software Group holds its Open Call about twice a year, and we've just opened the latest round, which closes on 30 September 2015. Since 2010, we've worked with over 50 projects to help improve their research software.

Ensuring correctness in scientific codes

We're working with Jonathan Essex's Research Group at Southampton to develop an automated test suite to ensure the correctness of their ProtoMS molecular simulation software, developed in Python and Fortran. This work follows on from our review of their code base, and it's great to hear they've already implemented most of our recommendations!

We're finding that much of our Open Call project work increasingly focuses on ensuring the correctness of scientific codes. This is the fourth most popular area for collaboration requested through our Open Call, with only the areas of development process, maintainability and usability ranking higher. This is something I find heartening, since you can find so many examples out there of research codes leading to erroneous results and even paper retractions.

With other researchers and pharmaceutical industry partners interested in ProtoMS, the automated test suite will help ensure that correctness is maintained whilst the code is developed. This represents an important step towards ProtoMS becoming a more complete software package, and enables the team to validate the software with both the GNU and, more recently, Intel Fortran compilers. Based on a testing strategy created by the team, an initial suite of test cases has been developed over the past two months, which validates the software's results against a set of known correct data. Interestingly, the development of the test suite itself has already uncovered some minor assumptions in software deployment that are being rectified.
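To give a flavour of what validating results against known correct data can look like, here is a minimal sketch in Python. It is not the ProtoMS test suite: the function names, the reference values and the tolerance are all assumptions made up for illustration. A tolerance is used rather than exact equality because different compilers can legitimately produce slightly different floating-point results.

```python
import math

# Hypothetical reference values standing in for a stored file of known
# correct results; these numbers are made up for illustration.
REFERENCE_ENERGIES = [-1523.4071, -1519.8825, -1521.0460]


def run_simulation():
    """Hypothetical stand-in for running the simulation code.

    A real test would launch the program and parse its output. Here we
    return fixed values that differ from the reference only by tiny
    floating-point noise.
    """
    return [-1523.4071000001, -1519.8825, -1521.0459999998]


def matches_reference(results, reference, rel_tol=1e-8):
    """Return True if both lists have equal length and agree within rel_tol."""
    if len(results) != len(reference):
        return False
    return all(
        math.isclose(got, expected, rel_tol=rel_tol)
        for got, expected in zip(results, reference)
    )


def test_energies_match_reference():
    """A regression test: fresh results must agree with the stored ones."""
    assert matches_reference(run_simulation(), REFERENCE_ENERGIES)
```

A function written this way can be collected by a test runner or simply called directly. The choice of tolerance is a judgement call for each project, since too tight a value flags harmless compiler differences and too loose a value hides real regressions.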

Ensuring correctness for a different type of software

We're also working with the Web and Internet Science group, again at Southampton, to develop an infrastructure with a different testing focus.

The Southampton Data Provenance Tool Suite is a suite of software, libraries and services to capture, store and visualise provenance compliant with the World Wide Web Consortium (W3C) PROV standards. A core function is its ability to convert documents compliant with a given PROV specification into any of the others, and there is a strong need to ensure these conversions are correct. The test harness runs conversions across every combination of every possible pair of documents within every test case - with 5 different PROV representations, this means a total of 120 conversions per test case. Quite a job to do manually!

So instead of testing the correctness of generated research data, the test infrastructure verifies the correctness of these conversions. An added bonus is that the test harness also has its own unit tests, so it can test itself to see if it is operating correctly.
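The pairing idea behind this kind of harness can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the harness itself: the format labels, the convert and equivalent callables, and the toy documents are all hypothetical. It covers only the pairing step, that is, ordered pairs of distinct formats, which for five formats gives 20. It does not attempt to reproduce the figure of 120 conversions quoted above, whose breakdown is not given in this post.

```python
from itertools import permutations

# Format labels are assumptions for illustration; the post only says that
# there are 5 different PROV representations.
FORMATS = ["provn", "ttl", "trig", "provx", "json"]


def conversion_pairs(formats):
    """Return every ordered (source, target) pair of distinct formats."""
    return list(permutations(formats, 2))


def check_conversions(convert, equivalent, documents):
    """Run every pairwise conversion and return the pairs that failed.

    convert(text, source, target) and equivalent(a, b) are hypothetical
    callables supplied by the caller. documents maps each format name to
    the text of the same test case expressed in that format.
    """
    failures = []
    for source, target in conversion_pairs(sorted(documents)):
        converted = convert(documents[source], source, target)
        if not equivalent(converted, documents[target]):
            failures.append((source, target))
    return failures


# Toy demonstration: each "document" is just its format name plus a payload,
# and "converting" swaps the format name at the front.
docs = {fmt: f"{fmt}:example" for fmt in FORMATS}


def toy_convert(text, source, target):
    return text.replace(source, target, 1)


failures = check_conversions(toy_convert, lambda a, b: a == b, docs)
```

In a real harness the interesting work sits inside the comparison step, since two documents in the same representation can be equivalent without being textually identical.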

Documenting the community success of BioJS

Last year we also collaborated with the BioJS project, a multi-partner effort coordinated by TGAC, which provides freely available infrastructure, guidelines and tools to represent biological data on the Web. So I’m also very pleased to announce that a paper written by the project that covers their community engagement experiences, to which we contributed, was published by eLife in July.

In 2014 we provided the BioJS project with an evaluation of their software, tools and documentation, which has helped to make adoption and development of BioJS components easier and more consistent.

Alongside this, the project has had significant success with its community engagement endeavours, growing from a small set of components into a community of 41 code contributors across four continents, an active Google Groups forum with over 150 members, and 15 published papers. The eLife paper covers the lessons learned by the project in achieving this and the factors that influenced the growth of the BioJS community. The team hope that their documented experiences will help other projects to build similarly robust open source projects and communities.
