Software and research: the Institute's Blog

Open-access publishing: trials of the transition

By Dr Robyn A Grant, Lecturer in Comparative Physiology and Behaviour at Manchester Metropolitan University.

I have mixed opinions about open-access publishing. Finding the money to cover the costs is not easy, especially for early career researchers during this transitional period as open access becomes the norm. Despite the costs, I really believe in open-access publishing. We want our science to be read, surely! Especially in this interdisciplinary era, it is important for non-academic stakeholders (such as patients, consultants, managers, developers, etc.) to have access to our outputs. And, of course, as academics, we are publicly funded, so outputs should be published for all to see.

We do not receive much money to cover the costs of open-access publishing. In fact, my university receives only enough to fund around two open-access publications each year. Don’t worry, I hear you cry, in this open-access era the costs of library subscriptions to journals will cover your publication costs. However, in this transition period of subscription fees being replaced by publishing fees, universities are still subscribing to journals while also trying to publish open access, in effect paying twice. If you have a Wellcome Trust or Research Council grant, this will cover your publishing costs, but of course if you are just starting out like me, you might not have a large grant yet. I guess I am just left counting my pennies to try to cover the thousands of pounds it costs to publish my papers under open access - amid rumours that only open-access papers will count in future research assessment exercises.

Something for everyone - resources from the CW15 demo sessions

By Shoaib Sufi, Community Leader

The Collaborations Workshop 2015 (CW15) and Hackday lasted three short - but highly charged - days, and attracted over 85 people to work on interdisciplinary research. As the rich set of resources created at the event is written up, we will be releasing these short posts to share the outcomes and tools with the community.

The CW15 demos covered a vast array of subjects: systems for creating data management plans (DMPonline); new ways of packaging software, data, papers and other associated resources to make Research Objects the new index of your research work; tools to help visualise data sets using Python (DAWN Science); workflow-oriented systems that make it easier to integrate the vast array of web-based datasets using a visual programming paradigm (such as Apache Taverna); and systems for cataloguing data-driven experiments (SEEK).

Bioinformatics tools, services and know-how had a huge showing, with strong representation from the Wurmlab. Important approaches to software development for researchers were covered, such as the user-centric design approach used to develop SequenceServer. The need to identify and fill gaps in software provision in bioinformatics was highlighted with the example of validating gene predictions, and how this need led to the GeneValidator software system. Crowdsourcing applied to improving the quality of research data offered a novel example of how techniques from other areas can improve research tools, as in the gene prediction system Afra.

Round-trip testing for Provenance Tool Suite

Family tree

By Mike Jackson, Software Architect.

Provenance is a well-established concept in arts and archaeology. It is a record of ownership of a work of art or an antique, used as a guide to authenticity or quality. In the digital world, data too can have provenance: information about the people, activities, processes and components, for example software or sensors, that produced the data. This information can be used to assess the quality, reliability and trustworthiness of the data.

Trung Dong Huynh, Luc Moreau and Danius Michaelides of Electronics and Computer Science at the University of Southampton research all aspects of the “provenance cycle”: capture, management, storage, analytics, and representations for end users. As part of their research, they have developed the Southampton Provenance Tool Suite, a suite of software, libraries and services to capture, store and visualise provenance compliant with the World Wide Web Consortium (W3C) PROV standards.
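The PROV model at the heart of these tools links entities (data), activities (processes) and agents (people or software) through relations such as "was generated by" and "was associated with". As a toy illustration of those concepts only (this is a hypothetical sketch, not part of the Provenance Tool Suite, and all names in it are invented):

```python
from dataclasses import dataclass, field

@dataclass
class ProvRecord:
    """A minimal, illustrative record of PROV-style relations."""
    entities: set = field(default_factory=set)
    activities: set = field(default_factory=set)
    agents: set = field(default_factory=set)
    relations: list = field(default_factory=list)

    def was_generated_by(self, entity, activity):
        # Record that an activity produced a data entity
        self.entities.add(entity)
        self.activities.add(activity)
        self.relations.append(("wasGeneratedBy", entity, activity))

    def was_associated_with(self, activity, agent):
        # Record which agent (person or software) ran an activity
        self.activities.add(activity)
        self.agents.add(agent)
        self.relations.append(("wasAssociatedWith", activity, agent))

    def lineage(self, entity):
        """Trace which activity produced an entity, and who ran it."""
        steps = []
        for rel, subj, obj in self.relations:
            if rel == "wasGeneratedBy" and subj == entity:
                steps.append((entity, "generated by", obj))
                steps += [(act, "associated with", agent)
                          for r, act, agent in self.relations
                          if r == "wasAssociatedWith" and act == obj]
        return steps

prov = ProvRecord()
prov.was_generated_by("results.csv", "run-analysis")
prov.was_associated_with("run-analysis", "sensor-pipeline-v2")
print(prov.lineage("results.csv"))
```

Querying the lineage of "results.csv" walks these relations back from the data to the process and the agent behind it, which is exactly the kind of trail used to assess trustworthiness.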

Building Artory: a personalised culture guide

By Christopher Hunt, Lead Developer at i-DAT, Plymouth University.

This article is part of our series: a day in the software life, in which we ask researchers from all disciplines to discuss the tools that make their research possible.

Organisations use a wide range of methods to measure how audiences engage with a cultural experience. These include audience surveys, focus groups, interviews, blog posts and a range of more experimental and creative methods. However, most of these methods are expensive, resource-intensive and lack a standardised set of metrics. Often they focus on evaluation data and lose sight of the users' incentives for leaving this feedback in the first place.

Getting to grips with EPSRC's policy on research data

By Neil Chue Hong, Director.

From 1 May 2015, organisations that receive EPSRC funding, and their researchers, are expected to comply with the EPSRC policy framework on research data. This sets out EPSRC’s principles and expectations concerning the management and provision of access to EPSRC-funded research data, in particular the principle that "research data is a public good produced in the public interest and should be made freely and openly available with as few restrictions as possible in a timely and responsible manner".

Archaeology with open-source software. It's getting easier

By Ben Marwick, Assistant Professor of Archaeology at the University of Washington.

This short post is written for archaeologists who frequently perform common data analysis and visualisation tasks in Excel, SPSS or similar commercial packages. It was motivated by my recent observations at the Society of American Archaeology meeting in San Francisco - the largest annual meeting of archaeologists in the world - where I noticed that the great majority of archaeologists use Excel and SPSS. I wrote this post to describe why those packages might not be the best choices, and to explain what one good alternative might be. There's nothing specifically about archaeology in here, so this post is likely to be relevant to researchers in the social sciences in general. It's also cross-posted on the Arc-Team Open Research blog to celebrate the inclusion of RStudio in the next release of their custom Linux distribution for archaeologists.

Top tips for running a small workshop

By Stephen Eglen, Software Sustainability Institute Fellow and Senior Lecturer at the University of Cambridge.

Late last year, I ran a workshop with the International Neuroinformatics Coordinating Facility (INCF) in Cambridge. It was regarded by all attendees as a success and it was suggested that we archive some tips for organising a small workshop. Here are those tips.

1. Get help with admin

We were incredibly lucky in that all the administration for the event was taken care of by the INCF, and in particular its program officer, Mathew Abrams. Everyone's travel plans were coordinated, and everyone stayed at the same (beautiful) college. Good admin is a fundamental part of a successful event, but it takes a lot of time to do well, so take any help you can get.

Going the Distance with natural abundance

By Alexander Hay, Policy & Communications Consultant, talking with Eric Rexstad, University of St. Andrews.

This article is part of our series: Breaking Software Barriers, in which we investigate how our Research Software Group has helped projects improve their research software. If you would like help with your software, let us know.

Abundance is a good thing not just for animals, but also for the researchers studying them. Studying it is, however, harder than it sounds, which is why it is an area of particular interest for Eric Rexstad, research fellow at the University of St. Andrews' Centre for Research into Ecological and Environmental Modelling.

The term for this is distance sampling, in which the population of a particular species in a given area is estimated from sightings along survey lines. For example, "how many harbour porpoises live in the North Sea?" as Eric puts it. Yet this leads on to more complex questions - in particular, how do animal populations react to perturbations or changes in the local environment, such as those caused by pollution or development?
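The core idea can be sketched in a few lines. The following is a rough, hypothetical illustration (not the St Andrews team's software): given the perpendicular distances at which animals were sighted from a survey line, fit a half-normal detection function g(x) = exp(-x²/2σ²), derive an effective strip half-width, and divide the count by the area effectively surveyed. The sighting distances below are invented.

```python
import math

def estimate_density(distances, transect_length_km):
    """Line-transect density estimate with a half-normal detection function."""
    n = len(distances)
    # Maximum-likelihood estimate of sigma^2 for a half-normal distribution
    sigma2 = sum(x * x for x in distances) / n
    # Effective strip half-width: integral of exp(-x^2 / (2 sigma^2)) from 0 to infinity
    mu = math.sqrt(math.pi * sigma2 / 2)
    # n detections over a strip of width 2*mu either side of the line
    return n / (2 * transect_length_km * mu)

# Invented perpendicular sighting distances (km) along a 10 km transect
sightings = [0.05, 0.12, 0.20, 0.08, 0.15, 0.30, 0.10]
density = estimate_density(sightings, transect_length_km=10.0)
print(round(density, 2))  # animals per square km
```

Real distance-sampling analyses add truncation distances, alternative detection functions and variance estimation, but the sketch captures why the raw count alone is not enough: animals far from the line are less likely to be seen, and the detection function corrects for that.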

Collaborations Workshop 2015 - an electric mix of people!

By Shoaib Sufi, Community Lead.

The Collaborations Workshop 2015 (CW15) took place last week in Oxford. It brought together an electric and buzzing mix of people with an interest in research software, and was the biggest Collaborations Workshop to date.

An inspiring keynote, a raft of lightning talks, wide-ranging discussions, demos and intense hacking allowed people to explore new ideas and gain advice from experts. With funders, researchers, developers, publishers and managers in attendance, the workshop represented views from every position in academia.

There are many outputs from the CW, which we have made available so that even people who could not attend the event can benefit from the discussions that took place. On the CW website, you can find summaries of the discussions, collaborative ideas, Hackday pitches, slides from the keynote and lightning talks, and the software written during the Hackday. Many of these resources are already available, and more will be available in the coming weeks.

Scientific Data Analysis with Java: DAWN

By Steve Crouch, Devasena Inupakutika, Alun Ashton, Mark Basham and Matthew Gerring

Scientific projects are often created as stand-alone applications that use their own definitions for algorithms and visualisation tools, which makes it difficult to benefit from other people's work. The DAWN Science project allowed a large group of scientific developers and software engineers to collaborate by developing a single, general-purpose API that provides access to, and sharing of, existing algorithms and visualisation tools. This significantly accelerates the development of new analysis tools. We reviewed the DAWN code and provided advice on how to improve the organisation of the software and the sharing of the code.

DAWN (Data Analysis WorkbeNch) is open-source scientific data analysis software for numerical data, built on the Eclipse/RCP platform. It is developed by a collaboration of facilities and universities, some of whom contribute code or development effort and others who use and test the software. The collaborative development is led by Diamond Light Source, which is situated at the Rutherford Appleton Laboratory Campus near Oxford. Diamond is not restricted to a single scientific domain, so the software must cover a wide range of uses, from specialist capability like calibration and data reduction for diffraction equipment, to general capability like peak fitting and an integrated Python development environment including interactive tools such as plotting.