To R, or not to R? At the Royal Statistical Society
The Statistical Computing Section of the Royal Statistical Society in London recently organised a one-day meeting to discuss the strengths and limitations of the statistical computing and graphics software, R. The Software Sustainability Institute's Rob Baxter was invited to talk about the Institute's work.
The meeting was a great opportunity to demonstrate the potential of R in diverse areas, but also to recognise its limitations. There are many advantages to using open-source software like R. The first and most obvious is that anyone can use it without purchasing a license from a proprietary vendor. A further strength is that R is not just a data-analysis package, but also a programming language in which you can create your own functions or packages. It is highly extensible and, because it is open source, anyone can contribute extensions. Good introductions to R are the R wiki and this blog post.
R's weaknesses begin to show when working with vast amounts of data, such as those arising from next-generation sequencing studies. A potential limitation when developing your own R package is its dependence on other packages: maintainers of R packages must ensure that changes in their package do not create errors in the packages that depend on it. Looking at it from the other direction, when developing my own R packages I personally try to keep their dependencies to a minimum so that changes or errors in other packages do not affect me. I try to depend only on the most stable and established packages, so that I can be confident they have been tested thoroughly and are reliable.
Invited speakers from academia, industry and education shared examples of successful use of R in their work, and a few of them also pointed out some of R's weaknesses. The invited presenters included R core contributor and Junior Professor at TU Dortmund University (Germany), Uwe Ligges, who gave an overview of R's conception, development, and future prospects. Wayne Jones, from Shell UK, explained how a multinational like Shell uses R to deliver statistical solutions to clients. Professor Andy Field, from the University of Sussex, shared his experience of teaching Statistics using SPSS and R. Andy has found that students on non-mathematical or non-computational degrees can find working in R intimidating because it is driven from the command line (although various GUIs exist). Peter Nash and Ernest Turro presented the work they carry out in R as post-docs at Imperial College London and the University of Cambridge, respectively. One of the authors of the recently published book R for Dummies, Andrie de Vries, introduced his book and gave an overview of applications of R in market research. Rob Baxter, from the Software Sustainability Institute and the University of Edinburgh, introduced the work that the Software Sustainability Institute carries out within the scientific community in the UK, and described some of the top techniques that a software developer can adopt to produce good-quality software. Brandon Whitcher, from Mango Solutions, closed the meeting by presenting examples of medical image analysis in R.
For future events organised by the Royal Statistical Society, please see the RSS events page.
Posted by Simon Hettrick on Friday 27 July 2012.