Statistical software in Ecology: Impressions from ISEC 2022
Posted on 4 October 2022
By SSI Fellow Philipp Boersch-Supan.
Understanding the dynamics of ecosystems is a key requirement for the management and conservation of nature. Statistical ecology is the discipline that tries to relate biological questions such as “How many animals are there?” or “Where does a species live?” to quantities that can be observed by field biologists and other stakeholders. In practice, statistical ecologists design and implement statistical models that aim to obtain estimates of abundance or distribution, which can then be used to make inferences about the biological mechanisms that drive the system in question.
Bridging fields
Although ecological statistics tend to feature at all ecology conferences in one way or another, the International Statistical Ecology Conference (ISEC) is the main international gathering of the field. One of the most important aspects of ISEC is that it brings together statisticians and ecologists, thus bridging two fields that are in many places still separated by a gap created by different skill sets and research cultures. The conference series has run every other year since 2008 and now attracts a few hundred delegates. As a researcher who works at the interface between ecology, applied statistics and software development, I have had it as the main conference on my “circuit” for almost a decade now. Presentations at ISEC tend to cover a broad range of statistical methods and their application to ecological problems. Because of this analytical and applied focus, almost every talk and poster draws on research software in one way or another.
The dominant software ecosystem in statistical ecology is that of R, a free, open-source software environment for statistical computing and graphics. Custom statistical models are a common theme at ISEC, and authors often provide their implementations as R scripts or packages so that other researchers can reuse them. While many of the model formulations are custom, estimation of their parameters generally relies on established numerical algorithms. These are rarely implemented from scratch. Instead, typical application packages are built on top of more generic R packages or backends, such as TMB, NIMBLE and Stan, that provide robust and efficient implementations of these algorithms, typically using compiled code. However, the complexity of many ecological models is itself a driver in the development of efficient backends, and many of these projects are either rooted in the ecological modelling community or receive contributions from it. As such, development of the backends themselves also featured at the conference, with several talks on new features in the NIMBLE package, which offers C++ templates for a number of parameter estimation procedures.
Presenting at ISEC
My own work routinely relies on the functionality provided by such backends, and this was also the case for the ongoing work I presented at ISEC. Over the last couple of years I have been developing moultmcmc, an R package that facilitates the estimation of statistical models for feather moult data. Moult, the regular replacement of a bird’s feathers, is an important but understudied process in ornithology. This is because the progress of moult in free-living birds is difficult to observe, and the resulting observational data are difficult to model using standard regression models. I recast an existing implementation of a modelling framework for such data to make use of Bayesian estimation, which made it easier for me to extend these models to better accommodate some features of real-world datasets. Behind an R interface, I rely on statistical models written in the Stan language, which are translated to C++ and compiled when the R package is built, offering reasonably fast model estimation to end users. Presenting this package in Cape Town came with a great bonus: existing software for moult models originated at the University of Cape Town, and the conference gave me the opportunity to meet the developer of the original R package, moult, for this class of models.
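To give a flavour of why moult records do not fit a standard regression, here is a minimal sketch in Python of one simple formulation. It is an illustration only, not the moultmcmc or moult API, and it makes assumptions that this post does not spell out: each bird starts moulting on a normally distributed date, moult lasts a fixed number of days, and each record only notes whether a bird has not started, is in moult, or has finished. All function and parameter names are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def moult_state_probabilities(day, start_mean, start_sd, duration):
    """Return P(not started), P(in moult), P(completed) on a given day.

    Assumption: start dates are normally distributed across birds and
    every bird takes the same number of days to complete its moult.
    """
    started = normal_cdf(day, start_mean, start_sd)
    finished = normal_cdf(day - duration, start_mean, start_sd)
    return 1.0 - started, started - finished, finished


def log_likelihood(observations, start_mean, start_sd, duration):
    """Log-likelihood of (day, state) records.

    state is 0 (not started), 1 (in moult) or 2 (completed).
    """
    total = 0.0
    for day, state in observations:
        probs = moult_state_probabilities(day, start_mean, start_sd, duration)
        total += math.log(max(probs[state], 1e-300))  # guard against log(0)
    return total


# Three hypothetical records: (day of year, observed state).
records = [(100, 0), (200, 1), (300, 2)]
print(log_likelihood(records, start_mean=180, start_sd=10, duration=60))
```

Birds seen before or after moult still carry information about timing, even though they provide no measurement of moult progress, which is why a likelihood of this kind is used rather than a regression of progress on date. In practice the parameters would be estimated by an optimiser or a sampler such as those provided by the backends mentioned above, rather than evaluated by hand.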
The conference also featured software-focussed workshops and tutorials on particular software frameworks, and on topics like defensive programming and deep learning. The latter is increasingly applied to classification tasks in ecology, such as identifying species from sound recordings or imagery.
For more information about the meeting visit the ISEC 2022 website or get in touch with SSI Fellow Philipp Boersch-Supan. The next iteration of ISEC will be hosted at Swansea University (UK) in July 2024.