Abstract
Time series are a critical component of ecological analysis, used to track changes in biotic and abiotic variables. Information can be extracted from the properties of time series for tasks such as classification, clustering, prediction, and anomaly detection. These common tasks in ecological research rely on the notion of (dis)similarity, which can be quantified using distance measures. A plethora of distance measures have been described in the scientific literature, but many of them have not been introduced to ecologists. Furthermore, little is known about how to select appropriate distance measures, or about which time series properties each measure focuses on, for time-series-related tasks.
Here we describe 16 potentially desirable properties of distance measures, test 42 distance measures for each property, and present an objective method to select appropriate distance measures for any task and ecological dataset. We then demonstrate our selection method by applying it to a set of real-world data on breeding bird populations in the UK. We also discuss ways to overcome some of the difficulties involved in using distance measures to compare time series.
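To make the idea of property-based selection concrete, the following is a minimal sketch, assuming the test results are stored as a table of measures against properties. It is not the selection method of this study: the measure names, property names, and pass/fail entries are hypothetical placeholders, and the function `select_measures` is introduced here purely for illustration.

```python
# Hypothetical property table: for each (placeholder) distance measure,
# whether it satisfies each (placeholder) property. These entries are
# invented for illustration and are NOT results from the study.
PROPERTY_TABLE = {
    "measure_A": {"symmetric": True, "noise_insensitive": False, "unequal_lengths": True},
    "measure_B": {"symmetric": True, "noise_insensitive": True, "unequal_lengths": False},
    "measure_C": {"symmetric": True, "noise_insensitive": True, "unequal_lengths": True},
    "measure_D": {"symmetric": False, "noise_insensitive": True, "unequal_lengths": True},
}


def select_measures(table, required):
    """Return the measures that satisfy every property in `required`.

    A property missing from a measure's entry is treated as not satisfied.
    """
    return sorted(
        name
        for name, props in table.items()
        if all(props.get(prop, False) for prop in required)
    )


if __name__ == "__main__":
    # Example: a task on noisy data that also needs a symmetric measure.
    print(select_measures(PROPERTY_TABLE, ["symmetric", "noise_insensitive"]))
```

In this sketch the properties required by a task act as a simple filter; a fuller treatment might also weight properties by their importance to the task.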
Our real-world population trends exhibit a common challenge for time series comparison: a high level of stochasticity. We demonstrate two different ways of overcoming this challenge, first by selecting distance measures with properties that make them well-suited to comparing noisy time series, and second by applying a smoothing algorithm before selecting appropriate distance measures. In both cases, the distance measures chosen through our selection method are not only fit for purpose but also consistent in their rankings of the population trends.
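The second approach, smoothing before comparison, can be illustrated with a short sketch. It assumes a centred moving average as the smoothing algorithm and Euclidean distance as the distance measure; both are stand-ins chosen for simplicity and are not necessarily those used in this study. The function names and the example values are likewise hypothetical.

```python
import math


def moving_average(series, window=3):
    """Centred moving average; the window is truncated at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def euclidean(a, b):
    """Euclidean distance between two equal-length time series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(reference, candidates, distance=euclidean, smooth=None):
    """Rank candidate series from most to least similar to the reference.

    If `smooth` is given, it is applied to every series before comparison.
    """
    if smooth is not None:
        reference = smooth(reference)
        candidates = {name: smooth(s) for name, s in candidates.items()}
    return sorted(candidates, key=lambda name: distance(reference, candidates[name]))


if __name__ == "__main__":
    # Made-up index values for illustration only (not real population data).
    reference = [10, 12, 11, 13, 12, 14]
    candidates = {
        "trend_x": [10, 11, 12, 12, 13, 14],
        "trend_y": [14, 13, 13, 11, 12, 10],
    }
    print(rank_by_distance(reference, candidates))
    print(rank_by_distance(reference, candidates, smooth=moving_average))
```

Comparing the rankings obtained with and without smoothing gives a simple check of whether the two approaches agree for a given dataset.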
The results of our study should improve understanding of distance measures for comparing time series, broaden the scope for their use within ecology, and enable new ecological questions to be answered.
Competing Interest Statement
The authors have declared no competing interest.