Sensitive numbers

22 03 2016
[Image: A sensitive parameter. Source: toondoo.com]

You couldn’t really do ecology if you didn’t know how to construct even the most basic mathematical model; even a simple regression is a model (the non-random relationship of one variable to another). The good thing about these simple models is that it is fairly straightforward to interpret the ‘strength’ of the relationship, in other words, how much variation in one thing can be explained by variation in another. Provided the relationship is real (not random), and provided there is at least some indirect causation implied (i.e., it is not just a spurious coincidence), then there are many simple statistics that quantify this strength. In the case of our simple regression, the coefficient of determination (R²) is usually a good approximation of this.
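To make the R² idea concrete, here is a minimal Python sketch that fits an ordinary least-squares line and reports the proportion of variation in one variable explained by the other. The function name `r_squared` and the rainfall/biomass numbers are invented purely for illustration; they do not come from any real dataset.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the fitted line
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation around the mean of y
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_residual / ss_total


# Hypothetical data: rainfall (mm) and plant biomass (g) at six sites
rainfall = [100, 150, 200, 250, 300, 350]
biomass = [12, 18, 21, 30, 33, 41]
print(round(r_squared(rainfall, biomass), 3))
```

A value close to 1 says that most of the variation in the response is accounted for by the predictor; it says nothing about whether the relationship is causal.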

For more complex multivariate correlation models, the coefficient of determination is sometimes insufficient, in which case you might need to rely on statistics such as the proportion of deviance explained, or the marginal and/or conditional variance explained.

When you go beyond this correlative model approach and start constructing more mechanistic models that emulate ecological phenomena from the bottom-up, things get a little more complicated when it comes to quantifying the strength of relationships. Perhaps the most well-known category of such mechanistic models is the humble population viability analysis, abbreviated to PVA§.

Let’s take the simple case of a four-parameter population model we could use to project population size over the next 10 years for an endangered species that we’re introducing to a new habitat. We’ll assume that we have the following information: the size of the founding (introduced) population (n), the juvenile survival rate (Sj, the proportion of juveniles surviving from birth to the first year), the adult survival rate (Sa, the annual survival rate of adults from year 1 to maximum longevity), and the fertility rate of mature females (m, the number of offspring born per female per reproductive cycle). Each one of these parameters has an associated uncertainty (ε) that combines both measurement error and environmental variation.

If we just took the mean value of each of these three demographic rates (survivals and fertility) and projected a founding population of n = 10 individuals for 10 years into the future, we would have a single, deterministic estimate of the average outcome of introducing 10 individuals. As we already know, however, the variability, or stochasticity, is more important than the average outcome, because uncertainty in the parameter values (ε) will mean that a non-negligible number of model iterations will result in the extinction of the introduced population. This is something that most conservationists will obviously want to minimise.
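A deterministic projection of this kind can be sketched in a few lines of Python. This is only one of many ways to structure such a model, and it rests on assumptions not stated in the text: a 1:1 sex ratio, breeding from age 1, one reproductive cycle per year, and a count taken just before breeding. The parameter values are invented.

```python
def project_deterministic(n0, sj, sa, m, years=10, female_fraction=0.5):
    """Project mean population size year by year, with no stochasticity."""
    sizes = [float(n0)]
    for _year in range(years):
        n = sizes[-1]
        # Offspring born to females that survive their first year
        recruits = n * female_fraction * m * sj
        # Existing individuals that survive the year
        survivors = n * sa
        sizes.append(survivors + recruits)
    return sizes


# Hypothetical mean rates: Sj = 0.35, Sa = 0.70, m = 2 offspring per female
trajectory = project_deterministic(n0=10, sj=0.35, sa=0.70, m=2.0)
print([round(size, 1) for size in trajectory])
```

Under these assumptions the population multiplies by Sa + 0.5 × m × Sj each year, so the whole trajectory is fixed once the means are chosen; nothing here can ever tell you about extinction risk.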

So each time we run an iteration of the model, and generally for each breeding interval (most often 1 year at a time), we choose (based on some random-sampling regime) a different value for each parameter. This will give us a distribution of outcomes after the 10-year projection. Let’s say we did 1000 iterations like this; taking the proportion of these iterations in which the population went extinct would provide us with an estimate of the population’s extinction probability over that interval. Of course, we would probably also vary the size of the founding population (say, between 10 and 100) to see at what point the extinction probability became acceptably low for managers (i.e., as close to zero as possible), without the founding population being so large that it would be too laborious or expensive to introduce that many individuals.
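The stochastic version might look something like the sketch below. The random-sampling regime is an assumption on my part: each year the rates are drawn from normal distributions (means and standard deviations invented) and clipped to sensible bounds, and individual births and deaths are resolved with binomial draws. ‘Extinct’ here simply means the count reaches zero within the 10 years.

```python
import random


def clip(value, low, high):
    return max(low, min(high, value))


def binomial(n, p, rng):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))


def simulate_once(n0, rng, years=10):
    """Run one 10-year projection; return True if the population goes extinct."""
    n = n0
    for _year in range(years):
        # Draw this year's rates (hypothetical means and uncertainties)
        sj = clip(rng.gauss(0.35, 0.10), 0.0, 1.0)
        sa = clip(rng.gauss(0.70, 0.10), 0.0, 1.0)
        m = max(0.0, rng.gauss(2.0, 0.5))
        females = binomial(n, 0.5, rng)  # assumed 1:1 sex ratio
        births = round(females * m)
        n = binomial(n, sa, rng) + binomial(births, sj, rng)
        if n == 0:
            return True
    return False


def extinction_probability(n0, iterations=1000, seed=1):
    rng = random.Random(seed)
    extinct = sum(simulate_once(n0, rng) for _ in range(iterations))
    return extinct / iterations


# Compare a few founding population sizes between 10 and 100
results = {n0: extinction_probability(n0) for n0 in (10, 25, 50, 100)}
for n0, probability in results.items():
    print(f"founders = {n0:3d}  P(extinction) = {probability:.3f}")
```

Plotting these probabilities against the number of founders is one way to find the point where adding more individuals stops buying much extra security.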

[Image: The quality of the princess’s sleep is most sensitive to variation in the size of the pea.]

So far so good: the outcome (probability of extinction) is a useful guide to maximise the probability of introduction success. But what if we want to determine how sensitive this probability of extinction is to change in the model’s parameters? For example, even though we can most easily vary the size of the founding population, we might also be able to influence survival probability by, say, controlling predators at the introduction site. Or, we could try supplementary feeding to increase the number of offspring that the average female produces per breeding cycle. The question now is whether spending the time, money and effort to influence one parameter has a greater effect on the outcome (extinction probability) than influencing another. In our case, we can ask whether supplementary feeding is more important than predator control, or whether these interventions are negligible compared to introducing more individuals in the first place.

And so sensitivity analysis was born to solve just this sort of problem.

If you have never before done a sensitivity analysis, I wager you can imagine how it might proceed. The simplest way is known as a single-parameter perturbation analysis, where we do just that: vary the value of one parameter while keeping those of all the others fixed, and then relate (correlate) the variation in that parameter to variation in the outcome (extinction risk).
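A single-parameter perturbation can be sketched as follows. To keep it short, the model here holds the rates fixed within a run and includes only the chance births and deaths of individuals (binomial draws); the baseline values, the 1:1 sex ratio, and the helper names are all assumptions made for illustration.

```python
import random


def binomial(n, p, rng):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))


def extinction_probability(n0, sj, sa, m, rng, years=10, iterations=500):
    extinct = 0
    for _iteration in range(iterations):
        n = n0
        for _year in range(years):
            births = round(binomial(n, 0.5, rng) * m)
            n = binomial(n, sa, rng) + binomial(births, sj, rng)
            if n == 0:
                extinct += 1
                break
    return extinct / iterations


# Hypothetical baseline values; only one of them is changed at a time
baseline = {"n0": 10, "sj": 0.35, "sa": 0.70, "m": 2.0}


def perturb(name, values, seed=1):
    """Vary one parameter over `values`, holding all the others at baseline."""
    rng = random.Random(seed)
    curve = []
    for value in values:
        params = dict(baseline, **{name: value})
        curve.append((value, extinction_probability(rng=rng, **params)))
    return curve


def slope(curve):
    """Least-squares slope of extinction probability on the parameter value."""
    xs = [x for x, _ in curve]
    ys = [y for _, y in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in curve)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


sa_curve = perturb("sa", [0.5, 0.6, 0.7, 0.8, 0.9])
print(sa_curve)
print("change in P(extinction) per unit change in Sa:", round(slope(sa_curve), 2))
```

Repeating `perturb` for each parameter in turn gives one curve per parameter, but every curve is conditional on the baseline chosen for the others.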

This might sound reasonable, but the problem is that the complex universe represented by our admittedly simplistic model is rendered even less realistic by this approach. There are probably few cases where only one parameter varies while all the others keep more or less the same (fixed) values. In reality, parameters often co-vary in complex, sometimes non-linear ways, so that you get a misleading estimate of the relationship of the variation in the outcome to variation in just one parameter. Thus, global sensitivity analyses were created.

Put simply, a global sensitivity analysis varies all (or at least, the main) parameters in a model simultaneously according to the linkages defined explicitly between them in the model. This gets around the covariation and non-linearity issues. The variation in the output over all iterations can then be related to the iteration-specific values of each parameter within a multivariate correlation model (we call this step emulation, and the model used to emulate, the emulator).
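Here is a rough Python sketch of that workflow: sample every parameter at once, run the model once per sample, then fit an emulator to the results. Several things are assumptions for the sake of a short example: the parameter ranges are invented, the parameters are sampled independently (so no linkages between them are modelled), a hypothetical population ceiling keeps runs short, and the emulator is a plain linear regression on standardised inputs, which is far cruder than the kind of model you would use in a real analysis.

```python
import random

# Hypothetical plausible ranges for each parameter (invented for illustration)
RANGES = {"n0": (10, 100), "sj": (0.1, 0.6), "sa": (0.4, 0.9), "m": (1.0, 3.0)}


def binomial(n, p, rng):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))


def went_extinct(n0, sj, sa, m, rng, years=10, ceiling=500):
    """Run ONE stochastic projection; return 1.0 if the population hits zero."""
    n = n0
    for _year in range(years):
        births = round(binomial(n, 0.5, rng) * m)
        n = binomial(n, sa, rng) + binomial(births, sj, rng)
        if n == 0:
            return 1.0
        n = min(n, ceiling)  # assumed cap; also keeps the run time short
    return 0.0


def standardise(values):
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def least_squares(columns, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * v for p, v in zip(columns[i], y))] for i in range(k)]
    for i in range(k):
        pivot_row = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot_row] = a[pivot_row], a[i]
        pivot = a[i][i]
        a[i] = [v / pivot for v in a[i]]
        for r in range(k):
            if r != i:
                factor = a[r][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    return [row[-1] for row in a]


rng = random.Random(42)
names = list(RANGES)

# Step 1: vary all parameters simultaneously (independent uniform draws)
samples = []
for _ in range(500):
    draw = {name: rng.uniform(*RANGES[name]) for name in names}
    draw["n0"] = round(draw["n0"])  # founders must be a whole number
    samples.append(draw)

# Step 2: one model run per parameter set
outcomes = [went_extinct(rng=rng, **draw) for draw in samples]

# Step 3: emulate the outcome as a function of the standardised parameters
columns = [standardise([draw[name] for draw in samples]) for name in names]
mean_outcome = sum(outcomes) / len(outcomes)
coefficients = least_squares(columns, [y - mean_outcome for y in outcomes])
sensitivity = dict(zip(names, coefficients))

for name, value in sorted(sensitivity.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {value:+.3f}")
```

Because the inputs are standardised, the absolute size of each coefficient gives a rough ranking of how strongly extinction responds to each parameter across the whole sampled space, not just around one baseline.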

The cleverest among you will now be thinking: “Hang on a minute, what if your model has many more parameters than four? Won’t there be a stupidly large number of parameter values and outcomes to test?” You’re correct: if your model had, say, 20 parameters or more, you would have an exponentially increasing and intractably large number of possible combinations of parameter values to test.
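A quick back-of-the-envelope calculation shows how fast this blows up. The numbers are assumptions chosen only to illustrate the scaling: a full-factorial design with 10 candidate values per parameter and 1000 iterations per combination.

```python
def total_model_runs(n_parameters, levels_per_parameter=10,
                     iterations_per_combination=1000):
    """Model runs needed to test every combination of parameter values."""
    combinations = levels_per_parameter ** n_parameters
    return combinations * iterations_per_combination


for n_parameters in (4, 10, 20):
    runs = total_model_runs(n_parameters)
    print(f"{n_parameters:2d} parameters: {runs:.1e} model runs")
```

Each extra parameter multiplies the workload by the number of levels, which is why testing every combination stops being an option well before 20 parameters.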

The real question then is how to trade off the number of iterations per set of parameter combinations with an adequate sampling of the parameter space (i.e., the full range of plausible values for each parameter).

Because we have had to deal with this problem many times without an obvious solution, our even cleverer postdoctoral fellow, Thomas Prowse, has just published a paper in Ecosphere where we show that the most important thing to do is an adequate sampling of the parameter space rather than iterating each combination many times. In fact, you can usually get away with a single iteration per parameter value set!

This counter-intuitive result means that with a sufficient parameter-space sampler (such as a Latin hypercube algorithm), you can really streamline your global sensitivity analysis and find which parameters most influence your model predictions. We also provide some R code to help you along with your own analyses. With this nice validation of the approach, running efficient sensitivity analyses for ecological models has become a lot simpler.
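For readers who have not met a Latin hypercube before, here is a minimal Python sketch of the idea (this is not the R code from the paper, and the parameter ranges are invented). Each parameter’s range is cut into as many equal-width strata as there are samples, one value is drawn inside every stratum, and the strata are shuffled independently for each parameter so that the whole range of every parameter is covered.

```python
import random


def latin_hypercube(n_samples, ranges, rng):
    """Return n_samples parameter sets; `ranges` maps name -> (low, high)."""
    columns = {}
    for name, (low, high) in ranges.items():
        # One uniform draw inside each of n_samples equal-width strata on [0, 1)
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across parameters
        rng.shuffle(points)
        columns[name] = [low + u * (high - low) for u in points]
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Hypothetical ranges for the four-parameter model
ranges = {"n0": (10, 100), "sj": (0.1, 0.6), "sa": (0.4, 0.9), "m": (1.0, 3.0)}
design = latin_hypercube(5, ranges, random.Random(7))
for parameter_set in design:
    # n0 would need rounding to a whole number before being fed to a model
    print({name: round(value, 2) for name, value in parameter_set.items()})
```

Each row of `design` is one parameter set, so with a single model run per row the total number of runs equals the number of samples, however many parameters the model has.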

§The popularity of PVA in conservation biology justifies my use of this initialism here.

CJA Bradshaw


4 responses

16 03 2019
Modelling for Empirical Scientists Introduction: Our ‘functional’ world – The Bonser Lab

[…] “You couldn’t really do ecology if you didn’t know how to construct even the most basic mathematical model — even a simple regression is a model (the non-random relationship of some variable to another).” — CJA Bradshaw […]

25 05 2016
Graeme

While I agree with the first sentence, it doesn’t seem everybody else does. Occasionally I come across papers which use AIC as a measure of goodness of fit, i.e., absolute model fit instead of relative. And sometimes these are published in Austral Ecology. So what you say? Write a critique and discuss it. But AE doesn’t accept critiques and so the error lives on. What does this say about the state of biological science in this country? Perhaps Australian biologists are too sensitive.

22 03 2016
Messi

Great article, and one I’ve emailed to a few friends working in conservation organisations in the UK. I worked for a while for a large UK bird conservation organisation, which is part of a global network of bird conservation organisations. The UK organisation employs Conservation Officers around the UK, their major role being to deal with casework – i.e. development projects that might adversely affect sites and species of conservation concern. They spend an awful lot of time dealing with offshore and onshore wind farm developments, which the organisation is in principle in favour of, provided it can be shown that a particular wind farm isn’t going to damage biodiversity. In terms of birds, a key issue is collision risk, and whether collision rates might cause populations of protected species to decline or fail to recover. To work this out requires good quality data on bird movements etc, and pretty sophisticated population modelling. This data collection and modelling is done by ecological consultants working for the developer, and the Conservation Officers need to be able to critique their methods and advise on how to improve things (get better data, build models that produce meaningful results and give an acceptable level of confidence). And that requires a good grounding in mathematical population ecology. I know this because I did just this job for eight years and had no such training or skill! I left the organisation and did an MSc in Applied Ecology at the University of East Anglia, and was astonished that the population modelling unit was optional, not mandatory. Few of my fellow students chose that module, and most were as deficient in basic maths as I am! 
To my mind, being someone who’s terrible at basic maths, I feel that all those involved in defending sites and species of conservation concern from damaging development, or in charge of species recovery projects, should have a firm mathematics grounding, else they just can’t ask the right questions and have confidence in their advice. If universities are failing to send out students with mathematical skills, then the conservation organisations that employ them ought to offer basic training in mathematical ecology.

22 03 2016
CJAB

Well said
