Friday, 25 April 2014

Communicating Uncertainty by Murray Lark

Today Murray Lark, BGS Environmental Statistician, talks about the science of turning complex statistics and number-crunched data into usable and helpful information. (Oh, and if you're attending the European Geosciences Union congress next week, make sure you catch our sessions on communicating uncertainty: info at the end.) So, over to Murray:

Not only a town in Texas, USA
As earth scientists we are used to the idea that our data, and the inferences that we make from them, are uncertain because of natural variability and the complexity of the processes that we study.  Statistics provides us with well-honed methods to express much of that uncertainty mathematically, and to examine its consequences.  For example, rather than stating that the concentration of a potentially harmful element in the soil at some location definitely exceeds a regulatory threshold we may estimate the probability that this is the case.  However, we are discovering that a further challenge exists.  How can we effectively communicate this uncertain information to the manager or policy maker who has to make decisions?  Often the state-of-the-art statistical outputs that we can generate are far from clear to the people who most need to understand them.  We also want these users, or potential users, to understand that uncertain information is still useful.
One organization that has tackled this problem head-on is the Intergovernmental Panel on Climate Change (IPCC).  They require scientists to use a verbal scale to provide information about the uncertainty of predictions or estimates which they make.  On this scale one outcome may be “virtually certain”, another may be “about as likely as not” and another may be “unlikely”.  This scale has been studied by psychologists who have made recommendations about how it could be made more effective and consistent.
In work with colleagues from the Geological Survey of Ireland we have recently completed an analysis of some of the Tellus Border geochemical data from the border counties of Ireland.  We analysed data on soil cobalt and manganese content to show where grazing sheep may be at risk from cobalt deficiency.  Sheep need enough cobalt in the grass that they eat to ensure that the microbes in their digestive system can make enough vitamin B12 to keep them healthy.  The supply of cobalt to the sheep is partly dependent on the cobalt content of the pasture soil, and on the manganese content because manganese oxide can bind soil cobalt and prevent plants from taking it up.  We were able to compute local probabilities that there may be a deficiency due to soil conditions, but how can this be communicated effectively?
We used the IPCC scale to define the legend for maps based on our statistical output. The map above shows how likely it is that a local soil analysis would raise concerns about cobalt deficiency. The colour scale indicates whether this is "exceptionally unlikely", "virtually certain" or somewhere in between. This uncertainty is partly due to the variability of soil, which means we cannot be certain about the local concentration of cobalt and manganese. It is also partly due to local conditions since, other things being equal, we will be most uncertain in regions where the cobalt concentration is in the transition range between that for deficient and non-deficient soils.
Our map makes use of research about the IPCC scale. For example, while the verbal scale and colours are the main tools for communication, the numbers are shown too, which research suggests helps different users to interpret the scale consistently.
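For readers who like to see the idea in code, here is a minimal sketch of how a computed probability might be turned into a legend entry that carries both the verbal term and the numbers. The terms come from the IPCC likelihood scale, but the non-overlapping bin edges, the label wording and the function name `legend_label` are assumptions made for illustration (the published IPCC ranges overlap, so that "likely" covers 66-100%); this is not necessarily the classification used for our map.

```python
from bisect import bisect_right

# Assumed upper bin edges (as probabilities), adapted from the IPCC
# likelihood scale so that the classes do not overlap on a map legend.
UPPER_BOUNDS = [0.01, 0.10, 0.33, 0.66, 0.90, 0.99]

# One (verbal term, numeric range) pair per bin, lowest probability first.
LABELS = [
    ("exceptionally unlikely", "0-1%"),
    ("very unlikely", "1-10%"),
    ("unlikely", "10-33%"),
    ("about as likely as not", "33-66%"),
    ("likely", "66-90%"),
    ("very likely", "90-99%"),
    ("virtually certain", "99-100%"),
]


def legend_label(p):
    """Return a legend entry such as 'likely (66-90%)' for probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    # bisect_right counts how many bin edges are <= p, which is the bin index.
    term, numeric_range = LABELS[bisect_right(UPPER_BOUNDS, p)]
    return f"{term} ({numeric_range})"


if __name__ == "__main__":
    # Hypothetical probabilities that a local soil analysis raises concern.
    for p in (0.005, 0.2, 0.5, 0.8, 0.995):
        print(f"{p:.3f} -> {legend_label(p)}")
```

Keeping the numeric range inside the label is a deliberate choice here: the verbal term stays the headline, and the range is there for anyone who wants to check what the term means.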
An open access paper which describes this work in more detail can be read at
Readers who will be at the European Geosciences Union congress this year may be interested to attend a session about the communication of uncertain information in earth sciences organized by BGS staff and colleagues.  Our speakers will cover a range of topics including recent psychological research on how uncertainty is perceived and the implications for communication, real-world trials of alternative methods to express uncertain model outcomes, some new statistical ideas and a range of case studies.
Oral session: Thu, 01 May, 13:30–15:15 / Room B5
Posters: Thu, 01 May, 17:30–19:00 / Blue Posters B187 et seq.
And don't miss a Poster Discussion session where poster authors will present briefly on a key point from their poster: Fri, 02 May, 10:30–11:15 / Room B7

