Super-Charging How We Solve Geological Problems // Mark Woods et al

We (Mark Woods, Dan Condon, Rachel Heaven and Phil Wilby) are a diverse group of geoscientists in BGS with a shared passion for developing novel systems for discovering and linking data and ideas. These systems will allow us to more easily assess the variation in subsurface properties, both observed and predicted, that results from each region’s unique geological history. With expertise in Palaeontology, Geochronology, Biostratigraphy, Geophysics, Informatics and Data Technology, we explore ways in which the science of Stratigraphy can provide a common framework for organising and exploring geological information.

The classification hierarchies used for subdividing geological successions can be repurposed as frameworks for visualising variation in a multitude of types of geological data

Fundamental to many aspects of geological work are questions about the subdivision of geological successions; their age and correlation with successions elsewhere; the physical properties of the rocks that comprise these successions; and the processes and environments that shaped them. In common with many other areas of science, a central and time-consuming part of developing our understanding involves searching for existing data relevant to our research questions, and reading lots of papers to learn how the data has previously been interpreted – and potentially misinterpreted.

With the emergence of technology that can rapidly and inexpensively process and analyse large data volumes, we are now able to imagine a future where the time it takes us to understand a problem or perceive data patterns will be much shorter – and that in turn will shorten the time it takes to derive interpretations. With the exponential rise in the sheer quantity of data and ideas, and downward pressure on research budgets, we cannot afford not to change the way we work.

Traditional methods for collating data and ideas from digital and hardcopy formats are time consuming and may not provide a comprehensive overview of all the information that is available

Data mining promises to unlock the universe of ideas and data contained in geological publications, which is almost completely unrepresented in traditional corporate data systems. This is particularly relevant for BGS with its huge archive of geological memoirs, maps, reports and papers. This is a big deal. It frees us from a lot of the cognitive effort of searching for information and deciding what is relevant, and allows more time to be spent developing and testing new ideas.

Combining data-mining with the science of stratigraphy will allow us to construct complex data landscapes for exploring the physical properties of geological units and the processes that created them

Very soon we will have the ability to create, visualise and interrogate complex data-landscapes where we can see data patterns in the context of existing understanding about the processes and environments that shaped them. This juxtaposition of raw data and ideas will super-charge how we solve geological problems. It will show how good the fit is between our data and the ideas and processes that purport to explain them. We will be able to test those ideas more easily against other categories of data that we would expect to behave in a certain way if our interpretations are correct, and where the fit is poor it will create the space for new ideas and concepts.

In geology, creating this new reality will likely revolve around linking up our datasets and data mining our publications. Creating richer linkages between datasets removes the need for multiple data searches for different categories of information about the same data-point. Work on this aspect is already in progress at BGS, and this alone will be a huge advance in how we do our future science.

A possible framework for organising text-mined geological data

So what might the future look like? A potential scenario is the approach used by Macrostrat, outlined in a lecture at the BGS by Professor Shanan Peters in May 2018. In this approach, relevant information held in databases and text-mined from publications is linked to geological successions for particular regions, each characterised by a particular history of events and processes. This approach does not attempt to definitively model stratigraphy; it simply uses the combination of geographical position and stratigraphical hierarchy as a framework for organising data associated with that region and showing how it relates in time to adjacent regions. Through integration with GIS and 3D visualisation technology, such a system for the UK would allow geologists to see both the stratigraphical and spatial extent of their data – the two most fundamental requirements in understanding and resolving most geological problems. Allied with concepts harvested through text mining, this would create a powerful new tool that will super-charge our science into the future.
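
To make the idea of using region plus stratigraphical hierarchy as an organising framework more concrete, here is a minimal sketch in Python. It is not Macrostrat's data model or any BGS system; the class names (StratUnit, Record), the function records_for, and all unit, region and record values are hypothetical placeholders chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StratUnit:
    """One node in a stratigraphical hierarchy (e.g. group > formation > member)."""
    name: str
    rank: str
    children: list["StratUnit"] = field(default_factory=list)

    def add(self, child: "StratUnit") -> "StratUnit":
        """Nest a unit beneath this one and return it for further nesting."""
        self.children.append(child)
        return child

    def descendants(self):
        """Yield this unit and every unit nested beneath it."""
        yield self
        for child in self.children:
            yield from child.descendants()


@dataclass
class Record:
    """A database entry or text-mined statement tied to a region and a unit."""
    region: str
    unit: str
    source: str  # assumed labels: "database" or "text-mined"
    note: str


def records_for(unit: StratUnit, region: str, records: list[Record]) -> list[Record]:
    """Return records in `region` attached to `unit` or any of its subdivisions."""
    names = {u.name for u in unit.descendants()}
    return [r for r in records if r.region == region and r.unit in names]


# Placeholder hierarchy: one group, two formations, one member (all hypothetical).
group = StratUnit("Group A", "group")
fm1 = group.add(StratUnit("Formation A1", "formation"))
fm2 = group.add(StratUnit("Formation A2", "formation"))
mb = fm1.add(StratUnit("Member A1a", "member"))

# Placeholder records (invented for illustration, not real data).
records = [
    Record("Region 1", "Formation A1", "database", "porosity measurement"),
    Record("Region 1", "Member A1a", "text-mined", "depositional environment statement"),
    Record("Region 2", "Formation A2", "database", "fossil occurrence"),
]

hits = records_for(group, "Region 1", records)
print([(r.unit, r.source) for r in hits])
```

Asking for everything attached to a group in one region returns both the database entry and the text-mined statement from its subdivisions, which is the kind of juxtaposition of data and ideas described above. A real system would also need geological time and spatial geometry, which this sketch leaves out.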

For more information on BGS's datasets, please visit our data website