Yet of the huge amount of geological data that BGS holds from boreholes and from surface and subsurface sampling, we actually acquire a relatively small proportion ourselves; the rest is deposited with BGS for statutory reasons. A new geological model might therefore incorporate data collected by any (or even all) of the water, energy, minerals and construction industries, each delivered in its own formats and to its own standards.
For the last 25 years, BGS has held data in normalised, structured databases. Normalisation means holding every element of your records separately and once only, with keys to link these fields together. This is highly efficient: whenever you move house, your credit card company only needs to change one address field and not every transaction record. The disadvantages of such an approach can be slow response times and data structures that are complex for scientists to interrogate. Creating a 3D model might involve interrogating 17 major databases, with 50 datatypes containing millions of records. Until now, each dataset had to be searched separately using different tools and then laboriously reformatted, so making a “first-look” 3D model could involve several days’ work before any interpretation could occur.
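To make the trade-off concrete, here is a minimal sketch of a normalised layout using Python's built-in sqlite3 module. The table and column names (borehole, measurement, depth_m and so on) are invented for illustration and are not the BGS schema. It shows that a location stored once can be corrected with a single update, but that a join is needed to rebuild complete records.

```python
import sqlite3

# Hypothetical schema for illustration only; not the BGS database design.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE borehole (
        borehole_id INTEGER PRIMARY KEY,
        name        TEXT,
        easting     REAL,
        northing    REAL
    );
    CREATE TABLE measurement (
        measurement_id INTEGER PRIMARY KEY,
        borehole_id    INTEGER REFERENCES borehole(borehole_id),
        depth_m        REAL,
        datatype       TEXT,
        value          REAL
    );
""")
conn.execute("INSERT INTO borehole VALUES (1, 'BH-A', 450000.0, 300000.0)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 10.0, "density", 2.31), (2, 1, 20.0, "density", 2.45)],
)

# The location is stored once, so correcting it is a single-row update.
conn.execute("UPDATE borehole SET easting = 450010.0 WHERE borehole_id = 1")


def measurements_with_location(connection):
    """Join the two tables to rebuild full records for a modeller."""
    return connection.execute("""
        SELECT b.name, b.easting, b.northing, m.depth_m, m.datatype, m.value
        FROM measurement AS m
        JOIN borehole AS b ON b.borehole_id = m.borehole_id
        ORDER BY m.depth_m
    """).fetchall()


rows = measurements_with_location(conn)
```

With only two tables the join is trivial; the cost described above comes from repeating this kind of reassembly across many databases and datatypes.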
To solve this we’ve adapted another idea from the finance and insurance industry and built PropBase, the world’s first true geo-data warehouse. A data warehouse takes a copy of the original data (thereby protecting the integrity of the originals), reassembles the copied data in a standardised structure, and outputs them in common formats. Because all data used in 3D models have a broadly common structure (a location in 3D, then the datatype, its value and any qualifiers), they can be imported in a common way. PropBase therefore delivers each record in a common set of output formats (e.g. a GIS shapefile, a CSV for importing into modelling packages, or web services for machine-to-machine interrogation), and users can toggle between them by simply flicking a “switch”. The key advantages are massively improved response times to queries and standardised outputs that modelling software can import much more quickly. These same ideas are now being used as templates for more complex datatypes, such as real-time streaming of sensor outputs.
PropBase Explorer tool showing a spatial search for physical property data within the area of interest.
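The common record structure and the format “switch” described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not PropBase's actual implementation: the PropertyRecord class, its field names and the export function are all hypothetical, and JSON stands in here for a web service payload.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict


@dataclass
class PropertyRecord:
    # Field names are hypothetical; they mirror the common structure
    # described above: a 3D location, a datatype, a value and a qualifier.
    x: float
    y: float
    z: float
    datatype: str
    value: float
    qualifier: str = ""


def export(records, fmt="csv"):
    """Serialise the same records in the requested output format."""
    rows = [asdict(r) for r in records]
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=list(PropertyRecord.__dataclass_fields__),
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    if fmt == "json":
        return json.dumps(rows)
    raise ValueError(f"unsupported format: {fmt}")


records = [
    PropertyRecord(450000.0, 300000.0, -10.0, "density", 2.31, "g/cm3"),
    PropertyRecord(450000.0, 300000.0, -20.0, "density", 2.45, "g/cm3"),
]
csv_text = export(records, fmt="csv")
json_text = export(records, fmt="json")
```

Because every datatype shares one record shape, adding a further output format means adding one branch to the exporter instead of writing a new converter for each source database.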
Our new publication in Computers & Geosciences defines this new data structure and presents a model for how scientists can effectively access and serve multiple complex, spatially enabled data structures. If you find it useful, please cite it. (Please note that it is behind a paywall; researchers who cannot access it should contact me.)