Wednesday, 21 September 2016

BGS Hackathon: Tracking change in global volcanic activity... by Katy Mee

The BGS Volcanology team receive regular enquiries from government and media about volcanic activity around the world, particularly if it’s likely to have an impact on UK citizens abroad or there’s potential for humanitarian needs. When we get an enquiry, we normally first consult the appropriate volcano monitoring institution (volcano observatory) for status reports and the regional Volcanic Ash Advisory Centre (VAAC), which produces notifications for aviation about volcanic ash in the atmosphere. Volcanic eruptions may last for weeks or longer and changes in eruption style, intensity and scale can occur over minutes to hours, so it can be challenging to keep track of what is happening and where new activity is occurring… this is how our idea for the BGS Hackathon was born…

We pitched a challenge to take the advisories issued by the VAACs and see whether our hackers could use their programming skills to (1) automatically extract the relevant information, (2) populate these data into a database and (3) visualise the data on a map or graph.  Ideally, all of these things would be updated automatically as new ash advisories were released so that we could track changes in activity and look for new volcanoes erupting, in near real-time.

About volcanic ash advisories

There are 9 Volcanic Ash Advisory Centres (VAACs), each with a defined area to monitor, that together cover most of the globe. The VAACs use a range of information – from volcano observatories, satellite and ground-based remote sensing, pilot reports and aircraft observations, and weather forecast and dispersion models – to identify and monitor the movement of ash in the atmosphere. They then issue advisories and guidance products for the aviation sector, telling them where the ash cloud is, how high it is and in which direction it is moving, so that pilots can avoid it.

The nine Volcanic Ash Advisory Centres (VAACs) and their areas of responsibility

The advisories are in a standard format, meaning that they all provide the same categories of information, e.g. the volcano name and ID number, summit height, information sources, height of the observed ash cloud, forecast movement of the ash cloud, date that the next advisory is expected etc. If we could extract information detailing which volcanoes had ash advisories issued for them, and when and how often they were being issued, we could produce a graph tracking this change, which could help to:
  • Look for new eruptions at volcanoes
  • Look for the end of an eruptive episode
  • Track the number and frequency of advisories issued in near real-time
  • Look for patterns in eruptive behaviour over the longer term.
Typical structure of a volcanic ash advisory

Team VAAC   

After pitching our challenge at the Hackathon, there was a slightly awkward wait whilst hackers decided which challenge they wanted to accept, but luckily for us, we enticed 5 very enthusiastic hackers. Team VAAC consisted of Charlie Kirkwood (environmental geochemist), Ailsa Napier (web developer), James Passmore (GIS and web specialist), Peter Stevenson (geomagnetic data analyst) and Carl Watson (geoinformation business analyst). Between them, they had a wide range of skills that were ideal for our hack challenge: computer programming, databases, geographical information systems (GIS), mapping and visualisation, building websites and… making pizza – what more could you want?
Head scratching and concentration from Team VAAC

Extracting information from the ash advisories

Task one was to start scraping the relevant data from the ash advisories, which are published as HTML. Three of our hackers – James, Peter and Charlie – started working on this problem, each using the programming language they were most familiar with: ColdFusion, Python and R, respectively. Their task was to write a script that could extract the following information from the advisories and output it in a format that could be easily ingested into a database:
  • Date of the advisory
  • VAAC name
  • Volcano name and ID number (VNUM)
  • Geographic area
  • Date and time of next advisory
ColdFusion was quickly scrapped because we were getting quicker results from the other two methods, so we decided to focus our resources on those two options. The R script was struggling to process the HTML code, so Charlie decided to work on an archive of advisories in text format from one of the VAACs. This not only gave us a different format to test the R script on, but also gave us a back catalogue of data to help us analyse any patterns in activity over a much longer time period. By the end of the day, both Peter and Charlie had successfully written scripts, in Python and R, to extract the data and export these as either .CSV or .XLS files.
 Scraping information from the ash advisories using Python (left) and R (right) programming languages
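For readers who want a feel for what this kind of scraping involves, here is a minimal Python sketch. It is not the script Peter or Charlie wrote. It assumes the advisory has already been reduced to plain text and that each field sits on its own "LABEL: value" line; the sample advisory, the label names in FIELD_LABELS and the function names are all made up for illustration, and real advisories will need more careful handling.

```python
import csv
import io
import re

# Made-up advisory text for illustration only. The labels follow the usual
# "LABEL: value" layout of an ash advisory, but real messages vary by VAAC.
SAMPLE_ADVISORY = """\
VA ADVISORY
DTG: 20160919/0630Z
VAAC: DARWIN
VOLCANO: SINABUNG 261080
AREA: INDONESIA
SUMMIT ELEV: 2460M
NXT ADVISORY: NO LATER THAN 20160919/1230Z
"""

# Output column -> label that introduces the value in the text (assumed names).
FIELD_LABELS = {
    "advisory_date": "DTG",
    "vaac": "VAAC",
    "volcano": "VOLCANO",
    "area": "AREA",
    "next_advisory": "NXT ADVISORY",
}

COLUMNS = ["advisory_date", "vaac", "volcano", "vnum", "area", "next_advisory"]


def parse_advisory(text):
    """Return a dict holding the fields of interest from one advisory."""
    record = {}
    for column, label in FIELD_LABELS.items():
        # Match "LABEL: value" at the start of a line and capture the rest of it.
        pattern = rf"^{re.escape(label)}:[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        record[column] = match.group(1).strip() if match else None
    # Split "NAME 123456" into the volcano name and its ID number (VNUM).
    name_match = re.match(r"(.+?)\s+(\d+)$", record["volcano"] or "")
    if name_match:
        record["volcano"], record["vnum"] = name_match.groups()
    else:
        record["vnum"] = None
    return record


def to_csv(records):
    """Serialise parsed advisories as CSV text, ready to load into a database."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv([parse_advisory(SAMPLE_ADVISORY)]))
```

Keeping the label-to-column mapping in one dictionary means a new field can be added without touching the parsing logic, which helps when different VAACs format their advisories slightly differently.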

Data via RSS feeds

As well as checking the VAACs, we regularly check over 30 volcano observatory websites for activity updates on dozens of active volcanoes. Although we receive reports and updates directly from many observatories, we’re looking for rapid and global updates. Extracting this information into one database would be programmatically very challenging, so Ailsa looked for RSS – or ‘Rich Site Summary’ – feeds, which use a standard web feed format to publish frequently updated information, such as volcanic activity reports. Subscribing to an RSS feed means that web updates are sent directly to you, so you don’t have to constantly check the website for updated information. Unfortunately, none of the VAACs use RSS feeds – which would have saved us a lot of time – but several of the observatories do. Ailsa was able to use RSS feeds from the USGS (United States Geological Survey) and KVERT (Kamchatka Volcanic Eruption Response Team) to compile relevant information from both websites onto a single webpage. This shows that for any observatories using RSS feeds, we could quite easily compile their information into one place, saving us lots of time in visiting many different websites.

An RSS feed of current volcanic activity updates issued by USGS (left) can be used to populate our own list of activity updates, saving the need for us to check individual websites
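The idea of combining several feeds into one list can be sketched in a few lines of Python using only the standard library. This is not Ailsa's implementation: the two feeds below are tiny made-up RSS 2.0 documents ("Observatory A" and "Observatory B" are placeholders, not real sources), and a real version would download each feed from its URL rather than read it from a string.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up RSS 2.0 documents standing in for two observatory feeds.
FEED_A = """<rss version="2.0"><channel><title>Observatory A</title>
<item><title>Volcano X: alert level raised</title>
<pubDate>Mon, 19 Sep 2016 08:00:00 GMT</pubDate></item>
<item><title>Volcano X: ash emission observed</title>
<pubDate>Tue, 20 Sep 2016 10:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Observatory B</title>
<item><title>Volcano Y: activity continues</title>
<pubDate>Mon, 19 Sep 2016 21:30:00 GMT</pubDate></item>
</channel></rss>"""


def read_items(feed_xml):
    """Extract source, title and publication time from one RSS document."""
    root = ET.fromstring(feed_xml)
    source = root.findtext("channel/title", default="unknown")
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            # RSS dates use the RFC 822 style that email.utils understands.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def merge_feeds(*feeds):
    """Combine items from several feeds into one list, newest first."""
    merged = [entry for feed in feeds for entry in read_items(feed)]
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)


if __name__ == "__main__":
    for entry in merge_feeds(FEED_A, FEED_B):
        print(entry["published"].isoformat(), entry["source"], entry["title"])
```

Because every feed shares the same item structure, adding another observatory is just a matter of passing one more feed to merge_feeds.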

Building the database

Whilst Charlie, James and Peter worked on the scripting, Carl began work on the database. Having already identified what attributes the database table should contain, it was fairly straightforward for Carl to set up a new database in Oracle. The ‘NOTE_ID’ was automatically generated, whilst all other values were taken directly from the output of the two scrape processes and loaded into the database manually. Ideally, we would have wanted these data to be uploaded automatically, but this would have required a separate script to handle the process – something we didn’t have time to address during the hack.
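A loader script of the kind we didn't get to write might look roughly like the sketch below. To keep it self-contained it uses Python's built-in sqlite3 module as a stand-in for Oracle, so the SQL dialect differs from what Carl's database would need. The table name advisory_note and its columns are assumptions for illustration; only the automatically generated NOTE_ID comes from the actual design.

```python
import sqlite3

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS advisory_note (
    note_id       INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated automatically
    advisory_date TEXT,
    vaac          TEXT,
    volcano       TEXT,
    vnum          TEXT,
    area          TEXT,
    next_advisory TEXT,
    UNIQUE (vaac, vnum, advisory_date)                -- one row per advisory
)
"""

INSERT_ROW = """
INSERT OR IGNORE INTO advisory_note
    (advisory_date, vaac, volcano, vnum, area, next_advisory)
VALUES
    (:advisory_date, :vaac, :volcano, :vnum, :area, :next_advisory)
"""


def load_records(connection, records):
    """Insert scraped advisories, skipping any already stored; return row count."""
    connection.execute(CREATE_TABLE)
    with connection:  # commits on success, rolls back if an insert fails
        connection.executemany(INSERT_ROW, records)
    return connection.execute("SELECT COUNT(*) FROM advisory_note").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    example = {
        "advisory_date": "20160919/0630Z", "vaac": "DARWIN",
        "volcano": "SINABUNG", "vnum": "261080",
        "area": "INDONESIA", "next_advisory": "20160919/1230Z",
    }
    print(load_records(conn, [example]))
```

The uniqueness constraint means the loader can be re-run every time the scraper runs without creating duplicate rows, which is what an automatic update would need.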


Interlude….

As Day 1 came to an end, the hackers were treated to a healthy concoction of sweets, fizzy drinks, takeaway (and maybe a sneaky beer) before heading off to their programmer pits. And if one round of pizza wasn’t enough, we were treated the following morning to a second dose courtesy of the culinary delights of chez Passmore! Everyone loves cold pizza, right?
Famous takeaway pizza outlet vs Pizzeria Passmore

Web viewer and visualisation tool

The final element of our challenge, after all of the scraping, databasing and stuffing our faces, was to display the results on a web viewer, preferably updated in real-time. So once Carl had set up the database, he and James set to work on adapting an existing web viewer that had been created for the EPOS (European Plate Observing System) project (thanks Simon Burden!), for our needs – the true sense of hacking! The ‘hacked’ web viewer had 4 windows which showed:
  1. A map of all volcanoes, highlighting those with current ash advisories
  2. A list of all volcanoes with current ash advisories issued for them, from which you could select any volcano of interest
  3. The details for the selected volcano that have been extracted from the volcanic ash advisories
  4. A graph showing the number of advisories issued per day for the selected volcano
The windows in the viewer should all be linked so, for example, if you click on a volcano in the map, its details will appear in the other windows, or if you click on a volcano in the latest advice list, it should zoom to the volcano on the map. We did manage to get most of the windows linked apart from the map window, which you could pan around and zoom into independently. This is something that could easily have been linked; we just ran out of time!
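The per-day count behind the fourth window is simple to compute once the advisory dates have been extracted. The Python sketch below is illustrative only, not the viewer's code; it assumes each issue time is a string in the form YYYYMMDD/HHMMZ, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, e.g. "20160919/0630Z".
DTG_FORMAT = "%Y%m%d/%H%MZ"


def advisories_per_day(issue_times):
    """Count advisories per calendar day for one volcano, in date order."""
    days = (datetime.strptime(dtg, DTG_FORMAT).date() for dtg in issue_times)
    return sorted(Counter(days).items())


if __name__ == "__main__":
    times = ["20160919/0630Z", "20160919/1230Z", "20160920/0100Z"]
    for day, count in advisories_per_day(times):
        print(day.isoformat(), count)
```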

So all in all, we managed to achieve the majority of our aims in (1) extracting relevant data, (2) building and populating a database and (3) creating a web viewer to display our results – quite an achievement in less than 2 days! We didn’t manage to link all of the elements so these had to be done manually, and we didn’t quite manage the automatic updates so that we could monitor new volcanic activity in real-time.

Web viewer showing 1) map of volcanoes, 2) volcanoes with current ash advisories, 3) details from selected advisory, 4) graph of number and frequency of advisories for the selected volcano

WINNERS!

A bit more time was what we wished for……and a bit more time was what we got!! After the shock announcement that Team VAAC were being crowned hack champions (so unexpected that half our team had already left!), we discovered that our reward was more work – 15 days extra to be precise! Oh, and a coaster!

Members of Team VAAC receiving their winners’ coasters.
Note that Carl and Ailsa had little confidence in our chances of winning and had already left :)
(From left: James, Peter, Katy and Charlie).

Since the Hackathon

Peter has continued working on his Python script to extract data from the VAACs and produce an automatically updated graph showing the number of advisories over time for all volcanoes with current activity. The graph shows data for the past 7 days (from the current date) and plots the cumulative number of ash advisories, for each volcano, over time. When an advisory states “NO FURTHER ADVISORIES” the count returns to 0, signalling that there are no more ash observations. The graph shows which volcanoes have had particularly intense periods of activity, such as Sinabung in Indonesia, which had 6 advisories issued in the space of 18 hrs. We can also see which volcanoes have produced ash clouds repeatedly over the course of the week, shown by more than one spike, e.g. Sinabung and Klyuchevskoy (Russia). If we were to track this sort of activity over weeks, months and years and compare it with other information, it’s likely that we’d start to see patterns emerging for certain volcanoes, helping us better understand how good observations of different eruptions are, how volcanoes and ash clouds behave, and thus how best to interpret such information in the future.

Automatically updated graph showing cumulative number of advisories issued per volcano over past 7 days
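The counting rule behind this graph can be sketched in Python as follows. This is one reading of the rule described above, not Peter's actual script: each advisory inside the 7-day window adds one to its volcano's running count, and an advisory stating "NO FURTHER ADVISORIES" sets the count back to 0. The function name and the shape of the input tuples are assumptions for illustration.

```python
from datetime import datetime, timedelta

FINAL_MARKER = "NO FURTHER ADVISORIES"


def cumulative_counts(advisories, now, window_days=7):
    """Build (time, count) points per volcano for the last `window_days`.

    `advisories` is an iterable of (issued, volcano, next_advisory) tuples,
    where `issued` is a datetime and `next_advisory` is the extracted text.
    """
    start = now - timedelta(days=window_days)
    running = {}  # current count per volcano
    series = {}   # plotted points per volcano
    for issued, volcano, next_advisory in sorted(advisories):
        if not (start <= issued <= now):
            continue  # outside the plotting window
        if FINAL_MARKER in next_advisory.upper():
            running[volcano] = 0  # the episode has ended, so reset the count
        else:
            running[volcano] = running.get(volcano, 0) + 1
        series.setdefault(volcano, []).append((issued, running[volcano]))
    return series


if __name__ == "__main__":
    now = datetime(2016, 9, 21, 12, 0)
    data = [
        (datetime(2016, 9, 19, 6, 0), "SINABUNG", "20160919/1200Z"),
        (datetime(2016, 9, 19, 12, 0), "SINABUNG", "NO FURTHER ADVISORIES"),
    ]
    print(cumulative_counts(data, now))
```

Each spike on the graph then corresponds to one run of advisories between resets, so a volcano with several spikes in a week has had several separate ash-producing episodes.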

