The latest figures from the IPCC (Intergovernmental Panel on Climate Change) show clearly that there has been no significant increase in the planet's surface temperature for the last fifteen years. Does this mean that global warming is no longer happening, or that the climate models developed at vast expense over the last few decades are incorrect?
The latest explanation is that more heat than expected is being absorbed by the oceans and that there has been no real slowdown: the Earth is still getting warmer. At the least, though, this offers some vindication for those who remain a little sceptical about climate change, the so-called "climate change deniers".

Climate modelling today relies on big data, and managing that data is a huge challenge. Every day billions of observations are made and integrated from sensors of various kinds located all over the world, in weather balloons and in orbiting satellites. Historical data must be reanalysed, and new climate simulations are run on clusters of supercomputers, with each simulation generating huge quantities of output. The NCCS (NASA Center for Climate Simulation) database, for instance, currently stores around 40 petabytes of data, and the quantity is increasing daily.
Handling this amount of data presents problems of its own: how do you find a single needle in a stack of 40 trillion of them? It is effectively impossible unless you know where every individual needle is located, so new tools for managing the data are continually being developed. The NCCS, for example, has built a high-resolution visualization wall, 6 metres wide by 2 metres high and backed by 16 servers, to display visual content generated from the output of its supercomputer cluster, called Discover.
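The needle-in-a-haystack point is really an argument for indexing: if you record up front where every item lives, a lookup that would otherwise mean scanning the whole collection becomes a single step. The sketch below illustrates the idea on a toy pile of a million records (the data, names, and sizes here are invented for illustration, not drawn from the NCCS systems):

```python
import random

random.seed(42)  # deterministic shuffle, purely for the sake of the example
haystack = list(range(1_000_000))
random.shuffle(haystack)
target = haystack[-1]  # the "needle" we want to find

def linear_find(records, value):
    """Without an index: scan every record until we hit the one we want (O(n))."""
    for position, record in enumerate(records):
        if record == value:
            return position
    return -1

# With an index: note once where every record lives; afterwards each
# lookup is a single O(1) step, however large the haystack grows.
index = {record: position for position, record in enumerate(haystack)}

assert linear_find(haystack, target) == index[target]
```

The trade-off is the one every large archive faces: building and storing the index costs time and space up front, in exchange for lookups that no longer scale with the size of the collection.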
As an example of the power of this resource, Discover can simulate three days on Earth in one day of real time at a resolution of 3.5 km, which equates to a grid of 3.6 billion cells; the long-term objective is to simulate 365 days at 1 km resolution.
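Those figures can be sanity-checked with a little arithmetic. The only number below not taken from the article is the Earth's surface area (roughly 510 million km²), and the vertical-level count is my inference from the stated total, not a figure NASA quotes:

```python
EARTH_SURFACE_KM2 = 510.1e6  # approximate surface area of the Earth, in km^2

def horizontal_cells(resolution_km):
    """Cells needed to tile the Earth's surface at the given grid spacing."""
    return EARTH_SURFACE_KM2 / resolution_km ** 2

cells_at_3_5km = horizontal_cells(3.5)  # ~41.6 million cells per vertical level
cells_at_1km = horizontal_cells(1.0)    # ~510 million cells per vertical level

# The article's 3.6 billion total implies the model also resolves the
# atmosphere vertically, into roughly this many levels:
implied_levels = 3.6e9 / cells_at_3_5km  # ~86
```

It also shows why the 1 km goal is so ambitious: halving the grid spacing quadruples the horizontal cell count, so going from 3.5 km to 1 km multiplies it by more than twelve before the longer simulated period is even considered.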
Climate change is not the only scientific big data user. At CERN (the European Organization for Nuclear Research), for instance, particles inside the LHC (Large Hadron Collider) collide some 600 million times every second, and each collision produces a complex spray of secondary particles that the detectors register. The amount of data generated is huge: around 16 petabytes a year, which must be analysed by physicists searching for new particles and processes. Using grid computing, this data is made available to a worldwide community of around 8,000 scientists.
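A back-of-the-envelope calculation makes the scale of those two figures concrete. The assumptions here are mine, not CERN's: a decimal petabyte (10^15 bytes) and a data rate averaged evenly over the year:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # ~31.6 million seconds
COLLISIONS_PER_SECOND = 600e6          # collision rate quoted in the article
PETABYTE = 1e15                        # decimal petabyte, in bytes

annual_data_bytes = 16 * PETABYTE      # ~16 PB per year, per the article

# Average rate at which that data must be written to storage:
bytes_per_second = annual_data_bytes / SECONDS_PER_YEAR  # ~507 MB/s

# Spread across every collision, that is less than one byte per event --
# which is why trigger systems keep only a small fraction of collisions:
bytes_per_collision = bytes_per_second / COLLISIONS_PER_SECOND
```

In other words, even 16 petabytes a year is only a heavily filtered residue of what the detectors see, and the real engineering problem is deciding in real time what to keep.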
Today big data is no longer the preserve of big science: it is becoming increasingly important to commerce too. Although not yet on a scale to equal climate modelling or the search for new physics, the rate of business data generation is increasing rapidly, and new tools and approaches are needed to manage it. The challenges are such that many businesses are turning to third-party organizations such as Mimecast, which provide cloud-based storage solutions such as cloud archiving and data management for email and files.
Whether or not global warming is slowing down, data is simply getting bigger and will continue to do so.