OpenGHG: challenges and solutions

An open-source, cloud-based platform for greenhouse gas data analysis and collaboration.


Why OpenGHG?

The rapid growth in remotely sensed and in situ greenhouse gas (GHG) observations will dramatically improve our understanding of the drivers of change in atmospheric radiative forcing. However, the volume and diversity of available data present a range of challenges. For example, efficient, near-real-time sharing and inter-comparison of data and model outputs is hampered by strict institutional firewalls and, in some cases, a lack of computational expertise among data providers and users.

The data span a range of scales and sample different parts of the atmosphere, so inter-comparison and interpretation require the use of chemical transport models (CTMs). In addition, the “inverse” statistical methods used to infer fluxes from the data and models are computationally intensive and technically challenging to implement.
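To give a flavour of what an “inverse” method does, the sketch below is a deliberately tiny illustration, not OpenGHG code. It assumes a single unknown flux `x`, observations modelled as `y_i = h_i * x + noise`, and Gaussian prior and observation errors. The sensitivities `h_i` stand in for values a CTM would supply. The function name and all numbers are hypothetical.

```python
def invert_scalar_flux(obs, sensitivities, prior_mean, prior_sd, obs_sd):
    """Return the posterior mean and standard deviation of a single flux x.

    Assumes y_i = h_i * x + noise with independent Gaussian noise (obs_sd)
    and a Gaussian prior on x (prior_mean, prior_sd).
    """
    prior_precision = 1.0 / prior_sd**2
    # Information contributed by the observations.
    data_precision = sum(h * h for h in sensitivities) / obs_sd**2
    post_var = 1.0 / (prior_precision + data_precision)
    # Weighted mismatch between observations and the prior prediction.
    innovation = sum(
        h * (y - h * prior_mean) for y, h in zip(obs, sensitivities)
    ) / obs_sd**2
    post_mean = prior_mean + post_var * innovation
    return post_mean, post_var**0.5


# Hypothetical example: synthetic observations generated from a "true" flux of 2.0.
sensitivities = [1.0, 2.0, 1.5]
obs = [h * 2.0 for h in sensitivities]
post_mean, post_sd = invert_scalar_flux(
    obs, sensitivities, prior_mean=1.0, prior_sd=1.0, obs_sd=0.5
)
print(post_mean, post_sd)
```

The observations pull the estimate from the prior value towards the flux that generated them, and the posterior uncertainty is smaller than the prior uncertainty. Real inversions solve for many fluxes at once with full covariance matrices, which is where the computational cost arises.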

At present, these limitations mean that GHG flux estimation is generally carried out only on a case-by-case basis for specific research projects, each requiring intensive investigator effort. To address these challenges, we present a feasibility study demonstrating a cloud-based data analysis “hub” for the GHG community. We have brought together measurement, modelling, statistical and cloud-computing expertise to build the architecture for a cloud framework that will streamline data sharing, validation, analysis and visualisation.

Built with open-source tools, this framework will be extensible by GHG scientists, allowing them to carry out the full workflow from data acquisition to operational GHG flux estimation. Such a system will allow us to integrate data from multiple sources more effectively and ultimately provide stakeholders and researchers with more rapid, more robust estimates of GHG sources and sinks.


Current challenges when dealing with greenhouse gas data include a wide and diverse range of measurements that span many scales (e.g., urban to global) and come in a range of non-standard formats, which makes datasets difficult to inter-compare. In addition, institutional firewalls often form a barrier to data sharing, resulting in a lack of reproducibility and transparency in the emissions evaluation process.



OpenGHG aims to solve these challenges by providing a platform for greenhouse gas data analysis. It will allow comparison of data with vital ancillary information such as atmospheric model output, emissions inventories, and mapping tools. The platform will also provide key analysis methods and functionality. We do not plan to create another long-term data storage repository; instead, we want OpenGHG to be a platform that facilitates sharing and analysis of archived greenhouse gas data.


The OpenGHG Cloud will be available through simple-to-use web interfaces and Jupyter notebooks. A web interface will allow data to be uploaded and simple analyses to be performed, while a JupyterHub/BinderHub will allow complex analyses to be developed, hosted and shared.



Run complex simulations on demand, allowing creation of striking visualisations that help transfer knowledge effectively.


Using the inherent scalability of the cloud, large-scale simulations can easily be run.