Discovering the effect of cloud organisation on Earth's climate with unsupervised machine learning

Climate models don’t agree on how clouds will change in our future climate, which has huge implications for our ability to predict how much the Earth will warm. It is impossible to simulate every metre of the atmosphere, so climate models must approximate the effect of clouds rather than resolve them. Applying self-supervised machine learning directly to the vast amounts of satellite data available provides a unique opportunity to understand cloud formation and therefore improve climate models.

Facebook’s VP and Chief AI Scientist, Yann LeCun, a pioneer in the area of self-supervised learning, describes it as being, “one step on the path to human-level intelligence (in computers)”. The technique is a trainable system that, given two inputs, x and y, tells us how similar or different they are to each other. In the past, researchers have struggled to understand how clouds form and develop. By using this kind of unsupervised machine learning in a novel approach, they have been able to study the vast amounts of satellite data available and begin to understand why different forms of clouds develop and how they affect the Earth’s climate.
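To make the idea of "how similar are x and y" concrete, the sketch below compares embedding vectors by Euclidean distance and scores them with a triplet loss. The triplet loss is an assumption made for illustration: it is one common way to train such a system and is not stated in this article as the method used. The vectors and the names `distance` and `triplet_loss` are hypothetical.

```python
import math


def distance(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, neighbour, distant, margin=1.0):
    """Loss that is zero once `neighbour` is closer to `anchor` than
    `distant` is, by at least `margin` (assumed training objective)."""
    return max(0.0, distance(anchor, neighbour) - distance(anchor, distant) + margin)


# Hypothetical 3-dimensional embeddings of three satellite image tiles.
anchor = [0.0, 0.0, 0.0]
neighbour = [0.3, 0.4, 0.0]  # a tile expected to look similar to the anchor
distant = [3.0, 4.0, 0.0]    # a tile expected to look different

# Similar tile is already much closer than the dissimilar one: no penalty.
loss_good = triplet_loss(anchor, neighbour, distant)

# Roles swapped: the "similar" tile is far away, so the loss is positive.
loss_bad = triplet_loss(anchor, distant, neighbour)
```

In a real network the embeddings would be produced from the images themselves, and training would adjust the network until losses of this kind become small.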

This work is being undertaken at LIDA as part of a large international collaborative research project called EUREC4A. It has produced the first unsupervised neural network model that autonomously discovers cloud organisation regimes in satellite images, with important implications for weather prediction and climate modelling. The UK part of the project (EUREC4A-UK) is addressing one of the Grand Science Challenges by examining the link between clouds and circulation, and the role of clouds in climate change. It is led by Dr Leif Denby at the University of Leeds and National Centre for Atmospheric Science (NCAS), with external partners including the British Antarctic Survey (BAS), the Met Office and the University of Manchester.

The central question in this work relates to one of the most uncertain parts of our current climate models: the representation of shallow trade-wind convective clouds, which have a cooling effect on the Earth. As the climate warms, it is vitally important to understand what is happening to these clouds in the Tropics. Climate model projections differ from each other by several degrees, something that has overwhelmingly been attributed to the cloud representations (called parameterisations) that climate models need because of their coarse resolution: a typical climate model runs at 50 km resolution, while these clouds are generally on the order of 1 km across.

Simulations show the clouds tend to organise into different patterns, which affects how much area they cover, which in turn affects how much of the sun’s radiation gets through. This research is specifically aimed at identifying and understanding these patterns, and finding out why the models are not behaving correctly. Dr Denby wanted to understand what drives the patterns, and wondered if machine learning could work out what kinds of patterns exist. This had never been done before. His first paper describes how to train a network to do this: https://agupubs.onlinelibrary.wiley.com/doi/epdf/10.1029/2019GL085190

Research led him to an unsupervised machine learning technique that produces an ‘embedding’: the network extracts information from the data it is given to create an intermediate representation.

“Machine learning with things you know about – cats and dogs, for example – is straightforward,” says Dr Denby. “Coming up with a technique that can work out that these images contain cats and dogs, and some other animal is very difficult. Even working out how to formulate this is hard to do. How do you measure something you haven’t given a number to?”

The University of Leeds has the capability to work with large amounts of data on its HPC systems, ARC3 and ARC4, which are equipped with GPUs for machine learning and supplemented by the national JASMIN system. There are currently three student projects working on a first-of-its-kind study of cloud-resolving global simulations (~4km resolution) called DYAMOND, looking at cloud organisation. For Dr Denby’s work, the infrastructure within LIDA enabled access to, and download of, terabytes of satellite data. “I worked through all the satellite imagery, feeding this to the network to find out what things were seen, then next we tried to understand what was happening with these embeddings, to extract a manifold that would allow study of the patterns in a continuous manner,” says Dr Denby. His latest work focuses on extracting this manifold and using it to confirm patterns previously identified manually. He is writing his technique so that it can be applied to any kind of geophysical image data, such as temperature fields or images of the ocean surface, to make embeddings and find patterns.

“I would love to apply this technique to other fields,” says Dr Denby. “Looking into the formation of tropical cyclones, for example – what needs to exist in the atmosphere for cyclones to form.”

Dr. Leif Denby

Further information:

https://agupubs.onlinelibrary.wiley.com/doi/epdf/10.1029/2019GL085190


Cloud organisation “map” displaying the continuous nature of similarity between different forms of cloud organisation. The map was produced by applying the Isomap method to extract the manifold spanned by the embedding vectors for the satellite tiles shown. The embedding vectors were in turn produced by a neural network trained using unsupervised learning with these satellite tiles as input.
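As a rough illustration of the Isomap step described in the caption, the sketch below reduces a set of high-dimensional vectors to two map coordinates using scikit-learn's `Isomap`. Everything about the data is an assumption: the "embeddings" here are synthetic points generated along a curve, standing in for the real network output, and the dimensions and neighbour count are arbitrary choices, not values from the study.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Synthetic stand-in for embedding vectors (not real satellite data):
# 200 points along a helix, lifted into 16 dimensions with a little noise.
n_tiles, n_dims = 200, 16
t = rng.uniform(0.0, 3.0 * np.pi, n_tiles)
curve = np.column_stack([np.cos(t), np.sin(t), t])
lift = rng.normal(size=(3, n_dims))
embeddings = curve @ lift + 0.01 * rng.normal(size=(n_tiles, n_dims))

# Isomap builds a nearest-neighbour graph, estimates distances along the
# manifold through that graph, and finds 2D coordinates that preserve them.
isomap = Isomap(n_neighbors=10, n_components=2)
map_coords = isomap.fit_transform(embeddings)

# Each row is the (x, y) position of one tile on the 2D "map".
print(map_coords.shape)
```

Plotting each tile's image at its `map_coords` position would give a picture of the same kind as the map above, with similar-looking tiles placed near each other.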