Incorporating geospatial climate data into statistical models of transport infrastructure and operations

Extreme weather is a leading cause of disruption to transport networks, causing delay incidents and damage to road and rail infrastructure. As transport networks become more heavily used and congested, and as climate change is expected to increase the frequency and severity of extreme weather, the ability to understand and accurately predict such incidents is increasingly important.

Project overview

This project integrates geospatial climate data into models of transport network operations and explores the use of statistical and machine learning methods to understand (i.e., make inferences about) and predict the impact of weather and climate variables on transport network performance. The first stage is to use GIS tools to map weather data to geospatial data on transport networks, e.g., road and rail networks. The second stage is to analyse the impact of these weather factors on transport system performance using statistical, econometric, and machine learning methods. As an example, we developed statistical and machine learning models of the number of delay incidents on the French rail network attributed to extreme weather conditions, and compared how well these models explain and predict regional and temporal variation in incident numbers.

Data and methods

We used a range of geospatial datasets on the locations of road and rail infrastructure, together with geospatial data on weather conditions at a granular level, and contextual data on transport infrastructure and operations.

The figures below are examples of the geospatial representations of the weather variables and transport networks we mapped to generate the dataset containing incidents and weather data at segment level.

Figure 1. Animated GIF showing the change in lying snow, one of the weather variables, over the UK from January 2010 to December 2021.

Figure 2. Map of the road network in Leeds, built from the coordinates of each individual road section, including motorways and unclassified roads.
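The mapping step described above can be illustrated with a minimal sketch. It assumes the weather data sit on a regular latitude/longitude grid and that each network segment is represented by its midpoint; the `Segment` class, the function names, and all values are hypothetical and are not taken from the project. Real GIS tooling would also handle map projections and full line geometries, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A network segment reduced to its midpoint (hypothetical structure)."""
    segment_id: str
    lat: float
    lon: float


def nearest_cell(lat, lon, lat0, lon0, step, n_rows, n_cols):
    """Return the (row, col) of the grid cell whose centre is nearest to a point.

    lat0/lon0 are the coordinates of cell (0, 0); step is the grid spacing in degrees.
    Points outside the grid are clamped to the nearest edge cell.
    """
    row = round((lat - lat0) / step)
    col = round((lon - lon0) / step)
    row = min(max(row, 0), n_rows - 1)
    col = min(max(col, 0), n_cols - 1)
    return row, col


def attach_weather(segments, grid, lat0, lon0, step):
    """Map each segment to the weather value of its nearest grid cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = {}
    for seg in segments:
        row, col = nearest_cell(seg.lat, seg.lon, lat0, lon0, step, n_rows, n_cols)
        values[seg.segment_id] = grid[row][col]
    return values


# Illustrative 2 x 2 grid of daily maximum temperature (degrees C), made-up values.
# Rows are latitudes 53.0 and 53.5; columns are longitudes -2.0 and -1.5.
temperature_grid = [
    [30.1, 31.4],
    [28.7, 29.9],
]

segments = [
    Segment("A", lat=53.1, lon=-1.9),
    Segment("B", lat=53.4, lon=-1.6),
]

weather_by_segment = attach_weather(segments, temperature_grid, lat0=53.0, lon0=-2.0, step=0.5)
print(weather_by_segment)
```

Repeating this lookup for each weather variable and each time step would give one row per segment and period, which is the shape of dataset the models need.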

In our work predicting the impact of weather conditions on delay incidents on the French rail network, we included predictor variables such as temperature, precipitation, and wind speed, along with control variables such as track length, asset types, and asset conditions.

We developed traditional statistical models such as Poisson regression, hybrid models such as LASSO Poisson regression and ElasticNet Poisson regression, and machine learning algorithms such as decision trees and random forests. Each was used both to model the effect of the predictor variables on the outcome variable and to predict incident numbers out-of-sample.
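As a rough illustration of the simplest of these models, the sketch below fits a one-predictor Poisson regression by gradient ascent on the log-likelihood and compares its out-of-sample Poisson deviance with an intercept-only baseline. It is not the project's implementation: the data are invented, `hot_day` is a hypothetical indicator variable, and in practice a statistics library would be used rather than a hand-written optimiser.

```python
import math

# Hypothetical incident counts per segment-period; hot_day = 1 marks a hot day.
train_hot_day = [0, 0, 0, 0, 1, 1, 1, 1]
train_incidents = [1, 2, 3, 2, 5, 6, 7, 6]
test_hot_day = [0, 0, 1, 1]
test_incidents = [1, 3, 5, 8]


def fit_poisson(x, y, step=0.05, n_iter=5000):
    """Fit log(mu) = b0 + b1 * x by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(y)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            resid = yi - math.exp(b0 + b1 * xi)  # observed minus expected count
            g0 += resid / n
            g1 += resid * xi / n
        b0 += step * g0
        b1 += step * g1
    return b0, b1


def poisson_deviance(y, mu):
    """Poisson deviance: lower values mean predictions closer to the observed counts."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total


b0, b1 = fit_poisson(train_hot_day, train_incidents)
rate_ratio = math.exp(b1)  # multiplicative effect of a hot day on expected incidents

# Out-of-sample comparison against a baseline that always predicts the training mean.
model_pred = [math.exp(b0 + b1 * xi) for xi in test_hot_day]
baseline_pred = [sum(train_incidents) / len(train_incidents)] * len(test_incidents)
model_dev = poisson_deviance(test_incidents, model_pred)
baseline_dev = poisson_deviance(test_incidents, baseline_pred)

print(f"rate ratio: {rate_ratio:.3f}")
print(f"held-out deviance, model: {model_dev:.3f}, baseline: {baseline_dev:.3f}")
```

With a single binary predictor, the fitted rate ratio is simply the ratio of the mean counts in the two groups, which makes the result easy to check by hand. The same held-out deviance could be computed for tree-based models to compare them on equal terms.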

Key findings

Hybrid models such as LASSO and ElasticNet regression enable data-driven variable selection: the penalty they add when estimating the model coefficients shrinks those coefficients towards zero and can set the coefficients of uninformative variables to exactly zero. For example, on a data subset of incidents related to strong heat, the LASSO and ElasticNet methods select a group of weather and operational variables that make intuitive sense, and yield significant gains in in-sample and out-of-sample predictive power.
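The selection mechanism can be sketched with a small from-scratch LASSO Poisson fit. This is an illustration under stated assumptions, not the project's code: the data are invented, the variable names `hot_day` and `windy_day` are hypothetical, and the optimiser is plain proximal gradient descent, in which a soft-thresholding step applies the L1 penalty. ElasticNet would add an L2 term to the same objective; it is not shown here.

```python
import math

# Hypothetical toy data: each row is (hot_day, windy_day); counts are made up.
# Incident counts depend on hot_day; windy_day carries almost no signal.
features = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
incidents = [2, 2, 2, 2, 6, 6, 6, 7]
names = ["hot_day", "windy_day"]


def soft_threshold(value, threshold):
    """Shrink a value towards zero; anything within the threshold becomes exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_poisson(X, y, penalty, step=0.05, n_iter=5000):
    """Minimise mean Poisson negative log-likelihood + penalty * sum(|slopes|).

    Returns [intercept, slope_1, ..., slope_p]; the intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, count in zip(X, y):
            eta = coef[0] + sum(c * x for c, x in zip(coef[1:], row))
            resid = math.exp(eta) - count  # expected minus observed count
            grad[0] += resid / n
            for j, x in enumerate(row, start=1):
                grad[j] += resid * x / n
        # Gradient step on the smooth part, then soft-threshold the slopes.
        coef[0] -= step * grad[0]
        for j in range(1, p + 1):
            coef[j] = soft_threshold(coef[j] - step * grad[j], step * penalty)
    return coef


unpenalised = fit_lasso_poisson(features, incidents, penalty=0.0)
lasso = fit_lasso_poisson(features, incidents, penalty=0.1)
selected = [name for name, c in zip(names, lasso[1:]) if c != 0.0]

print("unpenalised slopes:", [round(c, 4) for c in unpenalised[1:]])
print("LASSO slopes:      ", [round(c, 4) for c in lasso[1:]])
print("selected variables:", selected)
```

In this toy setting the unpenalised fit gives `windy_day` a small non-zero coefficient, while the penalised fit drops it and keeps `hot_day` with a slightly shrunken coefficient. In practice the penalty strength would be chosen by cross-validation rather than fixed at 0.1.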

These techniques appear to yield predictive models that can be used to forecast future system performance and understand the likely impacts of future climate trends on transport systems.

Value of the research

From an academic and econometric perspective, this work contributes significantly to the development of simpler, more efficient, data-driven models.

Further, the methods employed in this research provide a template for mapping different geospatial data from different sources.

At the policy-making level, this work can facilitate the efficient allocation of resources to prevent or manage the occurrence of these incidents where applicable. This can lead to significant economic savings in rail operations.

Insights

  • It is possible to develop simpler and more efficient models using hybrid approaches that encourage data-driven variable selection.
  • Weather thresholds that materially affect the occurrence of rail incidents at segment level can be identified.

Research theme

  • Environment

Programme theme

  • Statistical Data Science
  • Mathematical and Computational Foundations
  • Data Science Infrastructures

People

Toluwani Osabiya, Data Scientist, University of Leeds

Alexander Stead, Lecturer in Transport Economics, University of Leeds

Andrew Smith, Professor of Transport Performance and Economics, University of Leeds

Phillip Wheat, Professor of Transport Econometrics, University of Leeds

Partners

SNCF RESEAU

CQC Efficiency Network

Funders

SNCF RESEAU

CQC Efficiency Network