Only two joint fellowships between the Economic and Social Research Council (ESRC) and The Alan Turing Institute were awarded in their first year, and LIDA’s Professor Alison Heppenstall holds one of them.

The fellowships were established to bridge the gap between the social sciences and big data and ensure that data science can be harnessed to address societal challenges. With a research background across both disciplines, Professor Heppenstall was an obvious candidate for this prestigious award.

The fellowships focus on ‘smart cities’. The aim is to draw on the huge data sets generated within an urban environment to better understand how cities work and find ways to improve the lives of those who live, work and play within them.

Urban data sets can come from public transport, footfall cameras, retail transactions, weather stations or air quality monitors, for example. Professor Heppenstall is looking at using information from social media and mobile phones. To protect privacy, individuals cannot be identified from these data sets, but the data can show time and location and, through keywords in tweets, indicate characteristics such as age or gender.

Professor Heppenstall explains: “Census data provides information on where people live and mostly spend their nights; the journey to work data provides information on who is moving through the city at rush hour. But we know far less about who is using our cities during the day, such as students, non-working parents, the unemployed and retired. Merging footfall and social media data could help to fill that gap.”

Professor Heppenstall’s research involves data-wrangling: bringing multiple data sets together in order to extract more meaningful conclusions. It also involves developing machine learning methods that can spot the patterns and gaps in the data.
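
To give a flavour of what bringing data sets together can look like in practice, here is a minimal Python sketch. It is an illustration only, not Professor Heppenstall’s method: the locations, counts, group labels and the function name `merge_by_place_and_hour` are all invented for the example. It joins hourly footfall counts with anonymised social media posts on place and time, and flags the cells where no posts are available.

```python
from collections import defaultdict

# Hypothetical hourly footfall counts: (location, hour) -> people counted
footfall = {
    ("city_centre", 11): 420,
    ("city_centre", 14): 510,
    ("university", 11): 230,
}

# Hypothetical anonymised posts: (location, hour, group inferred from keywords)
posts = [
    ("city_centre", 11, "student"),
    ("city_centre", 11, "retired"),
    ("city_centre", 11, "student"),
    ("university", 11, "student"),
]


def merge_by_place_and_hour(footfall, posts):
    """Join the two sources on (location, hour) and flag cells with no posts."""
    groups = defaultdict(lambda: defaultdict(int))
    for location, hour, group in posts:
        groups[(location, hour)][group] += 1

    merged = {}
    for key, count in footfall.items():
        merged[key] = {
            "footfall": count,
            "groups": dict(groups.get(key, {})),
            "gap": key not in groups,  # True where footfall has no matching posts
        }
    return merged


merged = merge_by_place_and_hour(footfall, posts)
for key, row in sorted(merged.items()):
    print(key, row)
```

In this toy example the city centre at 14:00 has a footfall count but no posts, so it is marked as a gap: the kind of hole in the data that a pattern-spotting method would need to account for.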

These patterns can be used to inform the design of what’s known as an agent-based model to interrogate chosen data sets. These models get their name because they assign behaviour and characteristics to individuals, known as agents, and then show the impact of that behaviour. This allows the model to more accurately represent the reality of a city, where thousands of individuals are interacting all the time.
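
The idea behind an agent-based model can be sketched in a few lines of Python. This is a toy example under stated assumptions, not the model used in the research: the agent kinds, the probabilities in `CENTRE_PROBABILITY` and the crowding rule are all hypothetical. Each agent is given a characteristic (its kind) and a behaviour (how likely it is to head for the city centre), and agents influence one another because a crowded centre makes everyone less likely to go there.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Agent:
    kind: str      # characteristic, e.g. "student", "commuter", "retired"
    location: str  # where the agent currently is


# Hypothetical behaviour rule: baseline chance of each kind visiting the centre
CENTRE_PROBABILITY = {"student": 0.6, "commuter": 0.2, "retired": 0.4}


def step(agents, rng):
    """Move every agent once; a crowded centre puts all agents off visiting."""
    # Share of agents in the centre before anyone moves in this step
    crowding = sum(a.location == "city_centre" for a in agents) / len(agents)
    for agent in agents:
        p = CENTRE_PROBABILITY[agent.kind] * (1 - 0.5 * crowding)
        agent.location = "city_centre" if rng.random() < p else "elsewhere"


def run(n_agents=1000, n_steps=10, seed=0):
    """Simulate the agents and count who ends up in the centre, by kind."""
    rng = random.Random(seed)
    kinds = list(CENTRE_PROBABILITY)
    agents = [Agent(rng.choice(kinds), "elsewhere") for _ in range(n_agents)]
    for _ in range(n_steps):
        step(agents, rng)
    return Counter(a.kind for a in agents if a.location == "city_centre")


centre_counts = run()
print(centre_counts)
```

In a real study the behaviour rules would be informed by the patterns found in the merged data rather than chosen by hand, and the simulated counts could then be compared against observed footfall.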

The applications are numerous, from determining what encourages different ethnic groups to live in certain areas and identifying who is most exposed to air pollution to understanding how big events affect traffic flow.