Synthetic Populations as a Tool for Propensity Modelling

In health, economics, transport, and other fields, there is growing recognition that capturing the behaviour of complex systems effectively calls for a micro-level, “bottom-up” modelling approach. Such models can simulate the small-scale heterogeneities and subtle interactions present in these systems, which often lead to important emergent behaviour that macro-level (“top-down”) analysis would miss. This project, funded by Procter & Gamble, will apply the bottom-up methodology to the retail sector: spatial microsimulation techniques will be used to construct a synthetic population, and then to model retail behaviours at the household level.

More specifically, detailed microdata will be combined with aggregate data to generate a feature-rich synthetic population of UK households. The microdata might come from a consumer survey, for example, which can give rich information on individuals’ spending habits but might sample only a fairly small number of respondents. The aggregate data will be totals describing local-area characteristics, such as census cross-tabulations counting households by size and economic activity in a given area. A population will be built by replicating these microdata and distributing them nationwide, and the resulting distribution will be validated by checking it against the aggregate data.
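
As an illustration of how survey records can be reweighted to match area totals, the sketch below uses iterative proportional fitting (IPF), a deterministic technique commonly used in spatial microsimulation. The project description does not say which algorithm will be used, so IPF is an assumption here, and the function names, household records and area totals are all hypothetical.

```python
def ipf_weights(households, constraints, iterations=50):
    """Reweight survey households so weighted counts match area totals.

    households:  list of dicts, one per surveyed household (hypothetical data)
    constraints: {attribute: {category: target count for the area}}
    Returns one weight per household: roughly, how many households in the
    area that survey record stands in for.
    """
    weights = [1.0] * len(households)
    for _ in range(iterations):
        for attr, targets in constraints.items():
            # Current weighted total for each category of this attribute.
            totals = {cat: 0.0 for cat in targets}
            for hh, w in zip(households, weights):
                totals[hh[attr]] += w
            # Scale each household so the category totals hit the targets.
            for i, hh in enumerate(households):
                cat = hh[attr]
                if totals[cat] > 0:
                    weights[i] *= targets[cat] / totals[cat]
    return weights


def weighted_totals(households, weights, attr):
    """Weighted household count per category, used to validate the fit."""
    totals = {}
    for hh, w in zip(households, weights):
        totals[hh[attr]] = totals.get(hh[attr], 0.0) + w
    return totals


# Illustrative microdata: four surveyed households (made-up values).
survey = [
    {"size": 1, "activity": "employed"},
    {"size": 1, "activity": "inactive"},
    {"size": 2, "activity": "employed"},
    {"size": 2, "activity": "inactive"},
]

# Illustrative aggregate data for one area (made-up census-style totals).
area_totals = {
    "size": {1: 40, 2: 60},
    "activity": {"employed": 70, "inactive": 30},
}

weights = ipf_weights(survey, area_totals)
size_fit = weighted_totals(survey, weights, "size")
activity_fit = weighted_totals(survey, weights, "activity")
```

Comparing `size_fit` and `activity_fit` with `area_totals` is the validation step in miniature: the weighted survey should reproduce the aggregate counts. In practice the fractional weights would still need to be converted into whole households, which is one place where a stochastic step could enter.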

A variety of data sources will be incorporated, and both deterministic and stochastic algorithms will be employed to ensure a high-quality population: the result will be representative and realistic, and each household will carry attributes useful for propensity modelling. Hypotheses of interest will then be tested using this population. For instance, if a new product were planned (at a particular price point, with branding aimed at certain segments of the population), we could model the response on a household-by-household basis: Who would switch to the new product? What products would they be switching from? And where in the country would uptake be highest?
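
To make the household-by-household idea concrete, here is a minimal sketch of how such a question might be posed against a synthetic population. It is not the project's model: the logistic form, the coefficients, the attribute names and the household records are all hypothetical, chosen only to show the shape of the calculation.

```python
import math
import random
from collections import Counter

# Hypothetical synthetic households; every field and value is illustrative.
households = [
    {"region": "North West", "income": 24000, "current_brand": "A"},
    {"region": "North West", "income": 41000, "current_brand": "B"},
    {"region": "London", "income": 58000, "current_brand": "A"},
    {"region": "London", "income": 33000, "current_brand": "C"},
    {"region": "Scotland", "income": 29000, "current_brand": "B"},
]


def switch_probability(hh, price):
    """Assumed propensity model: logistic in income and price.

    The coefficients are invented for illustration, not estimated from data.
    """
    score = 0.5 + 0.00004 * hh["income"] - 0.6 * price
    return 1.0 / (1.0 + math.exp(-score))


def simulate_uptake(households, price, seed=0):
    """Draw a switch/no-switch outcome per household and tally the results."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    by_region = Counter()      # where uptake occurs
    switched_from = Counter()  # which products lose customers
    for hh in households:
        if rng.random() < switch_probability(hh, price):
            by_region[hh["region"]] += 1
            switched_from[hh["current_brand"]] += 1
    return by_region, switched_from


by_region, switched_from = simulate_uptake(households, price=3.0)
```

The two counters correspond to the questions above: `switched_from` tallies the products households leave, and `by_region` shows where uptake is concentrated. Rerunning with a different `price` or with different seeds would give a rough sense of sensitivity and sampling variation.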