This study investigated the geodemographic attributes of patients most affected by the drop in the urgent cancer referrals pathway during the pandemic.
The COVID-19 pandemic had a detrimental impact on many areas of healthcare delivery. This included urgent cancer referrals, which dropped by approximately 60% in April 2020 (the first lockdown) compared with referrals made in the same month of the previous year (April 2019). Other studies suggest that the figure was even higher, at 70% at the nadir. It is estimated that between 240,000 and 740,000 patients have missed the urgent cancer referrals target since the start of the pandemic. This study investigated the geodemographic attributes of patients most affected by the drop in the urgent cancer referrals pathway during the pandemic. The results will be useful for inferring which social groups were most affected by the drop in urgent cancer referrals, and these inferences will inform the redistribution of resources as society recovers from the pandemic. Thus, this study has three aims:
- To assess the impact that the pandemic has had on two-week referrals for different population sub-groups, differentiated by their demographic characteristics;
- To explore the geographical dimensions of the decline in urgent cancer referrals, e.g., the influence that proximity to healthcare facilities and transport systems has on urgent cancer referrals; and
- To combine the demographic and geographic variables using a clustering algorithm to provide a more comprehensive geodemographic grouping of patients for analysis.
Data and Methods
Urgent cancer referrals data were obtained from our project partner, Leeds Teaching Hospitals NHS Trust (LTHT), and accessed via the Yorkshire and Humber Care Record Trusted Research Environment (TRE) on the Google Cloud Platform (GCP). The data comprise 194,593 de-identified urgent cancer referrals made to LTHT between January 2016 and November 2021. Each record contains: demographic variables (sex, age, ethnic group); the suspected cancer type; the month and year of referral; and the Lower Super Output Area (LSOA) in which the patient lives, which was used to link to geographical variables.
Two external datasets at the LSOA level were linked to the clinical data: the Access to Healthy Assets and Hazards (AHAH) index and the Internet User Classification (IUC).
The AHAH dataset provides measures of accessibility, from home locations, to geographical features that could be deemed healthy or hazardous, and was used as a proxy for environmental quality. The AHAH data were sourced from the CDRC (Consumer Data Research Centre) website through this URL: https://data.cdrc.ac.uk/dataset/accesshealthy-assets-hazards-ahah. The AHAH data contain multi-dimensional indices derived from 16 input indicators of accessibility across four domains: access to retail environment (five inputs), access to health services (five inputs), physical environments (three inputs), and air quality (three inputs).
The IUC, reported for 2018, is a classification that describes how people live and interact with the internet within Great Britain. The IUC was used as a proxy for digital engagement. The IUC was also sourced from the CDRC website through this URL: https://data.cdrc.ac.uk/dataset/internet-user-classification.
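The linkage described above can be sketched as a left join on the LSOA code. This is a minimal illustration rather than the project's actual pipeline: the field names (`lsoa`, `ahah_health`, `iuc_group`), the placeholder LSOA codes and all values are hypothetical.

```python
# Hypothetical, made-up records: field names, codes and values are illustrative only.
referrals = [
    {"referral_id": 1, "lsoa": "LSOA_A", "age": 67},
    {"referral_id": 2, "lsoa": "LSOA_B", "age": 54},
    {"referral_id": 3, "lsoa": "LSOA_Z", "age": 71},  # no match in either lookup
]
ahah_by_lsoa = {
    "LSOA_A": {"ahah_health": 12.4},
    "LSOA_B": {"ahah_health": 30.1},
}
iuc_by_lsoa = {
    "LSOA_A": {"iuc_group": "e-Rational Utilitarian"},
    "LSOA_B": {"iuc_group": "other group"},
}


def link_by_lsoa(records, *lookups):
    """Left-join each referral record to LSOA-keyed lookup tables."""
    linked = []
    for record in records:
        row = dict(record)  # copy so the input records are not modified
        for lookup in lookups:
            # Unmatched LSOAs simply gain no extra fields.
            row.update(lookup.get(record["lsoa"], {}))
        linked.append(row)
    return linked


linked = link_by_lsoa(referrals, ahah_by_lsoa, iuc_by_lsoa)
for row in linked:
    print(row)
```

A left join keeps every referral even when its LSOA has no matching attributes, so unmatched records can be counted and inspected rather than silently dropped.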
To better understand the multivariate characteristics of patients in the dataset, we used the K-prototype algorithm to gain further insight into cancer referral patterns. K-prototype partitions the data into distinct clusters, each of which represents the characteristics of the individuals in that cluster across a range of variables. The variables used to derive the cluster solution were: cancer type; age; Index of Multiple Deprivation (IMD) quintile; ethnicity; gender; internet usage; and the AHAH domains.
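To illustrate how a K-prototype style clustering handles mixed data, the sketch below combines squared Euclidean distance on numeric variables with a weighted count of mismatches on categorical variables. It is a simplified, assumption-laden illustration: the study's actual implementation, its initialisation and the weight `gamma` are not stated in the text, and the six rows (standardised age, IMD quintile, IUC group) are invented.

```python
import random


def dissimilarity(row, proto, num_idx, cat_idx, gamma):
    """Mixed distance: squared Euclidean on numeric columns plus
    gamma times the number of categorical mismatches."""
    numeric = sum((row[i] - proto[i]) ** 2 for i in num_idx)
    categorical = sum(row[i] != proto[i] for i in cat_idx)
    return numeric + gamma * categorical


def update_prototype(members, num_idx, cat_idx, width):
    """Prototype = mean of numeric columns, mode of categorical columns."""
    proto = [None] * width
    for i in num_idx:
        proto[i] = sum(m[i] for m in members) / len(members)
    for i in cat_idx:
        values = [m[i] for m in members]
        # Sorting the candidates makes tie-breaking deterministic.
        proto[i] = max(sorted(set(values)), key=values.count)
    return proto


def k_prototypes(rows, k, num_idx, cat_idx, gamma=1.0, n_iter=20, seed=0):
    rng = random.Random(seed)
    protos = [list(r) for r in rng.sample(rows, k)]  # random initial prototypes
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k),
                key=lambda c: dissimilarity(r, protos[c], num_idx, cat_idx, gamma))
            for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old prototype if a cluster empties
                protos[c] = update_prototype(members, num_idx, cat_idx, len(rows[0]))
    return labels, protos


# Invented rows: (standardised age, IMD quintile, IUC group).
rows = [
    (1.0, "IMD4", "e-Rational Utilitarian"),
    (1.1, "IMD4", "e-Rational Utilitarian"),
    (1.2, "IMD4", "e-Rational Utilitarian"),
    (-1.0, "IMD1", "other group"),
    (-1.1, "IMD1", "other group"),
    (-1.2, "IMD1", "other group"),
]
labels, protos = k_prototypes(rows, k=2, num_idx=[0], cat_idx=[1, 2])
print(labels)
```

Numeric variables are assumed to be standardised beforehand; otherwise a variable measured on a large scale, such as age in years, would dominate the categorical mismatch term.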
Figure 1 shows cancer referrals over time by sex. Of the referrals in the data, 67% are for female patients and 33% for male patients; the orange and blue lines show female and male referrals respectively. Before the pandemic, average monthly referrals were 888.35 (±105.83) for males and 1753.06 (±190.56) for females. During the pandemic, average monthly referrals increased for both males and females, to 960.3 and 1908 respectively. However, the standard deviation roughly doubled for both, to ±213 for males and ±375.51 for females, indicating the wide-ranging disruption brought by the pandemic. At the lowest point, in April 2020, there were 405 referrals for males and 812 for females.
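The comparison of means and standard deviations above can be reproduced with a few lines of code. The sketch below uses invented monthly counts, and it assumes the sample standard deviation; the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def summarise(monthly_counts):
    """Return (mean, sample standard deviation) of monthly referral counts."""
    return mean(monthly_counts), stdev(monthly_counts)


# Invented counts for illustration only (not the study data).
pre_pandemic = [900, 850, 950, 880]
during_pandemic = [405, 1100, 1200, 1000]

pre_mean, pre_sd = summarise(pre_pandemic)
during_mean, during_sd = summarise(during_pandemic)

print(f"pre-pandemic: {pre_mean:.2f} (±{pre_sd:.2f})")
print(f"during pandemic: {during_mean:.2f} (±{during_sd:.2f})")
print(f"SD ratio: {during_sd / pre_sd:.2f}")
```

As in the figures reported above, the mean can rise while the spread widens sharply: one very low month followed by a surge raises the standard deviation far more than it moves the average.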
Figure 2 focuses on the fluctuation of referrals during the various stages of the COVID-19 pandemic. During the first lockdown, referrals dropped dramatically, to a low of 56% below average monthly referrals. Referrals made a quick V-shaped recovery when the first lockdown was lifted and replaced with tiered restrictions. There was another decline in referrals during the second lockdown. During the third lockdown, however, there was a surge in cancer referrals, leading to a sharp recovery at that time.
Differences by ethnicity
The baseline analysis presented in Figure 3 compares monthly cancer referrals for different ethnic groups during the COVID-19 pandemic (January 2020 to November 2021) with a baseline value for each ethnic group. The baseline value is calculated as the average monthly referrals in the year prior to the pandemic, January 2019 to December 2019. Of the patients in the data, 79% are White British, 12% belong to one of 16 other ethnic groups and 8% are of unknown ethnicity. The magnitude of the decline in cancer referrals relative to the baseline varies by ethnic group. For example, in April 2020 referrals for the White British group fell to 56.24% below the baseline, while the Other White group saw a 63% drop and the Pakistani group a 55% drop. The speed of recovery to baseline levels following the first lockdown also varies: for example, the White British group returned to pre-pandemic levels of referrals in July 2021, whereas the Pakistani group did so in September 2021.
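The baseline comparison can be expressed as a percentage change from the mean monthly referrals in the baseline year. The sketch below is a minimal illustration with invented counts; the function names and the "YYYY-MM" month keys are assumptions, not part of the study's code.

```python
from statistics import mean


def percent_change_from_baseline(baseline_counts, pandemic_counts):
    """Percentage change of each pandemic month relative to the mean
    monthly count in the baseline year (negative = below baseline)."""
    baseline = mean(baseline_counts)
    return {
        month: 100.0 * (count - baseline) / baseline
        for month, count in pandemic_counts.items()
    }


def first_recovery_month(changes, after):
    """First month later than `after` at or above baseline, else None.
    Relies on 'YYYY-MM' keys sorting chronologically as strings."""
    for month in sorted(changes):
        if month > after and changes[month] >= 0:
            return month
    return None


# Invented counts for one hypothetical group (not the study data).
baseline_2019 = [100] * 12
pandemic = {"2020-03": 90, "2020-04": 44, "2020-07": 95, "2021-07": 102}

changes = percent_change_from_baseline(baseline_2019, pandemic)
print(changes["2020-04"])
print(first_recovery_month(changes, after="2020-04"))
```

Computing the baseline separately for each group, as the text describes, means that groups of very different sizes can be compared on the same percentage scale.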
Our analysis identified five distinct clusters of patient type, based on their demographic and geographic characteristics. One of the clusters has not recovered to its pre-pandemic level: as of November 2021, its referrals were about 22% below its baseline value. The characteristics of that cluster can be seen in Figure 4, where the radar plot compares the cluster characteristics with the average value across all patients in the dataset, and a value of 1 is equal to the average for that variable. This cluster represents patients who are less deprived than average: the proportion of patients in Index of Multiple Deprivation (IMD) category 4 is 2.5 times that of the general population, and the cluster has little representation in the most deprived IMD 1 and 2 groups. This cluster has more older patients and fewer younger patients. The cluster also has a higher proportion of patients who do not report their ethnicity. Across the AHAH variables, patients in this cluster have poor access to most of the Healthy Assets, apart from access to passive green spaces. In terms of access to the internet, this group is 5 times more likely to be e-Rational Utilitarian than the general patient population. The e-Rational Utilitarian group tends to comprise elderly and middle-aged people who are mainly rural and semi-rural dwellers. Patients in this cluster are more than 3 times as likely to be referred for suspected sarcoma and 2 times as likely to be referred for other kinds of suspected cancer. They are also more likely to be referred for suspected urology, Brain/CNS and HPB cancer types. Generally, this group is more likely to consist of affluent rural dwellers.
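One plausible way to obtain radar-plot values of this kind is to divide the share of a category within the cluster by its share among all patients, so that 1 means "same as average". The sketch below illustrates that calculation with invented data; whether the study computed its values exactly this way is an assumption.

```python
from collections import Counter


def representation_ratios(cluster_values, all_values):
    """Share of each category within the cluster divided by its share
    among all patients. A ratio of 1.0 means the cluster matches the
    overall average; 2.5 means 2.5 times the overall share."""
    cluster_counts = Counter(cluster_values)
    overall_counts = Counter(all_values)
    n_cluster, n_all = len(cluster_values), len(all_values)
    return {
        category: (cluster_counts[category] / n_cluster) / (count / n_all)
        for category, count in overall_counts.items()
    }


# Invented example: 20% of all patients are in IMD 4, but 50% of the cluster.
all_patients = ["IMD4"] * 20 + ["other"] * 80
cluster = ["IMD4"] * 5 + ["other"] * 5

ratios = representation_ratios(cluster, all_patients)
print(ratios)
```

Categories absent from the cluster receive a ratio of 0, which corresponds to the "little representation" reading described above for the most deprived groups.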
Value of the Research
The value of this study is that it provides richer and more nuanced insights into the characteristics of patients by assessing demographic and geographical variables in relation to the two-week cancer referral pathway. This research will be useful in providing clinicians with a broader, 360-degree view of their patients and of how demographic and geographic characteristics potentially affect their health outcomes.
Using cluster analysis, an unsupervised machine learning approach, we have been able to identify the characteristics of the group that was ‘most impacted’ by the various COVID-19 restrictions. This helps to identify the segments of patients who were most vulnerable and who may therefore require more attention to ensure faster cancer referrals and an improved prognosis.
“This project was a successful collaboration between a number of partners including the University of Leeds, Leeds Teaching Hospital Trust and DataCan (HDR UK). The outcomes of the project demonstrated how geodemographic data can be attributed to patients referred for cancer two-week waits. This work highlights new avenues of research across a multi-disciplinary team and presents opportunities for future collaborations in Health Data Research UK.”
Ifeanyi Chukwu – Data Scientist (LIDA)
Dr Zach Welshman, LIMR, University of Leeds
Monica Jones, Data-Can, University of Leeds
Dr Nik Lomax, Geography, University of Leeds
Prof Geoff Hall (MD), LIMR, University of Leeds
Leeds Teaching Hospitals NHS Trust
Health Data Research UK (HDR UK)
References
A. Carr, J. A. Smith, J. Camaradou, and D. Prieto-Alhambra, “Growing backlog of planned surgery due to covid-19,” 2021.
E. Mahase, “Covid-19: Urgent cancer referrals fall by 60%, showing ‘brutal’ impact of pandemic,” 2020.
A. G. Lai, L. Pasea, A. Banerjee, G. Hall, S. Denaxas, W. H. Chang, M. Katsoulis, B. Williams, D. Pillay, M. Noursadeghi, et al., “Estimated impact of the covid-19 pandemic on cancer services and excess 1-year mortality in people with cancer and multimorbidity: near real-time data on cancer care, cancer deaths and a population-based cohort study,” BMJ Open, vol. 10, no. 11, p. e043828, 2020.
G. Davies, R. Buckley, J. Fellows, D. Foster-Hall, S. Leveson, F. Lind, P. Wright-Anderson, and T. Phillips, “NHS backlogs and waiting times in England. Department of Health & Social Care, NHS England & NHS Improvement,” 2021.
K. Daras, A. Davies, M. Green, and A. Singleton, “Developing indicators for measuring health-related features of neighbourhoods,” Consumer Data Research, pp. 167–77, 2018.
M. A. Green, K. Daras, A. Davies, B. Barr, and A. Singleton, “Developing an openly accessible multi-dimensional small area index of ‘access to healthy assets and hazards’ for Great Britain, 2016,” Health & Place, vol. 54, pp. 11–19, 2018.
A. Alexiou and A. Singleton, “The 2018 internet user classification,” 2018.
A. Aprilliant, “The k-prototype as clustering algorithm for mixed data type (categorical and numerical),” Towards Data Science, https://towardsdatascience.com/the-k-prototype-as-clustering-algorithm-for-mixed-data-type-categorical-and-numerical-fe7c50538ebb, 2021. Accessed 28/04/2022.