Data quality and causal inference in Learning Health System: some challenges for developing reliable algorithms in an imperfect world

LIDA Seminar
Friday 1 February 2019, 3:30pm - 4:30pm
Worsley building, University of Leeds, Clarendon Way, Leeds, LS2 9NL

The Leeds Institute for Data Analytics is pleased to present the next seminar in our series showcasing data analytics.

The seminar will be held in 9.58b, Worsley Building.


Globally, health systems are increasingly rich in data but starved of skilled professionals. “Learning Health Systems” (LHS) promise rapid learning cycles to turn routine data into reliable algorithms, which should help fill gaps in human capacity, improving the safety, quality and efficiency of healthcare. However, LHS rely on clinical data being research-ready and bias-free, on inference methods that distinguish causation from association, and on effective, professionally acceptable methods to disseminate the intelligence, such as clinical decision support systems. This talk reviews two key challenges in the LHS cycle: the capture of high-quality health data [1-4] and causal inference from these data using propensity scoring, instrumental variable or regression discontinuity methods. It draws on two examples: estimating the impact on mortality of the drug ezetimibe from routine CPRD data [5] and using several causal inference methods to estimate the effectiveness of chemotherapy in 45,000 Scottish women with breast cancer [6]. I conclude that, before learning from data can become routine in health systems, we need more methodological research on tools to improve data quality and on the reliability of causal inference methods.

1. Tsopra R, Wyatt JC, Beirne P, Rodger K, Callister M, Ghosh D, Clifton IJ, Whitaker P, Peckham D. Level of accuracy of diagnoses recorded in discharge summaries: A cohort study in three respiratory wards. J Eval Clin Pract. 2019 Feb;25(1):36-43. doi: 10.1111/jep.13020. Epub 2018 Aug 14.
2. Tsopra R, Peckham D, Beirne P, Rodger K, Callister M, White H, Jais JP, Ghosh D, Whitaker P, Clifton IJ, Wyatt JC. The impact of three discharge coding methods on the accuracy of diagnostic coding and hospital reimbursement for inpatient medical care. Int J Med Inform. 2018 Jul;115:35-42. doi: 10.1016/j.ijmedinf.2018.03.015. Epub 2018 Mar 27.
3. Nathan PA, Johnson O, Clamp S, Wyatt JC. Time to rethink the capture and use of family history in primary care. Br J Gen Pract. 2016 Dec;66(653):627-628.
4. Mukherjee M, Wyatt JC, Simpson CR, Sheikh A. Usage of allergy codes in primary care electronic health records: a national evaluation in Scotland. Allergy. 2016 Nov;71(11):1594-1602. doi: 10.1111/all.12928. Epub 2016 Jun 3.
5. Pauriah M, Elder DH, Ogston S, Noman AY, Majeed A, Wyatt JC, Choy AM, Macdonald TM, Struthers AD, Lang CC. High-potency statin and ezetimibe use and mortality in survivors of an acute myocardial infarction: a population-based study. Heart. 2014 Jun;100(11):867-72. doi: 10.1136/heartjnl-2013-304678. Epub 2014 Feb 19.
6. Gray E, Marti J, Brewster DH, Wyatt JC, Piaget-Rossel R, Hall PS. Feasibility and results of four real-world evidence methods for estimating the effectiveness of adjuvant chemotherapy in early stage breast cancer. J Clin Epidemiol. 2019 (accepted).


About the speaker

This presentation will be delivered by Dr Jeremy Wyatt DM FRCP, ACMI Fellow; Professor of Digital Healthcare and Director, Wessex Institute of Health Research, University of Southampton; Clinical Advisor on New Technologies, Royal College of Physicians; member of the Devices Expert Advisory Committee, Medicines and Healthcare products Regulatory Agency; and former Leadership Chair in eHealth Research, University of Leeds.


To book your free place please email Hayley Irving with your name, occupation and faculty/organisation.