In May 2020, at the height of the COVID-19 pandemic, DATA-CAN, the HDRUK Hub for Cancer, joined forces with the PIONEER Hub for Acute Care to run the DeCOVID project. Set up by Wai Keong Wong at UCLH and Liz Sapey at Birmingham, the project involved creating a high-resolution COVID-19 data resource based on acute care. An alliance was built between UCLH and UHB, with plans to on-board other digitally mature trusts. LTHT, which can collect a comprehensive dataset without clinical staff having to complete case report forms (CRFs), was the first. Data were to flow via an Azure cloud at UHB to the secure research environment within the Turing, for use by data scientists across the country, in particular those from centres contributing data.
LIDA were asked to provide data science support to LTHT for this urgent requirement. Two PhD students, Tom Illet and Charlotte Sturley, were selected and started a 3-month placement with LTHT on 8 June 2020. Their PhDs were suspended for this period. They were supervised by Monica Jones, Chief Data Officer of DATA-CAN, and worked most closely with John Corkett of the Research Data and Informatics Team (R-DIT) at LTHT.
Their principal task was to construct a database to house electronic healthcare records (EHR) for secondary (research) use. This work would serve two purposes: first, as a place to prepare and quality-check the data required for the national DeCOVID project; and second, as an additional infrastructure resource for R-DIT and other LTHT-affiliated groups, making it easier to generate bespoke datasets as needs arise in the future.
The database closely follows the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) specification, as maintained by the Observational Health Data Sciences and Informatics (OHDSI) program, although the DeCOVID team made some relatively small alterations to it for their purposes. HL7 FHIR mappings were also provided to allow automation of the data feed and transfer to UHB. This work will be shared more broadly across the HDRUK community.
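To give a flavour of what a mapping between HL7 FHIR and the OMOP CDM can look like, here is a minimal sketch that converts a FHIR Patient resource into a row for the OMOP person table. It is illustrative only and is not the DeCOVID team's mapping: the direction of the mapping, the function name fhir_patient_to_omop_person, the example patient and the choice of fields are all assumptions. The only fixed points are the standard OMOP gender concept IDs (8507 for male, 8532 for female) and the standard person table column names.

```python
from datetime import date

# Standard OMOP concept IDs for gender; 0 is the OMOP convention
# for "no matching concept".
GENDER_CONCEPTS = {"male": 8507, "female": 8532}


def fhir_patient_to_omop_person(patient: dict, person_id: int) -> dict:
    """Map a FHIR Patient resource (as a dict) to an OMOP person row.

    Hypothetical helper: assumes birthDate is a full YYYY-MM-DD date,
    although FHIR also permits partial dates such as YYYY or YYYY-MM.
    """
    birth = date.fromisoformat(patient["birthDate"])
    gender = patient.get("gender", "unknown")
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        "year_of_birth": birth.year,
        "month_of_birth": birth.month,
        "day_of_birth": birth.day,
        # Not derivable from the core Patient resource, so left unmapped.
        "race_concept_id": 0,
        "ethnicity_concept_id": 0,
        # Keep the source identifiers so the row can be traced back.
        "person_source_value": patient.get("id"),
        "gender_source_value": gender,
    }


# Made-up example resource, for illustration only.
example_patient = {
    "resourceType": "Patient",
    "id": "example-1",
    "gender": "female",
    "birthDate": "1950-03-14",
}
row = fhir_patient_to_omop_person(example_patient, person_id=1)
```

A production mapping would also need to cover the other clinical tables, handle partial dates and missing values, and reflect the alterations the DeCOVID team made to the model.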
The database is located within the Yorkshire and Humber Care Record (YHCR) Data Ark, which is a collection of digital resources and services housed within the Google Cloud Platform (GCP). Configuring such a database at scale and in the cloud requires many steps to be followed carefully and precisely, so the students built a web application (named Ark) to manage this process and ensure consistency. This system has now been handed over to the team at LTHT and can be used for other projects as well. It was an excellent collaborative project that brought value to the NHS, the University of Leeds and all the individuals involved.