Data Linkage

Data linkage describes bringing related data together from different sources to create new, richer datasets.

What is data linkage?

Data linkage means bringing related data together from different sources to create a new, richer dataset. Being able to link datasets and analyse them together increases the depth of research possible and helps researchers understand more about the effects of different SARS-CoV-2 variants. To maintain patient privacy and ensure appropriate usage and attribution, data are stored securely and controlled access is enabled through approved pathways.
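As a rough illustration of the idea, the sketch below links two small sets of records on a shared pseudonymised identifier. Everything in it is an assumption made for illustration: the field names (pseudo_id, lineage, admitted), the helper link_records, and all values are invented, and it does not describe how COG-UK or its partners actually link or store data.

```python
# Illustrative only: every identifier, field name, and value here is invented.
sequencing = [
    {"pseudo_id": "P001", "lineage": "variant_A"},
    {"pseudo_id": "P002", "lineage": "variant_B"},
    {"pseudo_id": "P003", "lineage": "variant_A"},
]
outcomes = [
    {"pseudo_id": "P001", "admitted": True},
    {"pseudo_id": "P003", "admitted": False},
    {"pseudo_id": "P004", "admitted": True},
]


def link_records(left, right, key):
    """Join two lists of dicts on a shared key, keeping only matched rows."""
    # Index the right-hand records by key for constant-time lookup.
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the fields of both records into one richer record.
            linked.append({**row, **match})
    return linked


linked = link_records(sequencing, outcomes, "pseudo_id")
for row in linked:
    print(row)
```

Only records present in both sources appear in the result, so each linked row carries both the viral lineage and the outcome for the same pseudonymised individual. Real linkage typically also has to handle missing or inconsistent identifiers, which this sketch ignores.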

Which data are involved?

SARS-CoV-2 sequencing data are being linked to several other datasets, including:

  • Host sequencing (GenOMICC and UK research cohorts)
  • Clinical severity (ISARIC 4C, ICNARC CMP)
  • Hospitalisation outcomes (PHOSP)
  • Acute care admissions data (England and Wales)
  • Vaccination data
  • Infection control in hospitals (HOCI data)

Who are COG-UK’s partners?

COG-UK is part of the UK Health Data Research Alliance, joining over 50 alliance members in creating a unified approach to the use of health data across the UK. Researchers will be able to request secure access to COG-UK datasets via the HDR Innovation Gateway.

The Outbreak Data Analysis Partnership (ODAP) is a UK-wide research collaboration. It aims to answer high-priority research questions relating to infectious disease outbreaks and to link viral sequencing with complex datasets such as host sequences, clinical severity and outcome data, acute care admissions data, vaccination data and HOCI data. COG-UK director Dr Ewan Harrison is co-director of ODAP, ensuring ongoing integration with COG-UK.

Learn more about HDR UK in our blog.


Read about the role of data linkage during the pandemic in our blog.

Watch COG-UK deputy director Dr Ewan Harrison talk about plans for data linkage at the COG-UK Together event in October 2021: