8 Jul 2021

Why COG-UK is joining the UK Health Data Research Alliance

By Sharon Peacock

Earlier this month, the COVID-19 Genomics UK (COG-UK) consortium joined the UK Health Data Research Alliance. By becoming a member of this growing partnership, COG-UK will achieve three important objectives: we will work with Health Data Research UK (HDR UK) and over 50 “Alliance” members, creating a unified approach to the use of health data across the UK; we will further support the UK’s response to COVID-19 by more widely sharing our expertise in genome sequencing; and we will be able to make our datasets securely available to researchers, who can request access via the HDR Innovation Gateway.

Given that COG-UK was established in response to the COVID-19 pandemic in March 2020, and that the Alliance itself is only two years old, this is a great example of the incredible levels of collaboration that are so rapidly taking place across our sector.

And one of the most important lessons of the last 18 months is that collaboration is key. The incredible impact of health research in supporting the response to the pandemic has not been achieved by any one organisation or in silos. It has been a genuine success story for “Team Science”. By joining the Alliance, we’re committed to maintaining that approach.

The UK Health Data Research Alliance – collaboration in action

In its role as the national institute for health data science, HDR UK established the Alliance to be a driving force of this collaborative approach.

Bringing together multiple health research organisations, all of which are using data at scale, to share learnings and best practice makes a huge amount of sense. Ultimately, we all have the same motivation – to use the UK’s data assets in a trustworthy and secure way to develop discoveries that will benefit patients and save lives.

So, COG-UK is looking forward to working with other members to develop and coordinate the adoption of tools, techniques and technologies across five priority areas:

  • Promoting participation across the sector and improving access to health data
  • Aligning data standards and quality across the UK
  • Agreeing approaches to Trusted Research Environments (TREs)
  • Engaging practitioners, patients, and the public
  • Supporting the development of the HDR Innovation Gateway (“The Gateway”) as the platform for researchers to request access to datasets

Sitting at the interface of public health action and academic research, COG-UK is committed to open science and sharing all data that we can as rapidly as possible, which aligns perfectly with the principles of open science encouraged by the UK Health Data Research Alliance.

Data linkage – a key component to advance research and understanding of COVID-19

Access to large-scale standalone datasets is of course important; but in some cases, this only tells one part of the story. The ability to link these datasets and perform analyses on them allows researchers to start piecing puzzles together – it increases the depth of research possible and, with genomic sequencing, allows researchers to understand more about the effects of different SARS-CoV-2 variants.

Since May, we have been working with HDR UK, the four UK Public Health Agencies and other data custodians (“The Research Data Linkage Group”) to progress data linkages between important COVID-19-specific datasets and other routine health and administrative data. Viral genomic data from COG-UK has been identified as priority data for linkage due to its value in understanding COVID-19, and therefore its potential to inform policy makers.

Linking viral genomic data with other routine data can help researchers answer the following critical questions:

  1. How do SARS-CoV-2 mutations affect the severity and transmission of the disease, and its outcomes, including the risk of “Long COVID”?
  2. Is there any interaction between viral mutations and human genomics that influences severity and outcomes?
  3. How do different treatments work on different SARS-CoV-2 variants?
  4. How do different variants spread in different groups of people?

Working together, we have been able to securely transfer viral genomic data into TREs around the UK, where data can be stored safely and accessed by approved researchers. Once in these TREs, the data can then be linked to other routine health and administrative data, such as data from hospital care and demographic data (for example, ethnicity), to answer these questions.
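To give a flavour of what linkage means in practice, here is a minimal sketch in Python. It is an illustration only and not a description of how any TRE actually performs linkage: it assumes that both datasets carry a shared pseudonymised identifier (called `pseudo_id` here), and every field name, function name and record is hypothetical and synthetic.

```python
from collections import defaultdict

# Synthetic, made-up records. "pseudo_id" is a hypothetical shared
# pseudonymised identifier; no real patient data or schema is implied.
genomic = [
    {"pseudo_id": "a1", "lineage": "B.1.1.7"},
    {"pseudo_id": "b2", "lineage": "B.1.617.2"},
    {"pseudo_id": "c3", "lineage": "B.1.1.7"},  # no matching routine record
]
routine = [
    {"pseudo_id": "a1", "admitted": True},
    {"pseudo_id": "b2", "admitted": False},
]


def link_records(genomic_rows, routine_rows, key="pseudo_id"):
    """Inner join: keep only genomic rows that have a routine record with the same key."""
    routine_by_key = {row[key]: row for row in routine_rows}
    linked = []
    for g in genomic_rows:
        match = routine_by_key.get(g[key])
        if match is not None:
            # Merge the two records into one linked row.
            linked.append({**match, **g})
    return linked


def count_by(rows, *fields):
    """Count linked rows for each combination of the given fields."""
    counts = defaultdict(int)
    for row in rows:
        counts[tuple(row[f] for f in fields)] += 1
    return dict(counts)


linked = link_records(genomic, routine)
summary = count_by(linked, "lineage", "admitted")
```

Once records are linked in this way, a question such as "how do outcomes differ by variant?" becomes a matter of grouping the linked rows, which is what `count_by` does here on a toy scale.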

Improving coverage and access – a four nations approach

One of the distinguishing factors of HDR UK’s approach is that it works across all four nations of the UK, ensuring that our research and insights are not viewed in isolation.

Data about the different variants of SARS-CoV-2 found in English patients now flows into the Secure Research Service at the Office for National Statistics (ONS).

Data containing the full viral sequence of UK SARS-CoV-2 variants from English and Scottish patients will imminently flow into the Edinburgh Parallel Computing Centre (EPCC) at the University of Edinburgh. Also held within the EPCC is data from the ISARIC Coronavirus Clinical Characterisation Consortium (ISARIC 4C), which looks at what happens to patients with COVID-19 in hospital. Linking the ISARIC 4C data with the fully sequenced viral genomic data is crucial to understanding more about the effects of the different SARS-CoV-2 variants.

Viral genomic data from Welsh patients is securely stored in the SAIL Databank, which is managed by Swansea University, and viral genomic data from Northern Irish patients is soon expected to flow into the Northern Ireland Honest Broker Service TRE.

As part of the Data and Connectivity National Core Study, researchers can request access to these anonymous and public datasets on UK SARS-CoV-2 viral genomic data, available via the Health Data Research Innovation Gateway.

Looking ahead, the ongoing generation of SARS-CoV-2 sequence data across the UK will continue to inform the response to this pandemic. Working with HDR UK will allow us to make more viral genomic data available to researchers, so that maximum benefit to public health can be derived from this important resource.

Request access to COG-UK datasets via the HDR Innovation Gateway.


More information

COVID-19 Genomics UK consortium


Watch Sharon Peacock discuss COG-UK at the HDR Scientific Conference

COVID-19 Genomics UK (COG-UK)

The COVID-19 Genomics UK (COG-UK) consortium works in partnership to harness the power of SARS-CoV-2 genomics in the fight against COVID-19.

Led by Professor Sharon Peacock of the University of Cambridge, COG-UK is an innovative collaboration of NHS organisations, the four public health agencies of the UK, the Wellcome Sanger Institute and sixteen academic partners. A full list of collaborators can be found here.

The COVID-19 pandemic, caused by SARS-CoV-2, represents a major threat to health. The COG-UK consortium was formed in March 2020 to deliver SARS-CoV-2 genome sequencing and analysis to inform public health policy and to support the establishment of a national pathogen sequencing service, with sequence data now predominantly generated by the Wellcome Sanger Institute and the Public Health Agencies.

SARS-CoV-2 genome sequencing and analysis plays a key role in the COVID-19 public health response by enabling the identification, tracking and analysis of variants of concern, and by informing the design of vaccines and therapeutics. COG-UK works collaboratively to deliver world-class research on pathogen sequencing and analysis, maximise the value of genomic data by ensuring fair access and data linkage, and provide a training programme to enable equity in global sequencing.