Report by the COVID-19 Genomics UK (COG-UK) Consortium
Report 11: 8th September 2020 – COVID-19 Genomics UK (COG-UK) Consortium
Please Note: This report is provided at the request of SAGE and includes information on the ongoing state of the research being carried out. It should not be considered formal or informal advice. The conclusions of the ongoing scientific studies may be subject to change as further evidence becomes available and as such any firm conclusions would be premature.
- COG-UK researchers have collaborated with public health agencies (PHAs) to use rapid genome sequencing to understand the dynamics underpinning a growing number of local SARS-CoV-2 outbreaks across the UK during the summer.
- Rapid genome sequencing coupled with integration of epidemiological data has enabled the identification of transmission points and informed intervention measures during outbreaks among highly vulnerable patients in renal dialysis units (RDUs) in Scotland and the East of England.
- In addition to existing COG-UK tools and pipelines, a newly developed sequence reporting tool that has been applied to the RDU outbreaks will provide simple statistical support to public health workers and clinicians seeking to understand whether infections seen at a local level represent transmissions within a given healthcare setting or transmission from the surrounding community.
COG-UK genomic surveillance in action
The sequencing pipelines, genomic data and tools created during the six months since the establishment of COG-UK provide the foundations for the consortium to move into a new phase. This phase continues longer-term scientific research and development activities, but increasingly also supports the operational genomic surveillance of local SARS-CoV-2 outbreaks and provides actionable feedback to infection management teams handling live outbreaks.
Accordingly, PHAs have been working with COG-UK researchers to apply the national dataset to the investigation of situations and outbreaks and to identify the needs for analytic tool development. Further work is needed to strengthen and expand this collaboration, and to ensure that data and metadata are available at a pace which can provide relevant insights to inform the local and national response.
The COG-UK phylogenetics pipeline based on the CLIMB infrastructure provides tools for public health workers to rapidly make assessments of potential outbreaks based on all COG-UK genome sequences and associated metadata. The Civet tool (see ‘Summary of major tools and pipelines developed by COG-UK’ COG-UK report #10 – 11th August) provides clear analytical reports that can put putative outbreaks in context, find links between clusters, and allow secure integration of private data about individual cases. An example of an analytic report from Civet is included in the Appendix.
For hospital-linked outbreaks, the ‘Sequence Reporting Tool’ (SRT; see ‘Development of a Sequence Reporting Tool’ COG-UK report #10 – 11th August) uses a probability model to integrate viral genome sequences and patient metadata to provide a simple statistical metric that indicates how likely it is that a particular infection was due to transmission within a ward, in the wider hospital, or in the community. The SRT has been applied for the first time to an analysis of potential outbreaks in RDUs in Scotland (see ‘Using genomic epidemiology to tackle SARS-CoV-2 outbreaks in Renal Dialysis Units in Scotland and the East of England’, below).
Public Health Wales (PHW) has integrated the genomic data and analysis generated for Welsh sequences into outbreak responses, as a requestable service provided to epidemiologists within the NHS in Wales. In North Wales, genomic data is integrated into a real-time surveillance system which enables the analysis and comparison of cases in the community and hospitals. This tool has been used to examine hospital outbreaks, and to define the extent of outbreaks in hospitals in Wales (see ‘Integration of genomics and clinical metadata in real time to support outbreak management and response’ COG-UK report #9 – 25th June). More generally, within PHW, the outputs of COG-UK are fed to a network of Healthcare Epidemiologists who make use of a combination of MicroReact visualisations (running on a MicroReact instance within PHW), lineage assignments/phylotypes and trees to support local outbreak investigations. This enables the rapid/real time use of genomics as soon as an outbreak is identified. Where more substantial outbreaks are concerned, detailed analysis and interpretation is performed by the Pathogen Genomics team within PHW, with genomics reports being fed back to the relevant outbreak management team in real time.
In addition to the use of data within PHW to identify outbreaks, PHW also generates four indicators from genomic data in close-to-real-time which feed into the Welsh Government COVID-19 circuit breakers and early warning indicators (https://gov.wales/covid-19-circuit-breakers-and-early-warning-indicators).
Using genomic epidemiology to tackle SARS-CoV-2 outbreaks in Renal Dialysis Units in Scotland and the East of England.
Emma Thomson1,3, Patrick Mark2,3, Kathy Li1, Joseph Hughes1, Ben Warne4,5, Will Hamilton4,5, Ian Goodfellow6, Estée Török4,5
1. MRC-University of Glasgow Centre for Virus Research
2. Institute of Cardiovascular and Medical Sciences, University of Glasgow
3. NHS Greater Glasgow & Clyde
4. University of Cambridge Department of Medicine
5. Cambridge University Hospitals NHS Foundation Trust
6. University of Cambridge Division of Virology
Renal dialysis patients are among the most vulnerable to COVID-19 (with up to 30% mortality; https://renal.org/covid-19/data/ and Table 1, below). Renal dialysis units (RDUs) may provide a conduit between hospitals and the community environment owing to prolonged episodes of outpatient medical care. Most RDUs consist of large open rooms with no barriers separating patients, thus providing challenges for infection control. Furthermore, shared hospital transport and admission into other hospital units for co-morbid conditions may increase the risk of acquisition of infection. Studies in two cities (Glasgow and Cambridge) set out to determine how genomic epidemiology can be used to identify the likely source of SARS-CoV-2 infections in RDUs, understand the dynamics of transmission in potential outbreaks, assess the clinical impact on patients and inform infection control measures to protect this highly vulnerable group of patients.
Electronic patient records were used to identify anonymised clinical data for RDU patients with a positive diagnostic test for SARS-CoV-2 infection. SARS-CoV-2 genomes were sequenced for each infection according to COG-UK protocols and sequence data were uploaded to CLIMB and GISAID. Phylogenetic analysis of the whole genome sequences was performed using established approaches (see ‘Summary of major tools and pipelines developed by COG-UK’ COG-UK report #10 – 11th August). In the Scotland study, a novel ‘Sequence Reporting Tool’ was also used for estimating the probability that an infection was acquired in the community or in a healthcare setting (either hospital or specific RDU ward; see ‘Development of a Sequence Reporting Tool’, COG-UK report #10 – 11th August). Comparisons were made between patients who survived and patients who died following infection.
Data from both cities are presented separately below, although the patterns observed and implications for infection control were similar across both sites.
Cambridge RDU outbreak
During a wider prospective genomic surveillance study of SARS-CoV-2 infections in patients at Cambridge University Hospitals NHS Foundation Trust (Ref 1), six patients with end-stage renal failure were diagnosed with SARS-CoV-2 between the 1st and 20th April 2020. These six patients were spread across several locations, including the emergency departments and an acute admissions ward. The patients’ diagnostic samples were tested at the PHE Clinical Microbiology and Public Health Laboratory and underwent nanopore sequencing in the Department of Virology, University of Cambridge. The SARS-CoV-2 genomes were found to cluster together on a phylogenetic tree (Figure 1) and were identical at the genomic level, with zero SNP differences between them, indicating that these six patients may have shared a common source of infection.
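The "zero SNP differences" comparison can be illustrated with a minimal sketch. It assumes the genomes have already been aligned to a common reference so that positions correspond, and it ignores ambiguous or uncalled bases; the function names and the toy sequences are hypothetical and are not part of the COG-UK pipelines.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both genomes have a called base (A/C/G/T) that differs."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    called = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in called and b in called and a != b
    )


def pairwise_snp_distances(genomes):
    """Return {(id_a, id_b): distance} for every unordered pair of genomes."""
    return {
        (id_a, id_b): snp_distance(genomes[id_a], genomes[id_b])
        for id_a, id_b in combinations(sorted(genomes), 2)
    }


# Toy aligned fragments (hypothetical, not real patient data).
toy_genomes = {
    "patient_A": "ACGTACGTAC",
    "patient_B": "ACGTACGTAC",  # identical to A
    "patient_C": "ACGTNCGTAC",  # N is an uncalled base and is ignored
    "patient_D": "ACGAACGTAC",  # one SNP relative to A
}

distances = pairwise_snp_distances(toy_genomes)
for pair, dist in distances.items():
    print(pair, dist)
```

A distance of zero between all pairs, as reported for the six Cambridge genomes, is consistent with a shared source but does not establish one by itself, which is why the epidemiological investigation was also needed.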
Epidemiological investigation revealed that all six patients dialysed at the same outpatient RDU on the same days of the week (Figure 2), suggesting linked recent transmission of community-onset, healthcare-associated infections.
Figure 2: Cambridge University Hospitals NHS Foundation Trust RDU outbreak timeline. Sample dates are shown in yellow circles. Six patients with end-stage renal failure were diagnosed with COVID-19 between 1 and 20 April 2020 (yellow circle with a letter indicating the cluster number). Yellow circles without a letter indicate patients diagnosed with COVID-19 found not to be related to the dialysis unit clusters. Black triangles indicate patient deaths. The darker green blocks represent the dialysis unit with suspected transmission; the light green and grey blocks represent different dialysis units. The renal ward is shown in blue and the emergency department in red. Other wards are shown in grey.
These findings led to a review of infection control procedures in the dialysis patients and identified shared patient transportation and neighbouring dialysis chairs as risk factors for transmission. Interventions included introduction of universal mask use for patients and staff (in particular during patient transportation), closure of the waiting room area, and improved social distancing measures.
Genomics was also found to be useful for “ruling out” linked transmission to other renal patients. For example, the renal ward (which shares patients with the outpatient dialysis unit) also had a group of COVID-19 cases at around the same time. However, the dialysis unit genomes belonged to lineage B.2 (relatively rare in the East of England), whereas the renal ward genomes were the common B.1 lineage, making it very unlikely that infections between the two patient groups were related.
Scotland RDU outbreak
Rapid whole-genome sequencing of SARS-CoV-2 generated by COG-UK was used to improve the understanding of transmission risks in high-risk renal haemodialysis cohorts at the six Scottish RDUs (Ref 2). Epidemiological, geographical, temporal and genetic sequence data from the community and hospital setting were analysed to estimate the probability of infection originating from within the dialysis unit, the wider hospital or the local community using Bayesian statistical modelling.
Of 671 patients, 60 (8.9%) became infected with SARS-CoV-2, of whom 16 (27%) died. 44 of the 60 patients were diagnosed as being infected with SARS-CoV-2 within a 14-day window of infection of another patient in the cohort attending the same dialysis shift/site, suggesting that within-RDU transmission was potentially occurring while patients attended for dialysis, in addition to likely community spread outside of the RDU. Eight patients who were diagnosed with COVID-19 shared transport with another infected patient during the same 14-day period of likely infectivity.
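The 14-day window criterion can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual method: the record fields (`site`, `shift`, `diagnosed`), the function name and the four example cases are hypothetical, and the window is applied to diagnosis dates. The final lines simply recompute the proportions from the counts reported above.

```python
from datetime import date


def epidemiologically_linked(cases, window_days=14):
    """Return IDs of cases diagnosed within `window_days` of another case
    attending the same dialysis site and shift."""
    linked = set()
    for i, a in enumerate(cases):
        for b in cases[i + 1:]:
            same_session = a["site"] == b["site"] and a["shift"] == b["shift"]
            gap_days = abs((a["diagnosed"] - b["diagnosed"]).days)
            if same_session and gap_days <= window_days:
                linked.update({a["id"], b["id"]})
    return linked


# Hypothetical cases, for illustration only.
cases = [
    {"id": "P1", "site": "RDU-X", "shift": "MWF-am", "diagnosed": date(2020, 4, 1)},
    {"id": "P2", "site": "RDU-X", "shift": "MWF-am", "diagnosed": date(2020, 4, 10)},
    {"id": "P3", "site": "RDU-X", "shift": "TTS-pm", "diagnosed": date(2020, 4, 5)},
    {"id": "P4", "site": "RDU-X", "shift": "MWF-am", "diagnosed": date(2020, 5, 20)},
]
linked = epidemiologically_linked(cases)
print(sorted(linked))

# Proportions recomputed from the counts reported in the text.
total_patients, infected, died, within_window = 671, 60, 16, 44
attack_rate = round(100 * infected / total_patients, 1)
fatality_among_infected = round(100 * died / infected, 1)
share_within_window = round(100 * within_window / infected, 1)
print(attack_rate, fatality_among_infected, share_within_window)
```

A shared session within the window marks a case as a candidate for within-unit transmission only; as the following paragraphs show, genome data are then needed to separate true linked transmission from coincident community infections.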
39 SARS-CoV-2 genome sequences from this cohort (plus one from a healthcare worker) were of sufficient quality for further analysis. 13 different UK SARS-CoV-2 lineages were detected. A number of patients were infected with near identical sequences from the same lineage, suggestive of linked transmission, while others did not cluster phylogenetically, suggesting community transmission. Epidemiological investigation identified clusters of SARS-CoV-2 positive patients as sharing dialysis sessions and sometimes transport. Analysis of the phylogenetic and epidemiological data together using the novel ‘sequence reporting tool’ demonstrated that, of the six RDUs, five appear convincingly to have had unit-linked transmission events (Figure 3).
Figure 3: Timeline of detection of first SARS-CoV-2 positive results in Scottish haemodialysis patients in RDUs with details of dialysis sessions and shared patient transport in relation to the UK cluster. Circled numbers in the phylogenetic tree represent the number of identical sequences from Scotland for the given node on the phylogeny. The numerical suffixes of the CVR identifier indicate the posterior probability (as a percentage) of the patient acquiring SARS-CoV-2 from the RDU or from another healthcare-related infection (i.e., hospital where dialysis takes place and ward and/or hospital they have been admitted to), respectively. The scale bar indicates substitutions per nucleotide site.
In RDU1, samples from seven patients and one healthcare worker fell within the UK40 lineage, with four sequences being identical to each other, which was initially suggestive of within-RDU transmission. However, the lack of definitive epidemiological exposure and the similarity to community sequences reduced confidence that the infections took place in the RDU, with probabilities ranging from 0.53 to 0.68 (Figure 3). Early institution of masks during travel and dialysis for all patients in RDU1 was associated with a cessation of transmission, with the final case occurring nine days after enhanced PPE implementation.
RDU2 showed evidence of five SARS-CoV-2 introductions from the community resulting in the spread of two lineages within the RDU (or during transport to the unit) with 100% statistical probability. For one of the lineages, a patient with 100% probability of RDU transmission had separate dialysis sessions to the other two patients, suggesting an unknown transmission route (perhaps via an untested staff member).
RDUs 3-5 showed a mixture of community and RDU transmission events, albeit often via an unknown transmission route in the latter case. By contrast, 9 of the 16 patients from RDU6 for whom a sequence was obtained carried the UK249 lineage, and sequences were identical among patients who shared dialysis sessions. Statistical probabilities of within-unit transmission for infections in this RDU ranged from 0.85 to 0.97.
RDUs are high-risk environments for transmission of respiratory infections, with a high consequent risk of mortality (Table 1). Integration of rapid genome sequencing and epidemiology can identify multiple points at which transmission can occur, e.g. on the ward and on shared hospital transport. It can also be used to identify gaps in understanding of the transmission routes affecting an RDU. Interventions based on these insights can lead to cessation of transmission and should include reduction of risk in the ward setting, on hospital transport and within the community.
Table 1: COVID-19 incidence and mortality in RDU patients. UK Renal Registry; ERA-EDTA registry; Xiong et al JASN 2020; Alberici et al Kidney Int 2020; Valeri et al, JASN 2020; Fisher et al, Kidney360, 2020; Corbett et al, JASN 2020; Bell et al, MedRxiv 2020; Tortonese et al, KI Reports, 2020; Roper et al KI Reports 2020
Full viral genome sequences alone are not always sufficient to determine the source of a transmission, which will also depend on the frequency of different variants circulating in the community and in the hospital. As such, the incorporation of additional epidemiological information (such as dialysis session, shared transportation, information on negative screen and geographical data on cases in the community) can help to identify the source of transmissions in RDUs with confidence.
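The dependence on circulating variant frequencies can be illustrated with a toy Bayesian calculation. This is not the Sequence Reporting Tool's model, which is described in COG-UK report #10; it is a minimal sketch in which the prior probabilities, the lineage frequencies and the function name are all hypothetical.

```python
def posterior_source(prior, lineage_freq):
    """Posterior probability of each candidate source given the case's lineage.

    prior: P(source) from epidemiological information alone.
    lineage_freq: P(observed lineage | source), i.e. the share of recent
        infections in each setting that carry the case's lineage.
    """
    unnormalised = {s: prior[s] * lineage_freq[s] for s in prior}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("observed lineage has zero probability under every source")
    return {s: value / total for s, value in unnormalised.items()}


# Hypothetical prior from dialysis sessions, shared transport, etc.
prior = {"unit": 0.4, "hospital": 0.2, "community": 0.4}

# Case 1: the lineage is common in the unit but rare elsewhere.
rare_elsewhere = posterior_source(
    prior, {"unit": 0.8, "hospital": 0.1, "community": 0.02}
)

# Case 2: the same lineage is also common in the hospital and community.
common_everywhere = posterior_source(
    prior, {"unit": 0.8, "hospital": 0.6, "community": 0.7}
)

print(rare_elsewhere)
print(common_everywhere)
```

With identical sequence evidence inside the unit, the posterior for within-unit transmission is high when the lineage is rare in the community and much weaker when it is common everywhere, which mirrors the contrast between RDU6 and RDU1 described above.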
Furthermore, RDUs are representative of any hospital ward or clinic that patients must continue to attend for a life-sustaining treatment, irrespective of quarantine or lockdown. The approaches developed for the analysis of RDUs in these studies, which were possible only because stopping dialysis would have led to the patients dying, could therefore be applied to monitoring and preventing SARS-CoV-2 transmission during chemotherapy and other life-sustaining treatments requiring hospital attendance.
Finally, while the approach used for integrating genomic and epidemiological data provides actionable information for investigating and tackling COVID-19 outbreaks, the SARS-CoV-2 virus has a relatively low evolutionary rate (approx. 0.8 × 10⁻³ substitutions/site/year). The genomic epidemiology analyses undertaken here would therefore be even more powerful for a virus with a higher evolutionary rate, such as influenza virus.
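A rough worked calculation shows what this rate implies for outbreak resolution. The rate is taken from the text; the genome length (29,903 nucleotides, the Wuhan-Hu-1 reference), the 5-day interval between linked infections and the Poisson model of substitutions are assumptions introduced here for illustration only.

```python
import math

RATE_PER_SITE_PER_YEAR = 0.8e-3  # from the text
GENOME_LENGTH = 29_903           # assumption: Wuhan-Hu-1 reference length in nucleotides
INTERVAL_DAYS = 5                # assumption: illustrative gap between linked infections

# Expected substitutions across the whole genome per year (roughly 24).
subs_per_year = RATE_PER_SITE_PER_YEAR * GENOME_LENGTH

# Average waiting time between substitutions along a lineage (roughly 15 days).
days_per_substitution = 365 / subs_per_year

# Assuming substitutions follow a Poisson process, the chance that no
# substitution accumulates over one interval (about 0.72).
expected_subs_in_interval = subs_per_year * INTERVAL_DAYS / 365
prob_identical = math.exp(-expected_subs_in_interval)

print(f"{subs_per_year:.1f} substitutions per genome per year")
print(f"one substitution every {days_per_substitution:.1f} days on average")
print(f"P(no substitution in {INTERVAL_DAYS} days) = {prob_identical:.2f}")
```

Under these assumptions, consecutive cases in a transmission chain will often carry identical genomes, so sequence data can exclude links more readily than they can order individual transmission events; a faster-evolving virus would give finer resolution.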
1. Meredith LW, Hamilton WL, Warne B, Houldcroft CJ, Hosmillo M, Jahun AS, Curran MD, Parmar S, Caller LG, Caddy SL, Khokhar FA, Yakovleva A, Hall G, Feltwell T, Forrest S, Sridhar S, Weekes MP, Baker S, Brown N, Moore E, Popay A, Roddick I, Reacher M, Gouliouris T, Peacock SJ, Dougan G, Török ME, Goodfellow I. Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: a prospective genomic surveillance study. Lancet Infect Dis. 2020 Jul 14: S1473-3099(20)30562-4. doi: 10.1016/S1473-3099(20)30562-4. PMID: 32679081.
2. Kathy K Li, Y. Mun Woo, Antonia Ho, Joseph Hughes, Oliver Stirrup, Alison H.M. Taylor, Zoe Cousland, Jonathan Price, Jennifer S. Lees, Timothy Jones, Carlo Varon Lopez, Scott T.W. Morris, Peter C. Thomson, Colin C Geddes, Jamie P. Traynor, Emma C. Thomson, Patrick B. Mark. Genetic epidemiology of COVID-19 infection in patients requiring haemodialysis renal replacement therapy in Scotland. Manuscript in progress. 2020.
Example analytical report from Civet
Tree 1 – 17 sequences of interest
Figure 2 – Nucleotide variation in sequences of interest
Figure 3 – The relative proportion of assigned UK-Lineages for samples collected and sequenced within the central healthboard region for the defined time-frame.
Figure 4 – The relative proportions of assigned UK-Lineages for samples collected from the focal, and neighbouring healthboard regions for the defined time-frame with the regional context. Plot-size demonstrates relative numbers of sequences across given NHS healthboards.