Since its creation, COG-UK has supported genomic surveillance efforts to identify variants of the SARS-CoV-2 virus in genome sequencing data from the UK.
New SARS-CoV-2 variant
The variant described today in the House of Commons contains a novel set of mutations associated with a lineage spreading rapidly in the South East of England (and more widely) that is the subject of ongoing investigations by the UK Public Health Agencies, coordinated by Public Health England and supported by COG-UK. This variant carries a set of mutations including an N501Y mutation in the receptor binding motif of the Spike protein that the virus uses to bind to the human ACE2 receptor.
Efforts are under way to confirm whether or not any of these mutations are contributing to increased transmission. There is currently no evidence that this variant (or any other studied to date) has any impact on disease severity, or that it will render vaccines less effective, although both questions require further studies performed at pace. We will provide further updates as our investigations proceed.
How COG-UK is tracking emerging SARS-CoV-2 gene mutations
Explanation of terms:
- Mutation is used to describe a replacement of a base in the genome, or a deletion or insertion event.
- Viral variant refers to a distinct virus, which may have a combination of different mutations.
Mutations arise naturally in the SARS-CoV-2 genome as the virus replicates and circulates in the human population. These accumulate at a rate of around one to two mutations per month in the global phylogeny. As a result of this ongoing process, many thousands of mutations have already arisen in the SARS-CoV-2 genome since the virus emerged in 2019. As mutations continue to arise, novel combinations are increasingly observed.
The vast majority of the mutations observed in SARS-CoV-2 have no apparent effect on the virus and only a very small minority are likely to be important and change the virus in any appreciable way (for example, a change in the ability to infect people; cause disease of different severity; or become insensitive to the effect of the human immune response including the response generated by a vaccine).
It is difficult to predict whether any given mutation is important when it first emerges, against a backdrop of the continuous emergence of new mutations. Understanding their significance may be possible based on experimental work that shows a link between the mutation and a subtle change in virus biology. However, it would take considerable time and effort to test the effect of many thousands of combinations of mutations. The biggest concern is any change that leads to an increase in reinfections or vaccine failure (signalling that the virus may be evading the immune protection elicited by previous infection or vaccination).
Most attention is on mutations in the gene that encodes the Spike protein, which is associated with viral entry into cells. Around 4000 mutations have been observed in the Spike protein gene to date (note that more than one distinct mutation can occur at the same position in the genome, which is why the number of mutations is greater than the actual number of bases in the Spike protein gene). A small number of these mutations are in a region referred to as the receptor binding motif (RBM) of the Spike protein, which is responsible for viral entry via its interaction with the receptor (hACE2) on host cells.
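To illustrate why the count of distinct mutations can exceed the number of bases, the following minimal sketch enumerates every possible single-base substitution in a short sequence. The sequence and the function name are hypothetical and chosen for illustration only, and the sketch ignores insertions and deletions, which would add further possibilities.

```python
from typing import List, Tuple

BASES = "ACGT"


def possible_substitutions(reference: str) -> List[Tuple[int, str, str]]:
    """List every single-base substitution as (1-based position, reference base, new base)."""
    substitutions = []
    for position, ref_base in enumerate(reference, start=1):
        for alt_base in BASES:
            if alt_base != ref_base:
                substitutions.append((position, ref_base, alt_base))
    return substitutions


# Hypothetical 9-base stretch used purely as a toy example.
toy_reference = "ATGTTTGTT"
subs = possible_substitutions(toy_reference)

# Each position can change to any of the three other bases.
print(len(toy_reference), len(subs))  # 9 27
```

Because each position admits three alternative bases, a gene of a given length can in principle carry up to three times as many distinct substitutions as it has bases.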
What is COG-UK doing?
COG-UK (whose partners include the four Public Health Agencies) undertakes unselected (random) sequencing of positive samples across the UK and publishes a sequencing Coverage Report, which is cascaded to the Public Health Agencies each week. A short version is open access and widely available at https://www.cogconsortium.uk/data/. Random sampling is important to capture regional coverage (note that outbreak sequencing is also important, but if associated with a true outbreak, this activity will generate numerous genomes of the same variant, which does not add to the effort to identify variant viruses). Total viral sequence coverage across the UK is around 10% (that is, the number of genomes sequenced as a proportion of the number of positive tests).
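As a rough illustration of how a coverage figure of this kind is calculated, the sketch below divides the number of genomes sequenced by the number of positive tests. The function name and the counts are made up for the example and are not COG-UK figures.

```python
def sequencing_coverage(genomes_sequenced: int, positive_tests: int) -> float:
    """Return the fraction of positive tests that were sequenced."""
    if positive_tests <= 0:
        raise ValueError("positive_tests must be greater than zero")
    return genomes_sequenced / positive_tests


# Illustrative counts only: 10,000 genomes from 100,000 positive tests.
coverage = sequencing_coverage(10_000, 100_000)
print(f"{coverage:.0%}")  # 10%
```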
Automated genome analysis
COG-UK has developed software to automatically analyse genomes for the presence of mutations, for example: http://cov-glue.cvr.gla.ac.uk/#/home.
COG-UK tracks the emergence of virus mutations over time, and the spread of these variants. We use two approaches:
- A data-driven approach uses automated statistical analyses, phylogenetic studies and daily working alignments to generate a report on mutations present and their frequency in the sequenced viral population.
- A targeted approach looks for and tracks mutations that have already been characterised in laboratory experiments as potentially important (for example, because they could affect the interaction with the human immune response).
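The two approaches can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not COG-UK's actual pipeline: it assumes genomes are already aligned to a reference of equal length, handles only single-base substitutions (not insertions or deletions), and labels mutations at the nucleotide level (for example "G3T") rather than by amino acid change. All function names and sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def call_substitutions(reference: str, sample: str) -> List[str]:
    """Return substitutions such as 'G3T' for one aligned sample.

    Positions where either sequence has a gap or ambiguous base are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference")
    calls = []
    for position, (ref_base, base) in enumerate(zip(reference, sample), start=1):
        if ref_base in "ACGT" and base in "ACGT" and base != ref_base:
            calls.append(f"{ref_base}{position}{base}")
    return calls


def mutation_frequencies(reference: str, samples: List[str]) -> Dict[str, float]:
    """Data-driven view: frequency of every observed substitution."""
    counts: Counter = Counter()
    for sample in samples:
        counts.update(call_substitutions(reference, sample))
    return {mutation: n / len(samples) for mutation, n in counts.items()}


def track_watchlist(frequencies: Dict[str, float], watchlist: List[str]) -> Dict[str, float]:
    """Targeted view: frequency of pre-selected mutations (0.0 if unseen)."""
    return {mutation: frequencies.get(mutation, 0.0) for mutation in watchlist}


# Toy aligned sequences, invented for this example.
reference = "ACGTAC"
samples = ["ACGTAC", "ACTTAC", "ACTTAT", "ACGTAT"]

frequencies = mutation_frequencies(reference, samples)
print(frequencies)                                   # {'G3T': 0.5, 'C6T': 0.5}
print(track_watchlist(frequencies, ["G3T", "A1C"]))  # {'G3T': 0.5, 'A1C': 0.0}
```

The data-driven function reports whatever is present in the sequenced population, while the watchlist function reports only mutations chosen in advance, including those not yet observed.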
COG-UK has developed a Summary Mutation Report, which will be released online each week as a companion to the Coverage Report. The first prototype will be released by 18th December, with improvements made over time. This will focus on mutations that are common, and/or are of existing or emerging interest. These reports will be caveated, since they represent a subset of true case numbers.
The role of other agencies
COG-UK is one component of a much larger end-to-end process, the component parts of which have been defined by Public Health England.
The genome data generated by COG-UK is open access and available to Public Health Agencies, who then merge genome data with detailed epidemiological and clinical data. It is this combination of data that allows Public Health Agencies to interpret the significance of mutations for human health. COG-UK does not have access to this detailed patient-level data.
The genome data and Summary Mutation Report will be used by other organisations and groups to assess the possible biological significance of the mutations, and decide which to prioritise for rapid investigation in laboratory studies of virus behaviour and immunology.
There is a possibility that the roll-out of vaccination will lead to selection for mutations that allow the virus to escape from the effect of the vaccine. Public Health Agencies are central to the ongoing evaluation for this event, which will require the effective and rapid detection of people who have been infected more than once, or who have experienced vaccine failure, where this could be explained by changes in the virus. Such cases will need to be prioritised to have their virus sequenced. This will depend on effective and rapid triage by the diagnostic testing system before their test is processed, to allow for efficient and comprehensive capture of cases that require further testing by sequencing. Once a sample arrives at a sequencing site, it is possible to generate genome data within 24 hours. Given the diagnostic testing volume, pace and geographical distribution, it is not feasible to find samples from such cases only after the testing is complete.
COVID-19 Genomics UK (COG-UK)
The current COVID-19 pandemic, caused by SARS-CoV-2, represents a major threat to health. The COVID-19 Genomics UK (COG-UK) consortium has been created to deliver large-scale and rapid whole-genome virus sequencing to local NHS centres and the UK government.
Led by Professor Sharon Peacock of the University of Cambridge, COG-UK is made up of an innovative partnership of NHS organisations, the four Public Health Agencies of the UK, the Wellcome Sanger Institute and twelve academic partners providing sequencing and analysis capacity. A full list of collaborators can be found here: https://www.cogconsortium.uk/about/. Professor Peacock is also on a part-time secondment to PHE as Director of Science, where she focuses on the development of pathogen sequencing through COG-UK.
COG-UK was established in March 2020, supported by £20 million of funding from the UK Department of Health and Social Care (DHSC), UK Research and Innovation (UKRI) and the Wellcome Sanger Institute, administered by UK Research and Innovation.