17 Mar 2023

A look back to the detection of the first variant of concern, the Alpha variant

As we approach the closure of COG-UK, we reflect on some of the key moments in the pandemic and the genomic sequencing of its variants, focusing on the first variant of concern identified (Alpha).

What is the Alpha variant?
The Alpha variant was the first SARS-CoV-2 variant of concern (VOC) to be detected worldwide. At the time of its identification, this term was not yet in use and the system of describing variants of interest and concern had not been developed. Alpha contained fourteen lineage-specific amino acid replacements and three deletions compared to its contemporaneous lineages, which at the time of its emergence was unprecedented in the global SARS-CoV-2 virus genomic data set. Alpha was established to be around 50% more transmissible than other contemporaneous variants. The Alpha variant went on to spread across the world and to replace most other circulating variants, prior to its own replacement by other variants of concern, in particular Delta.

Do we know the origin of the Alpha variant?
No. There are several alternative hypotheses that might explain how Alpha acquired such a large number of mutations: evolution in an under-sampled geographical location, in a non-human animal population, or in a chronically infected individual. The last of these provides the best explanation of the observed behaviour and dynamics of the variant, although the individual need not be immunocompromised as persistently infected immunocompetent hosts also display a higher within-host rate of viral evolution. The UK sequenced a high number of SARS-CoV-2 genomes across the course of the pandemic, meaning that we were likely to detect variants of concern that were circulating in the UK population. When Alpha emerged, a significant proportion of countries worldwide did not have sequencing capabilities. Whilst this variant was initially dubbed the Kent Variant, it is not possible to exclude the possibility that it emerged outside the UK and was imported into the country.

What did we know about mutations and whether a mutation could make the virus more transmissible at the time that Alpha emerged?
Very little. In early December 2020, a mutation termed N501Y in the spike protein was found to be a feature of a new and concerning variant in South Africa. At the same time, N501Y was identified in the ‘Kent cluster’. Whether this mutation led to greater viral transmissibility was speculative at that point. Genome data provides a means of categorising variants and can detect where and when mutations have occurred, but it is only in the context of how that variant behaves in the population that we can establish whether a given variant spreads more quickly. Over time and since Alpha emerged, a comprehensive list of mutations that are linked with changes in biological behaviour (e.g. greater spread, immune evasion) has been developed, which helps to identify variants that require closer observation and investigation (variants of interest). Accruing such information represents meticulous cataloguing as well as the benefit of hindsight, but at the time of the emergence of the Alpha variant none of this was known.

What other variants were circulating around the time that the Alpha variant emerged?
There were many hundreds of variants causing COVID-19 in the UK before Alpha emerged (see graph below), none of which were able to out-compete other circulating variants. Alpha was replaced by Delta in the summer of 2021.

Figure shows the Alpha variant (green), variants that were detected before Alpha, and the replacement of Alpha with Delta. Other variants prior to Alpha (pink) were numerous but are shown as a single colour for clarity. Data are from COG-UK-ME and are based on all UK genomes that were uploaded into the CLIMB database, and so include community and hospital samples from people with COVID-19.


What led the Alpha variant to be detected?
In early December 2020, Public Health England (PHE, now known as the UK Health Security Agency) began tracking and investigating a rapid increase in COVID-19 incidence in South East England, centred on Kent and East London. New case numbers were growing more rapidly than expected over the previous four weeks, despite an elevated level of non-pharmaceutical interventions in the region. A corresponding genomic cluster was detected separately within the COVID-19 Genomics UK (COG-UK) Consortium genomic surveillance data set, and the genome sequences carried a substantially larger than usual number of genetic changes. At a routine PHE meeting on the 8th of December 2020, the link between the genomic cluster and the Kent epidemiological situation was made, and investigations were initiated rapidly to characterise the mutations and estimate the growth rate of the cluster.

Evidence accumulated that this cluster was growing rapidly and had expanded throughout November, during a national lockdown in England. The cluster was designated B.1.1.7 under the Pango lineage naming system and was later labelled as variant of concern Alpha under the World Health Organization (WHO) variant nomenclature. On 17 December, SAGE noted that: “A new variant of SARS-CoV-2 has been identified in the South-East of England, with an N501Y and other mutations. There are indications that this variant may be spreading more quickly than others but the extent of any increase in transmissibility is not yet known.” An extraordinary meeting of NERVTAG on 21 December 2020 considered all of the scientific evidence and concluded that “The committee therefore has high confidence that B.1.1.7 can spread faster than other SARS-CoV-2 virus variants currently circulating in the UK”. The initial estimates of increased transmissibility were confirmed by multiple studies in the UK and elsewhere.

What did we know about the Alpha variant before December 2020?
In a look-back exercise, isolated cases were observed in our genomic dataset in September 2020. During the month of September, a total of 5 genomes were assigned to Alpha out of 16295 genomes generated during that month (a frequency of 0.03%). During September 2020, there were more than 200 different variants detected by genome sequencing in the UK (this count includes lineages and sub-lineages and is an approximation only, as lineage assignment is a dynamic process). During October 2020, 88 Alpha genomes were detected from the 31761 genomes generated during that month (0.28%), and more than 200 variants were in circulation that month. The proportion of samples we sequenced that contained the Alpha variant began to trend upwards in November (see graph below). However, what mattered was not the absolute numbers but the increased disease frequency due to transmission in specific areas (e.g. Kent) that went against the national trend.
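The monthly frequencies quoted above follow directly from the genome counts. As a minimal sketch of that arithmetic (the variable and function names here are illustrative only, and are not part of any COG-UK tooling), the calculation could be written as:

```python
# Monthly counts quoted in the text: (Alpha genomes, all genomes sequenced).
# The dictionary and function names are hypothetical, chosen for illustration.
monthly_counts = {
    "September 2020": (5, 16295),
    "October 2020": (88, 31761),
}


def alpha_frequency_percent(alpha_genomes, total_genomes):
    """Return Alpha genomes as a percentage of all genomes sequenced."""
    return 100 * alpha_genomes / total_genomes


# Round to two decimal places, matching the precision used in the text.
frequencies = {
    month: round(alpha_frequency_percent(alpha, total), 2)
    for month, (alpha, total) in monthly_counts.items()
}

for month, pct in frequencies.items():
    print(f"{month}: {pct:.2f}%")
```

On these figures the proportion of sequenced genomes assigned to Alpha rose roughly ninefold between September and October, while remaining well under 1% of all genomes sequenced.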

Wellcome Sanger Institute COVID-19 Genomics Surveillance showing detection of the Alpha variant following its emergence in England. Data are based on sequencing of samples from the Lighthouse laboratories.


Who had access to the genome data?
The genomes generated by COG-UK were uploaded to the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) as soon as technically possible, at which point the data became available to all 21 consortium partners, including the UK public health agencies, together with many other UK scientists. The genome data was also made publicly available in the international database GISAID and the European Bioinformatics Institute nucleotide archive. This reflected our open access policies and the transparency with which we shared data as soon as possible. In the autumn of 2020, the technical processes required for the end-to-end sequencing and analysis pathway (including sequencing and sequence quality checks, combining the genome data with the anonymised patient identifier, and uploading to CLIMB for data storage and analysis) took around two weeks, and could take longer. The Consortium focused on reducing this turnaround time to less than a week through a process of continuous improvement.

COVID-19 Genomics UK (COG-UK)

The COVID-19 Genomics UK (COG-UK) consortium works in partnership to harness the power of SARS-CoV-2 genomics in the fight against COVID-19.

Led by Professor Sharon Peacock of the University of Cambridge, COG-UK is made up of an innovative collaboration of NHS organisations, the four public health agencies of the UK, the Wellcome Sanger Institute and sixteen academic partners. A full list of collaborators can be found here.

The COVID-19 pandemic, caused by SARS-CoV-2, represents a major threat to health. The COG-UK consortium was formed in March 2020 to deliver SARS-CoV-2 genome sequencing and analysis to inform public health policy and to support the establishment of a national pathogen sequencing service, with sequence data now predominantly generated by the Wellcome Sanger Institute and the Public Health Agencies.

SARS-CoV-2 genome sequencing and analysis plays a key role in the COVID-19 public health response by enabling the identification, tracking and analysis of variants of concern, and by informing the design of vaccines and therapeutics. COG-UK works collaboratively to deliver world-class research on pathogen sequencing and analysis, maximise the value of genomic data by ensuring fair access and data linkage, and provide a training programme to enable equity in global sequencing.