How the novel coronavirus has evolved
The world is now dealing with a different type of SARS-CoV-2 than the one that emerged in China almost a year ago, with mutations creating at least seven strains of the virus so far.
As the coronavirus SARS-CoV-2 swept across the world and killed more than 1.5 million in the past year, it has mutated into several major groups, or strains, as it adapted to its human hosts. Mapping and understanding those changes to the virus is crucial to developing strategies to combat the COVID-19 disease it causes.
Reuters analysed more than 185,000 genome samples from the Global Initiative on Sharing All Influenza Data (GISAID), the largest database of novel coronavirus genome sequences in the world, to show how the global dominance of major strains has shifted over time.
The analysis shows there are currently seven main strains of the virus. The original strain, detected in the Chinese city of Wuhan in December 2019, is the L strain. The virus then mutated into the S strain at the beginning of 2020. That was followed by V and G strains. Strain G mutated yet further into strains GR, GH and GV. Several other infrequent mutations were collectively grouped together as strain O.
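The lineage described above can be written down as a small family tree. The following is a minimal sketch only: it assumes that S, V and G each descend directly from L, which is one reading of the paragraph, and it leaves out the catch-all O group. The names `STRAIN_PARENT`, `ancestry` and `is_g_type` are illustrative, not part of any GISAID tool.

```python
# Parent of each strain as described in the text; None marks the original strain.
# Assumption: S, V and G are each treated as direct descendants of L.
STRAIN_PARENT = {
    "L": None,
    "S": "L",
    "V": "L",
    "G": "L",
    "GR": "G",
    "GH": "G",
    "GV": "G",
}


def ancestry(strain):
    """Return the chain of strains from the given one back to the original."""
    if strain not in STRAIN_PARENT:
        raise ValueError(f"unknown or ungrouped strain: {strain!r}")
    chain = []
    while strain is not None:
        chain.append(strain)
        strain = STRAIN_PARENT[strain]
    return chain


def is_g_type(strain):
    """True for G and the strains that descend from it; False for anything else."""
    return strain in STRAIN_PARENT and "G" in ancestry(strain)


print(ancestry("GV"))   # ['GV', 'G', 'L']
print(is_g_type("GH"))  # True
print(is_g_type("S"))   # False
```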
[Chart: Weekly breakdown of over 185,000 virus samples from around the world. Annotations: original Wuhan strain “L”; latest samples are all G-type strains.]
“The reason for looking at the genomics is to try and find out where it came from … in terms of trying to map out what we would expect for the pandemic, that information is critical,” South Australia’s chief health officer, Nicola Spurrier, said following an outbreak in the state in early November. Health officials initially locked down the state because they thought the outbreak was caused by a much more contagious strain of the virus. They lifted the lockdown a day later when it turned out that a pizza restaurant worker had lied about how he caught the disease.
The graphic above shows how the original L strain is almost gone, leaving G strains dominant in the current stage of the pandemic. That’s important because the G strains include one mutation that makes it easier for the spike proteins on SARS-CoV-2 to bind to receptors on human cells, potentially increasing the chances of infection and transmissibility of the virus.
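A weekly breakdown of this kind comes down to grouping samples by collection week and working out each strain's share. The sketch below illustrates the idea with a handful of made-up records. It is not the Reuters method, and the `samples` list and the `weekly_shares` function are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical records: (collection date, strain label). Not real GISAID data.
samples = [
    (date(2020, 1, 6), "L"),
    (date(2020, 1, 8), "L"),
    (date(2020, 1, 9), "S"),
    (date(2020, 3, 16), "G"),
    (date(2020, 3, 17), "GR"),
    (date(2020, 3, 18), "L"),
    (date(2020, 3, 20), "GH"),
]


def weekly_shares(records):
    """Map each ISO (year, week) to the share of that week's samples per strain."""
    counts = defaultdict(Counter)
    for collected, strain in records:
        iso = collected.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])][strain] += 1
    shares = {}
    for week, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[week] = {strain: n / total for strain, n in counter.items()}
    return shares


for week, share in weekly_shares(samples).items():
    print(week, share)
```

In this toy data the L strain makes up two thirds of the first week and a quarter of the second, the same kind of shift the graphic shows at full scale.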
Tracking mutations
A mutation is a change in an organism’s genetic material. When a virus makes millions of copies of itself and moves from host to host, not every copy is identical. These small mutations accumulate as the virus is passed on – and copied again and again.
Databases like GISAID can track these changes in individual samples, allowing scientists to connect the dots with other samples and determine when major new strains form.
The GISAID database mapped out about 3,500 of these samples from all over the world, constructing a family tree that shows how they are related. A visualisation of the data illustrates the relationships among the samples and where new strains emerged.
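The idea of connecting samples through their accumulated mutations can be shown with a toy example. The sketch below compares short, already-aligned sequences letter by letter and links each sample to the earlier sample it differs from least. This is a deliberate simplification: real family trees are built with dedicated phylogenetic software and statistical models, and the sequences, sample names and the `mutations` and `nearest_earlier` functions here are invented for illustration.

```python
def mutations(reference, sample):
    """List (position, ref_letter, sample_letter) where two aligned sequences differ.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


def nearest_earlier(samples):
    """Link each sample to the earlier sample with the fewest differences.

    `samples` is a list of (name, sequence) in collection order; the first
    entry is treated as the root and has no parent.
    """
    parents = {samples[0][0]: None}
    for idx in range(1, len(samples)):
        name, seq = samples[idx]
        best = min(samples[:idx], key=lambda prev: len(mutations(prev[1], seq)))
        parents[name] = best[0]
    return parents


# Hypothetical toy sequences, far shorter than a real genome.
toy_samples = [
    ("case_one", "ACGTACGTAC"),
    ("sample_a", "ACGTACGTAT"),  # one change from case_one
    ("sample_b", "ACGAACGTAT"),  # carries that change and adds another
    ("sample_c", "ACGTACGGAC"),  # a separate change from case_one
]

print(mutations(toy_samples[0][1], toy_samples[2][1]))
print(nearest_earlier(toy_samples))
```

Here `sample_b` is linked to `sample_a` rather than to `case_one` because it shares the earlier change, which is the sense in which shared mutations let a lineage be retraced.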
[Chart: Major groupings. Colours represent strain. Data as of Nov. 3. Dots represent actual virus samples and their relationships. Annotations: “Case one”, the common ancestor based on backtracking of mutations; original Wuhan strain in red; GV, the latest strain, dominating in Europe. By deconstructing the sequences, scientists can fill in the gaps to retrace the lineage of the virus samples.]
Shifting strains
Earlier in the pandemic, the virus made its way relatively quickly around the world, being repeatedly introduced to different locations and sparking fresh outbreaks regularly. During that time, there was a more diverse mixture of strains among the samples reported to GISAID. As countries began to close their borders, there were fewer new strains introduced. In countries where the more resilient G-type strains were present, they began to dominate.
However, the timing and rate of evolution into new strains occurred at different stages for different countries and regions. Those differing patterns largely reflected how quickly the virus was able to spread in any given region and whether an outbreak was sparked by an “imported” case of the virus.
[Charts by region. Annotations: highest proportion of Wuhan “L” strain in early weeks; rapid shift to the three G-type strains; region with highest portion of “O”, or “other”, strains circulating recently; all recent samples show “GH” strain; G-type strains dominate in Africa.]
In Asia, the original L strain persisted for longer as several countries, including China, were quick to shut borders and curtail movement. In contrast, North America and Europe did not restrict movement as much, at least initially, which allowed the G strains to spread – and mutate – at a faster pace.
“A lot of it comes down to place and getting a foothold in a new population,” said Catherine Bennett, epidemiology chair in the Faculty of Health at Melbourne’s Deakin University. “This virus moves in superspreader events, which means the virus doesn’t have to be particularly contagious. We will see different patterns because of cluster transmission.”
[Charts. Annotations: “O”, or “other”, strains; no samples available as infections also drop; samples from recent sporadic outbreaks were “G” strain.]
G strains take over
G strains are now dominant around the world. One specific mutation, D614G, has become the most common variant. It is so named because one amino acid is changed from a D (aspartate) to a G (glycine) at the 614th position on the viral spike proteins, the structure that gives the virus its crown-like appearance.
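The naming convention is regular enough to be read mechanically: original amino acid, position, replacement amino acid. The sketch below splits such a label into its parts. The function names are illustrative and the amino acid table is deliberately partial, covering only the letters used in the example.

```python
import re

# One-letter amino acid codes used in this example (a partial table).
AMINO_ACIDS = {
    "A": "alanine",
    "D": "aspartate",
    "G": "glycine",
    "V": "valine",
}


def parse_substitution(label):
    """Split a label such as 'D614G' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def describe(label):
    """Spell out a substitution label in words, falling back to the raw letter."""
    original, position, replacement = parse_substitution(label)
    return (
        f"{AMINO_ACIDS.get(original, original)} -> "
        f"{AMINO_ACIDS.get(replacement, replacement)} at position {position}"
    )


print(describe("D614G"))  # aspartate -> glycine at position 614
```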
The rise of the G strains coincided with spikes in outbreaks of the virus around the world, with a clutch of new cases allowing the strains to invade new areas. The dominance of the G strains is illustrated by the data for Australia, Japan and Thailand. During Australia’s second wave of infections, G strains were present in almost all samples, indicating the country had effectively eliminated transmission of the earlier L and S strains through a series of social distancing measures. All of Australia’s second wave clusters were sparked by people who had returned from overseas and breaches in quarantine.
[Charts. Annotation: no samples available as infections also drop.]
Major epicentres
The dominance of the G strains becomes even more evident when looking at some of the countries with the most infections.
The United States has by far the highest number of infections and deaths. The majority of its infections, across the first, second and third waves, coincide with the increase in samples showing the three G strains.
A similar pattern can be seen in India, where the steady increase in infections from June to September appeared to follow the curve of the G strain samples.
The new strain
The most recent strain to emerge is GV, which has so far been confined to Europe, where it has become increasingly common in recent weeks. GISAID scientists said the variant has a mutation in the spike protein, but in this case it may have little effect on the virus’s ability to bind to human cells. Experts say it is currently unclear whether the GV strain is spreading because of any transmission advantage or because it affected socially active young adults and tourists over the summer.
Outliers
Some countries bucked the general trend for a progression – albeit at varying rates – from the L to the G strains. In some cases, insufficient sample data was submitted to GISAID to detect a pattern. However, some other countries simply failed to follow the overarching shift to G-type strains.
Singapore, for example, recorded a significant number of O strains – virus variants that did not develop into sustained lineages – for several weeks. Deakin University’s Bennett said that likely reflected the fact that most of Singapore’s outbreaks were in separate foreign worker dormitories and quickly contained to those facilities.
In South Korea, the V strain became dominant for a period linked to a huge cluster of cases at a religious sect in the city of Daegu. South Korea is also at the centre of global efforts to research the potential of reinfection with a different strain of the virus after reports in April that scores of people who had recovered from COVID-19 later tested positive again. Health officials at the time said they suspected it was due to tests picking up remnants of the dead virus. Since then, there have been documented reports of individuals being reinfected with different versions of the virus. In a recently published paper in the journal Clinical Infectious Diseases, researchers from Seoul National University Hospital used computerised analysis to show that one woman was first infected with the V strain and later reinfected with a G strain.
[Charts. Annotation: “O”, or “other”, strains.]
Why mutations matter
The mutations that give rise to new strains occur when the SARS-CoV-2 virus makes copies of itself inside a new host. The virus’s genome is a complete set of genetic instructions written in 30,000 “letters” of code. Different sections of the genome guide how different parts of the virus, such as the structural proteins of the shell or the non-structural proteins that affect replication, are constructed when the virus replicates in host cells.
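To make the link between genome sections and virus parts concrete, the sketch below looks up which section a given letter position falls in, and converts a spike protein position such as 614 into the three genome letters that encode it. The coordinates are an assumption brought in from the public Wuhan-Hu-1 reference annotation (GenBank NC_045512.2), not from the analysis described here, and only some genes are listed.

```python
# Approximate gene coordinates (1-based, inclusive) on the Wuhan-Hu-1
# reference genome (GenBank NC_045512.2). These are taken from the public
# reference annotation, not from the article, and the list is partial.
REGIONS = [
    ("ORF1ab (non-structural proteins)", 266, 21555),
    ("S (spike protein)", 21563, 25384),
    ("E (envelope protein)", 26245, 26472),
    ("M (membrane protein)", 26523, 27191),
    ("N (nucleocapsid protein)", 28274, 29533),
]

SPIKE_START = 21563  # first letter of the spike gene in the reference


def region_at(position):
    """Name of the listed region containing a genome position, if any."""
    for name, start, end in REGIONS:
        if start <= position <= end:
            return name
    return "other or non-coding"


def spike_codon(aa_position):
    """Genome coordinates of the three letters encoding a spike amino acid."""
    first = SPIKE_START + 3 * (aa_position - 1)
    return first, first + 2


start, end = spike_codon(614)
print(start, end, region_at(start))
```

Under these assumed coordinates, position 614 of the spike protein maps to genome letters 23402 to 23404, all inside the spike gene.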
Small mutations in the virus’s genome are normal as it is copied over and over. The GISAID database identified thousands of changes along the genome. Many are harmless but it’s virtually impossible for scientists to predict when and how a mutation can result in a strain of a virus that is more transmissible or impervious to proposed vaccines.
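One simple way to see where changes cluster along a genome is to line the samples up and, at each position, measure how many differ from the most common letter. The sketch below does this for a few invented six-letter sequences. It illustrates the general idea only, not how GISAID measures diversity, and `site_diversity` is a hypothetical helper.

```python
from collections import Counter


def site_diversity(sequences):
    """For each position, the share of samples that differ from the most common letter."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    diversity = []
    for column in zip(*sequences):
        most_common_count = Counter(column).most_common(1)[0][1]
        diversity.append(1 - most_common_count / len(sequences))
    return diversity


# Hypothetical aligned toy sequences, not real virus data.
toy_sequences = ["ACGTAC", "ACGTAT", "ACGAAT", "ACGTAT"]

scores = site_diversity(toy_sequences)
varied_positions = [i + 1 for i, score in enumerate(scores) if score > 0]
print(scores)
print(varied_positions)
```

Positions where every sample agrees score zero; the higher the score, the more the samples disagree at that point in the genome.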
The diagram below shows the various regions of the viral genome and the corresponding parts of the virus they encode, as well as the many mutations recorded in each genome region.
[Diagram: The genome, 30,000 nucleotides long, and where it mutates. Shows locations along the genome and the amount of mutation among the samples in the database, marking places with high amounts of mutation or diversity.
Non-structural proteins: do not form the physical structure of the virus but regulate other aspects of the virus.
Spike proteins: protrude from the viral envelope and allow it to attach to healthy cells.
Other structural proteins: the envelope, membrane, and nucleocapsid.
D614G: this now widespread mutation affects the virus’s spike protein and is believed to increase infectiousness.
A222V: new mutation in the “GV” strain currently circulating in Europe.]
Cautious optimism
The SARS-CoV-2 virus has so far mutated slowly, allowing scientists and policymakers to keep on top of its progress. Still, scientists have been divided on the implications of some of the mutations. Some experts have reported that the D614G variation has made the virus more transmissible, but other studies have contradicted that.
Either way, the changes so far have not resulted in strains that would likely be resistant to vaccines in development. In fact, one study by a group of scientists from several institutions including the University of Sheffield and Harvard University found that G strains might present an easier target for a vaccine because these strains have more spike proteins on their surface, which are the target of vaccine-induced antibodies.
“Fortunately, we found that none of these mutations are making COVID-19 spread more rapidly, but we need to remain vigilant and continue monitoring new mutations, particularly as vaccines get rolled out,” said University College London Genetics Institute researcher Lucy van Dorp, co-author of a study that identified more than 12,700 mutations in the SARS-CoV-2 virus.
Still, experts who have watched influenza and HIV mutate over years, evading vaccines, warn that future mutations of SARS-CoV-2 remain unknown. And the best shot at avoiding changes that make the virus impervious to a vaccine remains curtailing its spread and reducing the opportunities it has to mutate.
“If the virus changes substantially, particularly the spike proteins, then it might escape a vaccine. We want to slow transmission globally to slow the clock,” said Deakin’s Bennett. “That reduces the chances of a one in a squillion change that’s awful news for us.”
Sources:
GISAID
By Jitesh Chowdhury, Simon Scarr and Jane Wardell
Editing by Christine Soares and Tiffany Wu