The genomic stability that scientists initially expected of SARS-CoV-2 has been upended by the emergence of different variants over the course of the COVID-19 pandemic. The N-terminal domain (NTD) of the virus’s spike protein has emerged as a potentially mutable structure: scientists have reported that it has deletion-prone regions that may allow the virus to escape antibody neutralization. According to a preprint posted to medRxiv June 12, the prevalence of these deletions increased during surges of COVID-19 cases worldwide. The study’s authors also report the presence of NTD deletions in SARS-CoV-2 samples from COVID-19 patients who either had been infected before or were already fully vaccinated.
The team hypothesizes that these deletions could help the virus evade immunity, potentially playing a role in surges and vaccine breakthrough infections. The ideas in this manuscript are thought-provoking, says virologist Kevin McCarthy of the University of Pittsburgh, who was not involved in this work.
“The epidemiology community has almost exclusively focused its efforts on the PCR positivity,” mainly tracking COVID-19 transmission in real time, but analyses of these numbers have rarely been linked to genomic data on emerging variants, says Venky Soundararajan, the chief scientific officer of the artificial intelligence company nference and a coauthor of the paper. He and his colleagues saw a need to integrate the epidemiological data with genomic analysis in order to determine whether certain mutations could be associated with surges in cases.
They first analyzed 1.57 million SARS-CoV-2 genome sequences from 187 countries or territories for the period December 2019 to April 2021, sourced from the Global Initiative on Sharing All Influenza Data (GISAID) database. The analysis revealed that 857 amino acid mutations in the spike protein have emerged over the course of the pandemic—each of which was present in at least 100 sequences. Most of them (816) were substitutions, and 37 were deletions.
The authors asked which of those mutations had increased in prevalence during three-month intervals of increasing PCR test positivity in any given country. The team found that 48.6 percent of the deletions (18 of 37) coincided with these surges in COVID-19 cases, while only 8.6 percent of the substitutions fell in this category. All 18 surge-associated deletions occurred in the NTD of the spike protein, clustering in sites targeted by neutralizing antibodies. According to the authors, these deletions in SARS-CoV-2 could be fueling worldwide surges.
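The surge-association test described above can be illustrated with a minimal sketch. It assumes monthly prevalence and test-positivity series per country and treats "increasing" as a strict month-over-month rise across a three-month window; the preprint's exact criteria may differ. The function names, mutation labels, and numbers below are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def rises_throughout(values: List[float]) -> bool:
    """Return True if every value is strictly greater than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


def surge_associated(
    prevalence: Dict[str, List[float]],
    positivity: List[float],
    window: int = 3,
) -> List[str]:
    """List mutations whose prevalence rises across at least one
    `window`-month span in which test positivity also rises."""
    # Start indices of every window where positivity climbs month over month.
    surge_starts = [
        start
        for start in range(len(positivity) - window + 1)
        if rises_throughout(positivity[start:start + window])
    ]
    # Keep mutations whose prevalence also climbs in one of those windows.
    return [
        mutation
        for mutation, series in prevalence.items()
        if any(rises_throughout(series[s:s + window]) for s in surge_starts)
    ]


# Toy monthly data for one hypothetical country (not from the preprint).
positivity = [0.05, 0.08, 0.12, 0.10]
prevalence = {
    "deletion_X": [0.01, 0.04, 0.13, 0.13],
    "substitution_Y": [0.20, 0.18, 0.19, 0.21],
}

print(surge_associated(prevalence, positivity))

# Share of deletions reported as surge-associated: 18 of 37.
print(round(100 * 18 / 37, 1))
```

In this toy data only the first three months form a surge window, and only the hypothetical deletion rises across it. The last line reproduces the reported 48.6 percent from the counts given in the article.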
The work lines up with an analysis from earlier this year of nearly 150,000 SARS-CoV-2 sequences obtained from the GISAID database, which revealed four discrete sites within the NTD that have recurrent deletions in antigenic sites. That study, led by McCarthy, has been followed by structural and biochemical work showing that the NTD is recognized by human neutralizing antibodies. The current investigation found that NTD deletions occurred mostly within six regions—the four previously described and two newly reported by the nference team.
According to Soundararajan, the team found surge-associated NTD deletions in at least 12 countries in Asia, South America, Europe, and Africa. In the preprint, the authors mainly focus on the recent surges in India and Chile. They observed an NTD deletion that increased more than 13-fold from February to April 2021 in a surging SARS-CoV-2 variant in India. In Chile, an NTD deletion rose 38-fold in prevalence from January to April 2021. This deletion is harbored by one of the two main variants circulating in the country, a variant that belongs to the C.37 lineage.

Regarding the correlation between deletions and test positivity, McCarthy says it’s hard to tell whether the mutations or the surges came first. “When you have a lot of cases spreading, you also have more viral replication, so you can get more mutations,” he explains. He also warns that some countries’ emphasis on searching for specific variants can skew the genomic sampling in the database. For instance, if a COVID-19 case had contacts associated with a variant of interest, that viral genome could be more likely to get sequenced, he explains. Mutations in such variants could then be overrepresented in the database.
Theodora Hatziioannou, a virologist at the Rockefeller University in New York who was not involved in this work, says it is an interesting study, but advises caution in interpreting the results. “The surge is not dependent just on the virus,” she says, but also on many other factors—for instance, human behavior.
Clinical consequences of NTD deletions
Virologist Ricardo Soto Rifo of the University of Chile, who did not participate in the study, says it “highlights the relevance of the NTD” as an important antigenic site for neutralizing antibodies. He says that the association of the NTD deletions with surges in India and Chile requires further investigation, especially to understand the effects of these mutations on the clinical manifestations of the disease.
The nference team joined forces with the Mayo Clinic in Minnesota to explore the potential role of NTD deletions in COVID-19 patients with natural or vaccine-induced immunity. They analyzed the viral genomes from 53 patients, 20 with vaccine breakthrough infections and 33 with suspected reinfections. They observed that 42 of these patients had at least one deletion in the NTD compared with the wildtype reference sequence of SARS-CoV-2, Wuhan-Hu-1, and four of them showed a contiguous stretch of 3–9 amino acids deleted. The authors speculate that these deletion patterns could allow the virus to evade acquired immunity.
Recent investigations have shown that the NTD induces an immune response, but “we don’t know yet what that means for vaccine breakthrough,” says McCarthy. “I think studies like this are important to begin putting those pieces together to understand this.” He says that it’s probably going to take thousands of breakthrough infections and a lot more sequencing to determine the full significance of NTD mutations.
Soundararajan says that more than 60 percent of the total sequenced genomes of SARS-CoV-2 are from the US or the UK—a major limitation for understanding the evolution of the virus worldwide. “This is a global outbreak,” he says. “What is happening in Chile [and] India matters to the US, matters to the UK. . . . We need to invest in genome sequencing in different parts of the world, beyond our boundaries, because the whole world is one community.”