Published: 11th November 2020
New 'hidden' gene in COVID-19 virus identified, says new study
The research team identified ORF3d, a new overlapping gene in SARS-CoV-2 that has the potential to encode a protein that is longer than expected by chance alone
Researchers have identified a new "hidden" gene in SARS-CoV-2, the virus responsible for COVID-19, that may have contributed to its unique biology and pandemic potential.
In a virus that only has about 15 genes in total, knowing more about this and other overlapping genes — or "genes within genes" — could have a significant impact on how we combat the virus.
"Overlapping genes may be one of an arsenal of ways in which coronaviruses have evolved to replicate efficiently, thwart host immunity, or get themselves transmitted," said lead author Chase Nelson, a postdoctoral researcher at Academia Sinica in Taiwan and a visiting scientist at the American Museum of Natural History.
"Knowing that overlapping genes exist and how they function may reveal new avenues for coronavirus control, for example through antiviral drugs."
The research team identified ORF3d, a new overlapping gene in SARS-CoV-2 that has the potential to encode a protein that is longer than expected by chance alone, according to the study published in the journal eLife.
They found that this gene is also present in a previously discovered pangolin coronavirus, perhaps reflecting repeated loss or gain of this gene during the evolution of SARS-CoV-2 and related viruses.
In addition, ORF3d has been independently identified and shown to elicit a strong antibody response in COVID-19 patients, demonstrating that the new gene's protein is manufactured during human infection.
"We don't yet know its function or if there's clinical significance," Nelson said. "But we predict this gene is relatively unlikely to be detected by a T-cell response, in contrast to the antibody response. And maybe that has something to do with how the gene was able to arise."
At first glance, genes can seem like written language in that they are made of strings of letters (in RNA viruses, the nucleotides A, U, G, and C) that convey information.
But while the units of language (words) are discrete and non-overlapping, genes can be overlapping and multifunctional, with information cryptically encoded depending on where you start "reading."
Overlapping genes are hard to spot, and most scientific computer programmes are not designed to find them. However, they are common in viruses.
This is partly because RNA viruses have a high mutation rate, so they tend to keep their gene count low to limit the number of mutations they accumulate. As a result, viruses have evolved a sort of data compression system in which one letter in the genome can contribute to two or even three different genes.
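The idea that the same letters can be read in more than one way can be sketched in a few lines of Python. This is only a toy illustration, not the method used in the study: the 16-letter sequence is invented rather than taken from SARS-CoV-2, the function names are hypothetical, and only the three forward reading frames are considered. Shifting the starting point by one letter changes every three-letter "word" (codon), so a second start-to-stop stretch can sit inside the first.

```python
# Toy illustration of overlapping reading frames (assumed example, not the study's method).
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard RNA stop codons
START_CODON = "AUG"                  # standard RNA start codon


def codons(rna, frame):
    """Split rna into complete three-letter codons, starting at offset frame (0, 1 or 2)."""
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


def open_reading_frames(rna):
    """Return (frame, start, end) for each start-to-stop stretch in the three forward frames.

    start is the index of the first letter of AUG; end is one past the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if codon == START_CODON and start is None:
                start = i                      # open a candidate gene
            elif codon in STOP_CODONS and start is not None:
                found.append((frame, start, i + 3))
                start = None                   # close it at the stop codon
    return found


# Invented sequence, chosen so that two frames each contain a start and a stop.
toy_rna = "AUGGAUGCCUGAAUAA"

for frame in range(3):
    print(frame, codons(toy_rna, frame))

print(open_reading_frames(toy_rna))
```

In this toy sequence, the stretch found in frame 0 covers letters 0 to 11 and the one in frame 1 covers letters 4 to 15, so letters 4 to 11 contribute to both. Tools that report only the longest stretch per region would miss the second one, which is one way an overlapping gene can stay "hidden."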
"Missing overlapping genes puts us in peril of overlooking important aspects of viral biology," said Nelson. "In terms of genome size, SARS-CoV-2 and its relatives are among the longest RNA viruses that exist. They are thus perhaps more prone to 'genomic trickery' than other RNA viruses."