Preprint
Article

Witnessing Evolution of SARS-CoV-2 through Comparative Phylogenomics: The Proximate Origin is Guangdong, not Wuhan

Submitted:

17 June 2020

Posted:

21 June 2020

Abstract
A new coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is currently causing a pandemic. We trace six months of SARS-CoV-2 evolution by characterising the complete genomes of 821 samples with comparative phylogenomic approaches. Our analyses produced striking, comprehensive results that may guide scientists and professionals in understanding the past and future of the pandemic. Phylogenetic and time-estimation analyses suggest that the proximate origin of the pandemic strain is Guangdong and that it arose in the first half of September 2019, rather than in Wuhan in December 2019. The viral genome experienced a substitution rate similar to that of other RNA viruses, but the rate is particularly high in some peptide-encoding sequences (leader protein, E gene, orf8, orf10, nsp10, N gene, S gene, M gene and nsp4) and low in others (nsp11, orf7a, 3C-like proteinase, nsp9, nsp8 and endoRNase). Most strikingly, the divergence rate of amino acid sequences is high relative to nucleotide divergence. Additionally, specific non-synonymous mutations in nsp3 and nsp6 evolved under positive selection. The exponential growth rate (r), doubling time (Td) and R0 were estimated to be 47.43 per year, 5.39 days and 2.72, respectively. Comparison of the synapomorphies distinguishing SARS-CoV-2 from the candidate ancestral bat coronavirus indicates that the mutation pattern in nsp3 and the S gene enabled the new strain to infect humans and become a pandemic strain.
We arrive at the following main conclusions: (i) six months of viral genome evolution has been nearly neutral; (ii) the origin of the pandemic is not Wuhan and predates the formal reports; (iii) although the viral population is undergoing exponential growth, the doubling time is becoming shorter; and (iv) the divergence rate of the whole genome is similar to that of other RNA viruses, but it is markedly high in some genes and low in others, and evolution in these genes should be closely monitored because their protein products are involved in pathogenicity, virulence and immune response.
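The reported growth rate and doubling time can be cross-checked against the standard relation for exponential growth, Td = ln 2 / r. The following is a minimal Python sketch of that check, not the authors' estimation procedure: it assumes a 365-day year and that this simple relation applies, neither of which the abstract states, and the function names are illustrative. For r = 47.43 per year it gives roughly 5.3 days, close to but not identical to the reported 5.39 days, which presumably reflects how the authors estimated Td.

```python
import math

# Assumption: the abstract does not say which year length was used.
DAYS_PER_YEAR = 365.0


def doubling_time_days(r_per_year: float) -> float:
    """Doubling time in days for exponential growth N(t) = N0 * exp(r * t)."""
    if r_per_year <= 0:
        raise ValueError("growth rate must be positive for a doubling time to exist")
    # Td (years) = ln(2) / r, then convert years to days.
    return math.log(2) / r_per_year * DAYS_PER_YEAR


def growth_rate_from_doubling(td_days: float) -> float:
    """Inverse relation: growth rate per year implied by a doubling time in days."""
    if td_days <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / td_days * DAYS_PER_YEAR


if __name__ == "__main__":
    r_reported = 47.43   # per year, from the abstract
    td_reported = 5.39   # days, from the abstract
    print(f"Td implied by r = {r_reported}/year: {doubling_time_days(r_reported):.2f} days")
    print(f"r implied by Td = {td_reported} days: {growth_rate_from_doubling(td_reported):.2f} per year")
```

The two functions are inverses of each other, so either reported value can be used as the starting point of the comparison.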
Subject: Biology and Life Sciences  -   Biochemistry and Molecular Biology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated