Abstract
Today, we are all threatened by an unprecedented pandemic: COVID-19. How different is it from other coronaviruses? Will it be attenuated or become more virulent? Which animals may be its original host? In this study, we analyzed 377 publicly available complete genome sequences of the COVID-19 virus, the previously known flu-causing coronaviruses (HCoV-229E, HCoV-OC43, HCoV-NL63 and HCoV-HKU1), and the lethal, pathogenic P3/P4 viruses SARS, MERS, Victoria, Lassa, Yamagata, Ebola, and Dengue. We found strong similarities between the currently circulating COVID-19 and SARS and MERS, as well as COVID-19 in rhinolophines and pangolins. In contrast, COVID-19 shares little similarity with the flu-causing coronaviruses and the other P3/P4 viruses. Strikingly, we observed that the divergence of COVID-19 strains isolated from human hosts steadily increased from December 2019 to March 2020, suggesting that COVID-19 is actively evolving in human hosts. From all existing human COVID-19 genome sequences, we calculated the first common model that represents the shared sequences of the human COVID-19 strains, which provides important information for vaccine and antibody development. Geographic and time-course analysis of the evolutionary trees of human COVID-19 reveals possibly heterogeneous evolutionary paths among strains from 21 countries. This finding has important implications for the management of COVID-19 and the development of vaccines.