Preprint
Article

Creating Variant Features to Enhance Covid-19 Predictions with Machine Learning Ensemble

This version is not peer-reviewed

Submitted:

20 January 2022

Posted:

21 January 2022

Abstract
Covid-19 has caused infections and deaths worldwide. Although Data Science research has produced good predictions of positive Covid-19 case numbers, our literature review shows that little of this work uses variants of the virus as predictors. We set out to define and evaluate novel variant features. We find that features relating to variant trends, thresholds, and amino acid substitutions are especially powerful in two tasks. In the first task, predicting Covid-19 case numbers, accuracy improved from 71.53% without variant features to 82.12% with them. In the second task, classifying the transmission severity of variants into one of two classes, we developed a method that builds ensembles of varying size by selecting appropriate models generated with variant features. In testing, these ensembles proved more accurate and reliable. One ensemble of 14 models correctly classified 90.91% of variants, outperforming other models, including the popular Random Forest ensemble. In addition, because the variant features capture more of the underlying information about Covid-19 pathophysiology, our ensemble methods need only a few data samples to make an accurate prediction. The ensemble of 14 models uses only 50 cases of each variant, an ability that could be exploited for early detection of highly infectious variants. These findings may benefit public health professionals, policy makers, and the research community in the collective effort to overcome this disease.
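The abstract does not spell out how models are selected or how their outputs are combined, so the following is only a minimal sketch of one common way to build a selected ensemble for a two-class task. It assumes selection by a validation-accuracy cutoff and combination by majority vote. The feature layout (`[trend, threshold_flag]`), the toy rule-based models, the cutoff value, and all function names are hypothetical illustrations, not the authors' method or data.

```python
from typing import Callable, List, Sequence

# A "model" here is any function mapping a feature vector to class 0 or 1.
Model = Callable[[Sequence[float]], int]


def accuracy(model: Model, X: List[Sequence[float]], y: List[int]) -> float:
    """Fraction of samples the model classifies correctly."""
    correct = sum(1 for xi, yi in zip(X, y) if model(xi) == yi)
    return correct / len(y)


def select_models(candidates: List[Model], X_val: List[Sequence[float]],
                  y_val: List[int], min_accuracy: float) -> List[Model]:
    """Keep only candidates meeting the (assumed) validation-accuracy cutoff."""
    return [m for m in candidates if accuracy(m, X_val, y_val) >= min_accuracy]


def ensemble_predict(models: List[Model], x: Sequence[float]) -> int:
    """Majority vote over the selected models; ties resolve to class 0."""
    votes_for_one = sum(m(x) for m in models)
    return 1 if 2 * votes_for_one > len(models) else 0


# Hypothetical validation data: [trend, threshold_flag] -> severity class.
X_val = [[0.1, 0], [0.4, 0], [0.6, 1], [0.9, 1]]
y_val = [0, 0, 1, 1]

# Hypothetical candidate models (simple rules standing in for trained models).
candidates: List[Model] = [
    lambda x: int(x[0] > 0.5),   # rule on the trend feature
    lambda x: int(x[1] == 1),    # rule on the threshold flag
    lambda x: int(x[0] > 0.3),   # looser trend rule
    lambda x: 1,                 # always predicts the high-severity class
]

selected = select_models(candidates, X_val, y_val, min_accuracy=0.7)
prediction = ensemble_predict(selected, [0.4, 0])
```

In this toy setup the always-positive rule falls below the cutoff and is dropped, and the remaining three rules vote on each new sample. An odd number of selected models avoids ties; with an even number, the sketch defaults to class 0.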
Subject: Computer Science and Mathematics  -   Artificial Intelligence and Machine Learning
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated