Preprint
Article

DVGfinder: A Metasearch Engine for Identifying Defective Viral Genomes in RNA-Seq Data

Submitted: 03 March 2022

Posted: 07 March 2022

Abstract
The generation of different types of defective viral genomes (DVGs) is an unavoidable consequence of the error-prone replication of RNA viruses. In recent years, a particular class of DVGs, those containing long deletions or genome rearrangements, has gained interest due to its potential therapeutic and biotechnological applications. Identifying such DVGs in high-throughput sequencing data has become an interesting computational problem. To date, several algorithms have been proposed, though all produce false positives, a problem of practical concern if such DVGs have to be synthesized and tested in the laboratory. Here we present DVGfinder, a novel software tool that wraps the two most commonly used algorithms into a pipeline that predicts DVGs. Using a gradient boosting classifier, we evaluate the performance of DVGfinder against previous algorithms and find that it exceeds their precision and sensitivity on simulated datasets. DVGfinder generates user-friendly output files in HTML format that help users identify DVGs based on their associated probability of being true positives.
Subject: Biology and Life Sciences  -   Virology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated