Preprint
Article

The Utility of Data Transformation for Alignment, de novo Assembly and Classification of Short Read Virus Sequences

A peer-reviewed article of this preprint also exists.

Submitted: 30 March 2019

Posted: 01 April 2019

Abstract
Advances in DNA sequencing technology are facilitating genomic analyses of unprecedented scope and scale, widening the gap between our ability to generate biological sequence data and our ability to fully exploit it. Comparable analytical challenges are encountered in other data-intensive fields involving sequential data, such as signal processing, in which dimensionality reduction (i.e., compression) methods are routinely used to lessen the computational burden of analyses. In this work we explore the application of dimensionality reduction methods to numerically represent high-throughput sequence data for three important biological applications of virus sequence data: reference-based mapping, short sequence classification and de novo assembly. Despite using highly compressed sequence transformations to accelerate these processes, our sequence processing approach yielded accuracy comparable to that of existing approaches, and is ideally suited to sequences originating from highly diverse virus populations. We demonstrate the application of our methodology to both synthetic and real viral pathogen sequence data. Our results show that highly compressed sequence approximations can provide accurate results, and that useful analytical performance can be retained and even enhanced through appropriate dimensionality reduction of sequence data.
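To make the idea of a compressed numerical representation concrete, the following is a minimal sketch only. The abstract does not state which encoding or transformation is used, so the integer nucleotide code, the choice of piecewise aggregate approximation (PAA), and the names `NUCLEOTIDE_VALUES`, `encode`, `paa` and `euclidean` are all assumptions made for illustration and should not be read as the authors' implementation.

```python
# Hypothetical nucleotide-to-number code (an assumption, not taken from the preprint).
NUCLEOTIDE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def encode(sequence):
    """Map a DNA string to a list of floats using the assumed code above."""
    return [NUCLEOTIDE_VALUES[base] for base in sequence.upper()]


def paa(signal, segments):
    """Piecewise aggregate approximation: replace the signal with the
    mean of each of `segments` nearly equal-width chunks."""
    n = len(signal)
    if not 1 <= segments <= n:
        raise ValueError("segments must be between 1 and len(signal)")
    reduced = []
    for i in range(segments):
        start = i * n // segments
        end = (i + 1) * n // segments
        chunk = signal[start:end]
        reduced.append(sum(chunk) / len(chunk))
    return reduced


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Compress a 12-base read to 4 values, then compare it with a second read
# in the reduced space instead of base by base.
read = "ACGTACGTTTGA"
other = "ACGTACGATTGA"
compressed = paa(encode(read), 4)
distance = euclidean(compressed, paa(encode(other), 4))
```

In a sketch like this, comparisons operate on 4 values per read rather than 12, which is the kind of saving that dimensionality reduction trades against approximation error.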
Subject: Biology and Life Sciences  -   Virology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated