Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Assessing RNA-Seq Workflow Methodologies Using Shannon Entropy
Received: 16 May 2024 / Approved: 17 May 2024 / Online: 17 May 2024 (11:36:40 CEST)
A peer-reviewed article of this Preprint also exists.
Carels, N. Assessing RNA-Seq Workflow Methodologies Using Shannon Entropy. Biology 2024, 13, 482.
Abstract
RNA-seq faces persistent challenges because of the ever-expanding array of data processing workflows, none of which has yet been standardized. It is imperative to determine which method most effectively preserves biological facts. Shannon entropy serves as a tool for depicting the biological status of a system. Thus, we assessed the measurement of Shannon entropy by several RNA-seq workflow approaches, employing RPKM and median normalization on paired samples of 475 TCGA RNA-seq datasets spanning eight different cancer types with 5-year overall survival rates ranging from 30% to 98%. Our analysis revealed that RPKM normalization coupled with a threshold of log2 fold change ≥1 for identifying differentially expressed genes yielded the best results, with a correlation coefficient of 0.91. We propose that Shannon entropy can serve as an objective metric for optimizing RNA-seq workflows.
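The three quantities named in the abstract (RPKM, log2 fold change, and Shannon entropy) can be illustrated with a minimal sketch using their standard textbook definitions. This is not the author's pipeline. The function names, the pseudocount, and the choice of which discrete values the entropy is taken over (for example, node degrees in a PPI subnetwork) are assumptions made only for illustration.

```python
import math
from collections import Counter


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads (standard definition)."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(tumor, control, pseudocount=1.0):
    """log2 ratio of tumor to control expression.

    The pseudocount is an assumption of this sketch; it avoids division by zero.
    """
    return math.log2((tumor + pseudocount) / (control + pseudocount))


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of discrete values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy data: (tumor, control) normalized expression per gene.
paired_expression = {"geneA": (30.0, 7.0), "geneB": (5.0, 5.0), "geneC": (3.0, 1.0)}

# Keep genes that meet the log2 fold change >= 1 threshold mentioned in the abstract.
selected = [
    gene
    for gene, (tumor, control) in paired_expression.items()
    if log2_fold_change(tumor, control) >= 1.0
]

# Hypothetical discrete values (e.g. network degrees) over which entropy is computed.
toy_degrees = [1, 1, 2, 2]
entropy_bits = shannon_entropy(toy_degrees)
```

With two equally frequent values, the entropy of the toy list is 1 bit; a list with a single repeated value would give 0 bits.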
Keywords
RPKM; median normalization; benchmarking; entropy; PPI network; cancer; 5-year OS.
Subject
Biology and Life Sciences, Life Sciences
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.