Preprint
Article

Aligning the Aligners: Comparison of RNA Sequencing Data Alignment and Gene Expression Quantification Tools for Clinical Breast Cancer Research

This version is not peer-reviewed.

Submitted: 01 March 2019

Posted: 04 March 2019


A peer-reviewed article of this preprint also exists.

Abstract
The rapid expansion of transcriptomics, driven by the increasing affordability of next-generation sequencing (NGS) technologies, is generating rapidly growing volumes of gene expression data across biology and medicine, notably in cancer research. Concomitantly, many bioinformatics tools have been developed to streamline gene expression analysis and quantification. We tested the concordance of RNA sequencing (RNA-seq) analysis outcomes between the two predominant programs for read alignment, HISAT2 and STAR, and between the two most popular programs for quantifying gene expression in NGS experiments, edgeR and DESeq2. We used RNA-seq data from a series of breast cancer progression specimens, comprising histologically confirmed normal, early neoplasia, ductal carcinoma in situ, and infiltrating ductal carcinoma samples microdissected from formalin-fixed, paraffin-embedded (FFPE) breast tissue blocks. We identified significant differences in aligner performance: HISAT2 was prone to misaligning reads to retrogene genomic loci, whereas STAR generated more precise alignments, especially for early neoplasia samples. edgeR and DESeq2 produced similar lists of differentially expressed genes in stage comparisons, with edgeR producing more conservative, and thus shorter, lists of genes. Nevertheless, Gene Ontology (GO) enrichment analysis revealed no skew in the significant GO categories identified among the genes called differentially expressed by edgeR versus DESeq2. As transcriptome analysis of archived FFPE samples moves to the forefront of precision medicine, identifying and fine-tuning bioinformatics tools becomes critical for clinical research. Our results indicate that STAR and edgeR are well-suited tools for differential gene expression analysis from FFPE samples.
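As an illustration of what "similar lists of differentially expressed genes" can mean in practice, the following minimal Python sketch compares two gene lists by their overlap and Jaccard index. It is not the authors' method: the function name `concordance`, the placeholder gene identifiers, and the choice of the Jaccard index as a concordance measure are all assumptions made for illustration only.

```python
def concordance(genes_a, genes_b):
    """Summarize the overlap between two lists of differentially expressed genes.

    Hypothetical helper: the Jaccard index is one possible concordance
    measure, assumed here for illustration.
    """
    a, b = set(genes_a), set(genes_b)
    shared = a & b
    union = a | b
    # Two empty lists are treated as fully concordant.
    jaccard = len(shared) / len(union) if union else 1.0
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "jaccard": jaccard,
    }


# Placeholder identifiers, not results from the study.
edger_hits = ["GENE_A", "GENE_B", "GENE_C"]
deseq2_hits = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]

result = concordance(edger_hits, deseq2_hits)
print(result["jaccard"])
print(result["only_b"])
```

In this toy input the shorter list is a subset of the longer one, which mirrors the situation where a more conservative tool reports fewer genes than a more permissive one.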
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
