Preprint
Review

The Reuse of Public Datasets in the Life Sciences: Potential Risks and Rewards

This version is not peer-reviewed

Submitted: 10 February 2020

Posted: 11 February 2020

Abstract
The 'big data revolution' has enabled novel types of analyses in the life sciences, facilitated by public sharing and reuse of datasets. Here, we review the prodigious potential of reusing publicly available datasets and the challenges, limitations and risks associated with it. Due to the prominence, abundance and wide distribution of sequencing results, we focus on the reuse of publicly available sequence datasets. Through selected examples of successful reuse of different data types (genome, transcriptome, proteome, metabolome, phenotype and ecosystem), with their respective limitations and risks, we illustrate the enormous potential of the practice. A checklist to determine the reuse value and potential of a particular dataset is also provided.
Subject: Biology and Life Sciences - Biochemistry and Molecular Biology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
