Version 1: Received: 1 June 2020 / Approved: 2 June 2020 / Online: 2 June 2020 (09:24:25 CEST)
How to cite:
Hejblum, B. P.; Kunzmann, K.; Lavagnini, E.; Hutchinson, A.; Robertson, D. S.; Jones, S. C.; Eckes-Shephard, A. H. Realistic and Robust Reproducible Research for Biostatistics. Preprints 2020, 2020060002. https://doi.org/10.20944/preprints202006.0002.v1
APA Style
Hejblum, B. P., Kunzmann, K., Lavagnini, E., Hutchinson, A., Robertson, D. S., Jones, S. C., & Eckes-Shephard, A. H. (2020). Realistic and Robust Reproducible Research for Biostatistics. Preprints. https://doi.org/10.20944/preprints202006.0002.v1
Chicago/Turabian Style
Hejblum, B. P., Sacha C. Jones and Annemarie H. Eckes-Shephard. 2020. "Realistic and Robust Reproducible Research for Biostatistics." Preprints. https://doi.org/10.20944/preprints202006.0002.v1
Abstract
The complexity of analysis pipelines in biomedical sciences poses a severe challenge for the transparency and reproducibility of results. Researchers are increasingly incorporating software development technologies and methods into their analyses, but this is a quickly evolving landscape and teams may lack the capabilities to set up their own complex IT infrastructure to aid reproducibility. Basing a reproducible research strategy on readily available solutions with zero or low set-up costs whilst maintaining technological flexibility to incorporate domain-specific software tools is therefore of key importance. We outline a practical approach for robust reproducibility of analysis results. In our examples, we rely exclusively on established open-source tools and free services. Special emphasis is put on the integration of these tools with best practices from software development and free online services for the biostatistics domain.
Keywords
Biostatistics; Data management; Reproducibility; Workflow automation
Subject
Computer Science and Mathematics, Software
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received:
25 June 2020
Commenter:
Jeremy Leipzig
The commenter has declared there is no conflict of interests.
Comment:
I thought this was a decent article that covers most of the basic components of reproducibility. Great sections on RDM. It plays a little fast and loose with the workflow stuff and presents some niche solutions as though they were ubiquitous (CyVerse and Gigantum are great, but neither dominates the scene). Definitely a review worth reading.