A century ago, Sir R.A. Fisher first introduced the concept of variance in biological studies. In this paper, we present several new, modified, or integrated perspectives on variance that we believe can contribute to future thinking and practice in data science. We focus on brain and behavioral data, in the hope that the discussion can be extrapolated to other fields and types of data. Specifically: (1) We define different types of variation. (2) We demonstrate that both classic regression models and advanced statistical methods can be viewed as variance-decomposition methods. (3) We distinguish between innate and acquired variability, which are linked through Bayesian updating. (4) We review and illustrate how to extract information from high-dimensional data and how to visualize it; additionally, we introduce the Neural Law of Large Numbers. (5) We discuss the statistical basis for association, explanation, prediction, and causation, and recommend a strategy that may be useful for checking whether association-based findings can be elevated to causal discoveries. Taken together, understanding the variation in data requires creative statistical thinking. In turn, by incorporating insights learned from data, one can begin to design better statistical apparatuses.
Subject: Biology and Life Sciences - Neuroscience and Neurology
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.