In the current era of ‘big data’, scientists are able to quickly amass enormous amounts of data from a limited number of experiments. Investigators then try to hypothesize about the root cause based on the trends observed for the predictors and the response variable. This involves identifying the discriminatory predictors that are most responsible for explaining variation in the response variable. In the current work, we investigated three related multivariate techniques: Principal Component Regression (PCR), Partial Least Squares or Projections to Latent Structures (PLS), and Orthogonal Partial Least Squares (OPLS). To perform a comparative analysis, we used a publicly available dataset for Parkinson’s disease patients. We first performed the analysis using a cross-validated number of principal components for the aforementioned techniques. Our results demonstrated that PLS and OPLS were better suited than PCR for identifying the discriminatory predictors. Since the X data did not exhibit strong correlation among the predictors, we also performed Multiple Linear Regression (MLR) on the dataset. A comparison of the top five discriminatory predictors identified by the four techniques showed a substantial overlap between the results obtained by PLS, OPLS, and MLR, and these three techniques diverged significantly from the variables identified by PCR. A further investigation of the data revealed that PCR could identify the discriminatory variables successfully if the number of principal components in the regression model was increased. In summary, we recommend using PLS or OPLS for hypothesis generation and systematizing the selection process for principal components when using PCR.
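The dependence of PCR on the number of retained components can be illustrated with a minimal NumPy sketch. It is not the analysis performed in this work: the data are synthetic, the predictors are left unscaled, and the names `pcr_coefficients` and `top_predictors` are hypothetical helpers introduced only for illustration. It relies on one standard fact: when all principal components are retained and X has full column rank, the PCR coefficients coincide with the MLR (ordinary least squares) coefficients.

```python
import numpy as np

def pcr_coefficients(X, y, n_components):
    """Regress y on the first n_components principal components of X and
    map the fit back to coefficients on the centered original predictors."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # SVD of the centered predictors: rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U_k, s_k, V_k = U[:, :n_components], s[:n_components], Vt[:n_components].T
    # The scores are U_k * s_k, so least squares on them gives (U_k^T yc) / s_k.
    gamma = (U_k.T @ yc) / s_k
    return V_k @ gamma

def top_predictors(coefs, n=5):
    """Indices of the n predictors with the largest absolute coefficients."""
    return np.argsort(-np.abs(coefs))[:n]

# Synthetic, weakly correlated predictors (an assumption for illustration only).
rng = np.random.default_rng(0)
n_samples, n_predictors = 200, 8
X = rng.normal(size=(n_samples, n_predictors))
true_beta = np.array([0.0, 3.0, 0.0, 0.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_beta + 0.1 * rng.normal(size=n_samples)

# MLR on centered data, used as the reference ranking.
beta_mlr, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)

beta_pcr_few = pcr_coefficients(X, y, 2)             # heavily truncated model
beta_pcr_all = pcr_coefficients(X, y, n_predictors)  # all components retained

print("MLR top 3:       ", top_predictors(beta_mlr, 3))
print("PCR (2 PCs) top 3:", top_predictors(beta_pcr_few, 3))
print("PCR (all PCs) top 3:", top_predictors(beta_pcr_all, 3))
```

With few components, the PCR ranking reflects only the directions of largest variance in X, which need not be the directions that explain the response; as components are added, the ranking approaches the MLR ranking.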
Keywords:
Subject: Computer Science and Mathematics - Applied Mathematics
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.