Article
Preserved in Portico. This version is not peer-reviewed.
LASSO and Elastic Net Tend to Over-Select Features
Version 1: Received: 1 August 2023 / Approved: 2 August 2023 / Online: 3 August 2023 (14:22:22 CEST)
A peer-reviewed article of this Preprint also exists.
Liu, L.; Gao, J.; Beasley, G.; Jung, S.-H. LASSO and Elastic Net Tend to Over-Select Features. Mathematics 2023, 11, 3738.
Abstract
Machine learning methods have become a standard approach to selecting features associated with an outcome and building a prediction model when the number of candidate features is large. LASSO has been one of the most popular approaches to this end. The LASSO approach selects features with large regression estimates, rather than by the statistical significance of their association with the outcome, by imposing an L1-norm penalty to overcome the high dimensionality of the candidate features. As a result, LASSO may select insignificant features while possibly missing significant ones. Furthermore, in our experience, LASSO tends to select too many features. By selecting features that are not associated with the outcome, we may incur additional cost to collect and manage them when the fitted prediction model is used in the future. Using a combination of L1- and L2-norm penalties, elastic net (EN) tends to select even more features than LASSO. The over-selected features that are not associated with the outcome act like white noise, so the fitted prediction model loses prediction accuracy. In this paper, we propose using standard regression methods (without any penalization) with a stepwise variable selection procedure to overcome these issues. Unlike LASSO and EN, this method selects features based on statistical significance. Through extensive simulations, we show that this maximum likelihood estimation based method selects a very small number of features while maintaining high prediction power, whereas LASSO and EN make a large number of false selections, resulting in a loss of prediction accuracy. Unlike LASSO and EN, regression methods combined with a stepwise variable selection procedure are standard statistical methods, so any biostatistician can use them to analyze high-dimensional data even without advanced bioinformatics knowledge.
Keywords
Logistic regression; Machine learning; Prediction model; ROC curve; Variable selection
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.