Benefits of dimension reduction in penalized regression methods for high dimensional grouped data: a case study in low sample size


Motivation: In some prediction analyses, predictors have a natural grouping structure, and selecting predictors while accounting for this additional information could improve prediction accuracy. Moreover, in a high-dimension, low-sample-size (HDLSS) setting, obtaining a good predictive model is very challenging. The objective of this work was to investigate the benefits of dimension reduction in penalized regression methods, in terms of prediction performance and variable selection consistency, on HDLSS data. Using a real dataset, we compared the performance of the lasso, elastic net, group lasso (gLasso), sparse group lasso (sgLasso), sparse partial least squares (sPLS), group partial least squares (gPLS), and sparse group partial least squares (sgPLS).

Results: Incorporating dimension reduction into penalized regression methods improved prediction accuracy. The sgPLS achieved the lowest prediction error while consistently selecting a small number of predictors from a single group.

Bioinformatics, in press
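As a rough illustration of the kind of comparison described above, the sketch below fits the lasso and the elastic net to synthetic HDLSS data (n much smaller than p) with scikit-learn and reports how many predictors each method selects. This is not the paper's analysis: the dataset, penalty strengths, and active-predictor pattern are invented for illustration, and the group-structured methods compared in the paper (gLasso, sgLasso, sPLS, gPLS, sgPLS) require dedicated packages and are not shown.

```python
# Illustrative sketch only: lasso vs. elastic net on synthetic
# HDLSS data (n = 30 samples, p = 200 predictors). All parameter
# values here are assumptions, not the paper's settings.
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
n, p = 30, 200                      # low sample size, high dimension
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0                      # only the first few predictors are active
y = X @ beta + rng.standard_normal(n)

for model in (Lasso(alpha=0.5), ElasticNet(alpha=0.5, l1_ratio=0.5)):
    model.fit(X, y)
    n_selected = int(np.count_nonzero(model.coef_))
    print(type(model).__name__, "selected", n_selected, "predictors")
```

Both penalties shrink most coefficients exactly to zero, which is what makes sparse variable selection possible when p >> n; the group-aware variants additionally encourage the selected predictors to come from the same group.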