TY - JOUR

T1 - On the selection of the weighting parameter value in Principal Covariates Regression

AU - Vervloet, Marlies

AU - Van Deun, K.

AU - Van Den Noortgate, Wim

AU - Ceulemans, Eva

PY - 2013

Y1 - 2013

N2 - Ordinary linear regression falls short when many predictors are available, especially when some of these are highly correlated with (a linear combination of) other predictors. One possible solution to this problem is Principal Covariates Regression (PCovR), which combines the main ideas behind Principal Component Analysis (PCA) and regression. Like PCA, PCovR reduces the predictors to a few components and, like regression, it predicts the criterion, but using the components as predictors. The reduction of the predictors and the prediction of the criterion are conducted simultaneously, by minimizing the weighted sum of the reduction error and the prediction error. How the value of the weighting parameter α can be optimally tuned is, however, not obvious. In this paper we integrate scattered findings on this topic and derive hypotheses on which α value is optimal in which respect and on the importance of tuning the value (how robust are the obtained results to the value chosen for α?). We put these hypotheses to the test in an extensive simulation study. As predicted, the α value that optimizes recovery of the underlying parameters and true criterion scores depends, among other things, on the number of predictors in a specific dataset and on the ratio of the amount of error on the predictors to the amount of error on the criterion. Moreover, we show that α mainly has an influence when the components strongly differ in strength and relevance, when the number of observations almost equals the number of predictor variables, or when the criterion contains a moderate to high amount of error.

AB - Ordinary linear regression falls short when many predictors are available, especially when some of these are highly correlated with (a linear combination of) other predictors. One possible solution to this problem is Principal Covariates Regression (PCovR), which combines the main ideas behind Principal Component Analysis (PCA) and regression. Like PCA, PCovR reduces the predictors to a few components and, like regression, it predicts the criterion, but using the components as predictors. The reduction of the predictors and the prediction of the criterion are conducted simultaneously, by minimizing the weighted sum of the reduction error and the prediction error. How the value of the weighting parameter α can be optimally tuned is, however, not obvious. In this paper we integrate scattered findings on this topic and derive hypotheses on which α value is optimal in which respect and on the importance of tuning the value (how robust are the obtained results to the value chosen for α?). We put these hypotheses to the test in an extensive simulation study. As predicted, the α value that optimizes recovery of the underlying parameters and true criterion scores depends, among other things, on the number of predictors in a specific dataset and on the ratio of the amount of error on the predictors to the amount of error on the criterion. Moreover, we show that α mainly has an influence when the components strongly differ in strength and relevance, when the number of observations almost equals the number of predictor variables, or when the criterion contains a moderate to high amount of error.

U2 - 10.1016/j.chemolab.2013.02.005

DO - 10.1016/j.chemolab.2013.02.005

M3 - Article

SN - 0169-7439

VL - 123

SP - 36

EP - 43

JO - Chemometrics & Intelligent Laboratory Systems

JF - Chemometrics & Intelligent Laboratory Systems

ER -