TY - JOUR
T1 - On the selection of the weighting parameter value in Principal Covariates Regression
AU - Vervloet, Marlies
AU - Van Deun, K.
AU - Van Den Noortgate, Wim
AU - Ceulemans, Eva
PY - 2013
Y1 - 2013
N2 - Ordinary linear regression falls short when many predictors are available, especially when some of them are highly correlated with (a linear combination of) other predictors. One possible solution to this problem is Principal Covariates Regression (PCovR), which combines the main ideas behind Principal Component Analysis (PCA) and regression. Like PCA, PCovR reduces the predictors to a few components and, like regression, it predicts the criterion, but using the components as predictors. The reduction of the predictors and the prediction of the criterion are carried out simultaneously by minimizing a weighted sum of the reduction error and the prediction error. How the value of the weighting parameter α should be tuned is not obvious, however. In this paper we integrate scattered findings on this topic and derive hypotheses on which α value is optimal in which respect, and on how important tuning is (that is, how robust the obtained results are to the value chosen for α). We test these hypotheses in an extensive simulation study. As predicted, the α value that optimizes recovery of the underlying parameters and of the true criterion scores depends on, among other things, the number of predictors in a specific dataset and the ratio of the amount of error in the predictors to the amount of error in the criterion. Moreover, we show that the choice of α matters most when the components differ strongly in strength and relevance, when the number of observations almost equals the number of predictor variables, or when the criterion contains a moderate to high amount of error.
AB - Ordinary linear regression falls short when many predictors are available, especially when some of them are highly correlated with (a linear combination of) other predictors. One possible solution to this problem is Principal Covariates Regression (PCovR), which combines the main ideas behind Principal Component Analysis (PCA) and regression. Like PCA, PCovR reduces the predictors to a few components and, like regression, it predicts the criterion, but using the components as predictors. The reduction of the predictors and the prediction of the criterion are carried out simultaneously by minimizing a weighted sum of the reduction error and the prediction error. How the value of the weighting parameter α should be tuned is not obvious, however. In this paper we integrate scattered findings on this topic and derive hypotheses on which α value is optimal in which respect, and on how important tuning is (that is, how robust the obtained results are to the value chosen for α). We test these hypotheses in an extensive simulation study. As predicted, the α value that optimizes recovery of the underlying parameters and of the true criterion scores depends on, among other things, the number of predictors in a specific dataset and the ratio of the amount of error in the predictors to the amount of error in the criterion. Moreover, we show that the choice of α matters most when the components differ strongly in strength and relevance, when the number of observations almost equals the number of predictor variables, or when the criterion contains a moderate to high amount of error.
U2 - 10.1016/j.chemolab.2013.02.005
DO - 10.1016/j.chemolab.2013.02.005
M3 - Article
SN - 0169-7439
VL - 123
SP - 36
EP - 43
JO - Chemometrics & Intelligent Laboratory Systems
JF - Chemometrics & Intelligent Laboratory Systems
ER -