On the selection of the weighting parameter value in Principal Covariates Regression

Marlies Vervloet, K. Van Deun, Wim Van Den Noortgate, Eva Ceulemans

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Ordinary linear regression falls short when many predictors are available, especially when some of them are highly correlated with (a linear combination of) other predictors. One possible solution to this problem is Principal Covariates Regression (PCovR), which combines the main ideas behind Principal Component Analysis (PCA) and regression. Like PCA, PCovR reduces the predictors to a few components and, like regression, it predicts the criterion, but using the components as predictors. The reduction of the predictors and the prediction of the criterion are carried out simultaneously by minimizing a weighted sum of the reduction error and the prediction error. How the value of the weighting parameter α should be tuned is not obvious, however. In this paper, we integrate scattered findings on this topic and derive hypotheses about which α value is optimal in which respect and about the importance of tuning it (how robust are the results to the value chosen for α?). We test these hypotheses in an extensive simulation study. As predicted, the α value that optimizes recovery of the underlying parameters and of the true criterion scores depends, among other things, on the number of predictors in a specific dataset and on the ratio of the amount of error in the predictors to the amount of error in the criterion. Moreover, we show that α has the most influence when the components differ strongly in strength and relevance, when the number of observations almost equals the number of predictor variables, or when the criterion contains a moderate to high amount of error.
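
The weighted criterion described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' code. It assumes one common formulation of PCovR: both error terms are scaled by the total sum of squares of X and y, the components are orthonormal, and the solution is obtained from an eigendecomposition. The function names `pcovr` and `weighted_loss` and all variable names are hypothetical.

```python
import numpy as np

def pcovr(X, y, alpha, n_components):
    """Sketch of PCovR for a single criterion (assumed formulation).

    Minimizes alpha * ||X - T P_x||^2 / ||X||^2
            + (1 - alpha) * ||y - T p_y||^2 / ||y||^2
    over component scores T = X W with T'T = I.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    X_pinv = np.linalg.pinv(X)           # pseudo-inverse copes with collinear predictors
    y_hat = X @ (X_pinv @ y)             # projection of y onto the column space of X
    # Weighted combination of the predictor part and the criterion part
    G = alpha * (X @ X.T) / np.sum(X**2) \
        + (1.0 - alpha) * (y_hat @ y_hat.T) / np.sum(y**2)
    eigvals, eigvecs = np.linalg.eigh(G)  # G is symmetric
    order = np.argsort(eigvals)[::-1][:n_components]
    T = eigvecs[:, order]                # orthonormal component scores
    W = X_pinv @ T                       # component weights, so that X @ W equals T
    P_x = T.T @ X                        # loadings that reconstruct X
    p_y = T.T @ y                        # regression weights that predict y
    return T, W, P_x, p_y

def weighted_loss(X, y, T, P_x, p_y, alpha):
    """Weighted sum of the reduction error and the prediction error."""
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    reduction = np.sum((X - T @ P_x) ** 2) / np.sum(X**2)
    prediction = np.sum((y - T @ p_y) ** 2) / np.sum(y**2)
    return alpha * reduction + (1.0 - alpha) * prediction

# Hypothetical usage on simulated, centered data
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
X_demo -= X_demo.mean(axis=0)
y_demo = X_demo @ rng.normal(size=5) + rng.normal(size=30)
y_demo -= y_demo.mean()
for a in (0.1, 0.5, 0.9):
    T_d, W_d, Px_d, py_d = pcovr(X_demo, y_demo, alpha=a, n_components=2)
    print(a, weighted_loss(X_demo, y_demo, T_d, Px_d, py_d, a))
```

Under these assumptions, the two extremes behave as expected: with α = 1 the components coincide with the principal components of X, and with α = 0 and a single component the fitted values equal those of ordinary least squares. Intermediate values trade reconstruction of the predictors against prediction of the criterion.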
Original language: English
Pages (from-to): 36-43
Journal: Chemometrics & Intelligent Laboratory Systems
Volume: 123
DOI: 10.1016/j.chemolab.2013.02.005
Publication status: Published - 2013
Externally published: Yes

Fingerprint

Principal component analysis
Linear regression
Tuning
Recovery

Cite this

Vervloet, Marlies ; Van Deun, K. ; Van Den Noortgate, Wim ; Ceulemans, Eva. / On the selection of the weighting parameter value in Principal Covariates Regression. In: Chemometrics & Intelligent Laboratory Systems. 2013 ; Vol. 123. pp. 36-43.
@article{91871608078647d9be7c1a3a89f7fa86,
title = "On the selection of the weighting parameter value in Principal Covariates Regression",
author = "Marlies Vervloet and {Van Deun}, K. and {Van Den Noortgate}, Wim and Eva Ceulemans",
year = "2013",
doi = "10.1016/j.chemolab.2013.02.005",
language = "English",
volume = "123",
pages = "36--43",
journal = "Chemometrics \& Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier Science BV",
}
