Model selection in principal covariates regression

M. Vervloet, K. Van Deun, W. van den Noortgate, E. Ceulemans

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Dimension-reduction-based regression methods reduce the predictors to a few components and predict the criterion using these components. When applying such methods, it is often important not only to achieve good prediction of the criterion but also to gain correct information about the underlying structure of the predictors (i.e., recovery of the underlying components). In contrast to PLS and PCR, PCovR explicitly aims at achieving both goals simultaneously. Moreover, the extent to which both aspects play a role in the construction of the components can be determined flexibly with a weighting parameter. A downside is that a dual model selection strategy is needed: selection of the number of components and selection of the weighting parameter value. Therefore, four model selection strategies are examined, and the optimality of the extracted components is studied in comparison to those resulting from PCR and PLS analyses. Based on the results of two simulation studies, we conclude that when a researcher's questions match the optimality criteria specified in this paper, PCovR is advisable over PCR or PLS. Moreover, we recommend using a weighting parameter that puts strong emphasis on the reconstruction of the predictor scores, and combining the results of a scree test and a cross-validation procedure when deciding on the number of components.
Keywords: Principal covariates regression, Principal component regression,
Partial least squares, Regression, Dimension reduction, Model selection
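The abstract does not spell out how the weighting parameter enters the computation. The following is a minimal sketch, assuming a commonly used closed-form formulation of PCovR in which the component scores are the leading eigenvectors of a weighted sum of two normalized matrices, one for the predictors and one for the part of the criterion that the predictors can explain. The function name `pcovr`, the argument name `alpha`, and the use of a single criterion variable are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pcovr(X, y, n_components, alpha):
    """Illustrative PCovR sketch for column-centered X (n x p) and centered y (n,).

    alpha near 1 emphasizes reconstruction of X; alpha near 0 emphasizes
    prediction of y. Assumed formulation, not the authors' code.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    X_pinv = np.linalg.pinv(X)
    # Part of y that lies in the column space of X.
    y_hat = X @ (X_pinv @ y)
    # Weighted sum of the normalized predictor and criterion matrices.
    G = (alpha * (X @ X.T) / np.sum(X ** 2)
         + (1.0 - alpha) * (y_hat @ y_hat.T) / np.sum(y ** 2))
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(G)
    T = eigvecs[:, ::-1][:, :n_components]  # orthonormal component scores
    W = X_pinv @ T                          # component weights, so that T = X W
    P_x = T.T @ X                           # loadings that reconstruct X
    p_y = T.T @ y                           # regression weights for y
    return T, W, P_x, p_y
```

Under this assumed formulation, `alpha = 1` yields components spanning the same subspace as the leading principal components of `X`. With `alpha = 0` and a single criterion the matrix has rank one, so only one component is meaningful.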
Original language: English
Pages (from-to): 26-33
Journal: Chemometrics & Intelligent Laboratory Systems
Volume: 151
DOI: 10.1016/j.chemolab.2015.12.004
Publication status: Published - 2016

Cite this

Vervloet, M. ; Van Deun, K. ; van den Noortgate, W. ; Ceulemans, E. / Model selection in principal covariates regression. In: Chemometrics & Intelligent Laboratory Systems. 2016 ; Vol. 151. pp. 26-33.
@article{462cadc994174c559f297d715aeab149,
title = "Model selection in principal covariates regression",
abstract = "Dimension-reduction based regression methods reduce the predictors to a few components and predict the criterion using these components. When applying such methods, it is often not only important to achieve good prediction of the criterion, but also desirable to gain correct information about the underlying structure of the predictors (i.e., recovery of the underlying components). In contrast to PLS and PCR, PCovR explicitly aims at achieving both goals simultaneously. Moreover, the extent to which both aspects play a role in the construction of the components can be determined flexibly with a weighting parameter. This has as a downside that a dual model selection strategy is needed: selection of the number of components and selection of the weighting parameter value. Therefore, four model selection strategies are examined, and the optimality of the extracted components is studied in comparison to those resulting from PCR and PLS analyses. Based on the results of two simulation studies, we conclude that when the questions of a researcher match the optimality criteria specified in this paper, it is advised to use PCovR rather than PCR or PLS. Moreover, we recommend to use a weighting parameter that puts a lot of emphasis on the reconstruction of the predictor scores as well as to combine the results of a scree test and a cross-validation procedure when deciding on the number of components.",
keywords = "Principal covariates regression, Principal component regression, Partial least squares, Regression, Dimension reduction, Model selection",
author = "M. Vervloet and {Van Deun}, K. and {van den Noortgate}, W. and E. Ceulemans",
year = "2016",
doi = "10.1016/j.chemolab.2015.12.004",
language = "English",
volume = "151",
pages = "26--33",
journal = "Chemometrics \& Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier Science BV",

}

Model selection in principal covariates regression. / Vervloet, M.; Van Deun, K.; van den Noortgate, W.; Ceulemans, E.

In: Chemometrics & Intelligent Laboratory Systems, Vol. 151, 2016, p. 26-33.

TY - JOUR

T1 - Model selection in principal covariates regression

AU - Vervloet, M.

AU - Van Deun, K.

AU - van den Noortgate, W.

AU - Ceulemans, E.

PY - 2016

Y1 - 2016

N2 - Dimension-reduction based regression methods reduce the predictors to a few components and predict the criterion using these components. When applying such methods, it is often not only important to achieve good prediction of the criterion, but also desirable to gain correct information about the underlying structure of the predictors (i.e., recovery of the underlying components). In contrast to PLS and PCR, PCovR explicitly aims at achieving both goals simultaneously. Moreover, the extent to which both aspects play a role in the construction of the components can be determined flexibly with a weighting parameter. This has as a downside that a dual model selection strategy is needed: selection of the number of components and selection of the weighting parameter value. Therefore, four model selection strategies are examined, and the optimality of the extracted components is studied in comparison to those resulting from PCR and PLS analyses. Based on the results of two simulation studies, we conclude that when the questions of a researcher match the optimality criteria specified in this paper, it is advised to use PCovR rather than PCR or PLS. Moreover, we recommend to use a weighting parameter that puts a lot of emphasis on the reconstruction of the predictor scores as well as to combine the results of a scree test and a cross-validation procedure when deciding on the number of components.

KW - Principal covariates regression

KW - Principal component regression

KW - Partial least squares

KW - Regression

KW - Dimension reduction

KW - Model selection

U2 - 10.1016/j.chemolab.2015.12.004

DO - 10.1016/j.chemolab.2015.12.004

M3 - Article

VL - 151

SP - 26

EP - 33

JO - Chemometrics & Intelligent Laboratory Systems

JF - Chemometrics & Intelligent Laboratory Systems

SN - 0169-7439

ER -