Abstract
Dimension-reduction-based regression methods reduce the predictors to a few components and predict the criterion using these components. When applying such methods, it is often important not only to predict the criterion well, but also to gain correct information about the underlying structure of the predictors (i.e., to recover the underlying components). In contrast to partial least squares (PLS) and principal component regression (PCR), principal covariates regression (PCovR) explicitly aims at achieving both goals simultaneously. Moreover, the extent to which each aspect plays a role in the construction of the components can be set flexibly with a weighting parameter. The downside is that a dual model selection strategy is needed: selection of the number of components and selection of the weighting parameter value. Therefore, four model selection strategies are examined, and the optimality of the extracted components is studied in comparison to those resulting from PCR and PLS analyses. Based on the results of two simulation studies, we conclude that when a researcher's questions match the optimality criteria specified in this paper, PCovR is advisable over PCR or PLS. Moreover, we recommend using a weighting parameter that places strong emphasis on the reconstruction of the predictor scores, and combining the results of a scree test and a cross-validation procedure when deciding on the number of components.
Keywords: Principal covariates regression, Principal component regression,
Partial least squares, Regression, Dimension reduction, Model selection
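
To make the role of the weighting parameter concrete, the sketch below follows one common formulation of PCovR from the literature, in which the component scores T minimize α‖X − T P_X‖²/‖X‖² + (1 − α)‖y − T p_y‖²/‖y‖². It is an illustration under that assumption, not the authors' implementation; the function name `pcovr_scores` and its arguments are hypothetical, and preprocessing is limited to centering.

```python
import numpy as np

def pcovr_scores(X, y, alpha, n_components):
    """Sketch: orthonormal component scores T (n x R) for a weighting alpha in [0, 1].

    Assumes the eigendecomposition form of the PCovR solution; only centering
    is applied, and the name and signature are illustrative.
    """
    X = X - X.mean(axis=0)
    y = (y - y.mean()).reshape(-1, 1)
    # Project y onto the column space of X (ordinary least-squares fitted values).
    y_hat = X @ np.linalg.pinv(X) @ y
    # Weighted sum of the predictor part and the criterion part.
    G = (alpha * (X @ X.T) / np.sum(X ** 2)
         + (1 - alpha) * (y_hat @ y_hat.T) / np.sum(y ** 2))
    # G is symmetric, so eigh applies; keep the leading eigenvectors as scores.
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=20)
T = pcovr_scores(X, y, alpha=0.9, n_components=2)
```

Under this formulation, `alpha=1` reproduces the principal component scores of X (the PCR components), while `alpha=0` yields a single component proportional to the least-squares fitted values, so values close to 1 correspond to a strong emphasis on reconstructing the predictor scores.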
Original language | English |
---|---|
Pages (from-to) | 26-33 |
Journal | Chemometrics & Intelligent Laboratory Systems |
Volume | 151 |
DOIs | |
Publication status | Published - 2016 |