Having large sets of predictors from multiple sources concerning the same observation units and the same criterion is becoming increasingly common in chemometrics. When analyzing such data, chemometricians often have multiple objectives: prediction of the criterion, variable selection, and identification of underlying processes associated with individual predictor sources or with several sources jointly. For a single block of predictor variables, existing methods can uncover the predictive mechanisms and the relevant variables therein, but uncovering joint and distinctive predictive mechanisms, and the relevant variables therein, in the multisource setting remains an open challenge. To this end, we present a multiblock extension of principal covariates regression that aims to find the complex mechanisms, each involving a single source or several sources, that together predict an outcome of interest. We call this method sparse common and distinctive covariates regression (SCD‐CovR). Through a simulation study, we demonstrate that SCD‐CovR provides competitive solutions when compared with related methods. The method is also illustrated via an application to a publicly available dataset.
- common and distinctive processes
- data integration
- multiblock data
- principal covariates regression
- variable selection