Logistic regression with sparse common and distinctive covariates

S. Park, E. Ceulemans, K. Van Deun

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Having large sets of predictor variables from multiple sources concerning the same individuals is becoming increasingly common in behavioral research. On top of the variable selection problem, predicting a categorical outcome from such data raises the additional challenge of identifying the processes at play underneath the predictors. These processes are of particular interest in the setting of multi-source data because they can be associated either individually with a single data source or jointly with multiple sources. Although many methods have addressed the classification problem in high-dimensional settings, the additional challenge of distinguishing such underlying predictor processes in multi-source data has not received sufficient attention. To this end, we propose the method of Sparse Common and Distinctive Covariates Logistic Regression (SCD-Cov-logR). The method is a multi-source extension of principal covariates regression, combined with the generalized linear modeling framework to allow classification of a categorical outcome. In a simulation study, SCD-Cov-logR outperformed related methods commonly used in the behavioral sciences. We also demonstrate the practical usage of the method on an empirical dataset.
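
The abstract does not spell out the model, so the following is only an illustrative sketch of the general idea, not the authors' estimation procedure. It assumes that components are formed as weighted sums of the concatenated predictor blocks, that a component is "common" when its weights are nonzero in more than one block and "distinctive" when they are nonzero in a single block, and that the outcome probability is a logistic function of the component scores. The names `W`, `beta`, `classify_components`, and `predict_proba` are hypothetical, and the weights are hand-picked rather than estimated.

```python
import numpy as np


def classify_components(W, block_sizes, tol=1e-12):
    """Label each column of W from its zero pattern across blocks (assumed convention)."""
    bounds = np.cumsum([0] + list(block_sizes))
    labels = []
    for r in range(W.shape[1]):
        # A block is "active" for component r if any of its weights is nonzero.
        active = [
            bool(np.any(np.abs(W[bounds[k]:bounds[k + 1], r]) > tol))
            for k in range(len(block_sizes))
        ]
        n_active = sum(active)
        if n_active == 0:
            labels.append("empty")
        elif n_active == 1:
            labels.append("distinctive")
        else:
            labels.append("common")
    return labels


def predict_proba(X, W, beta, intercept=0.0):
    """Logistic link applied to component scores T = X @ W (assumed model form)."""
    T = X @ W
    return 1.0 / (1.0 + np.exp(-(intercept + T @ beta)))


# Toy data: two sources measured on the same 6 individuals (3 and 2 variables).
rng = np.random.default_rng(0)
X1 = rng.normal(size=(6, 3))
X2 = rng.normal(size=(6, 2))
X = np.hstack([X1, X2])

# Hand-picked sparse weights (hypothetical, not estimated):
# component 0 uses both blocks, component 1 uses block 1 only.
W = np.array([
    [0.5, 0.0],
    [0.0, 0.0],
    [0.3, 0.7],
    [0.4, 0.0],
    [0.0, 0.0],
])
beta = np.array([1.0, -0.5])  # hypothetical regression coefficients

labels = classify_components(W, block_sizes=[3, 2])
p = predict_proba(X, W, beta)
```

Here `labels` marks the first component as common and the second as distinctive, and `p` holds one predicted class probability per individual. In an actual analysis the weights and coefficients would be estimated from the data under sparsity penalties rather than fixed by hand.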
Original language: English
Pages (from-to): 4143-4174
Journal: Behavior Research Methods
Volume: 55
Issue number: 8
Publication status: Published - 2023

Keywords

  • Classification
  • Common and distinctive processes
  • Data integration
  • Logistic regression
  • Multiblock data
  • Principal covariates regression
  • Computer Simulation
  • Humans
  • Linear Models
  • Logistic Models
