Clusterwise Simultaneous Component Analysis for Analyzing Structural Differences in Multivariate Multiblock Data

Kim De Roover*, Eva Ceulemans, Marieke E. Timmerman, Kristof Vansteelandt, Jeroen Stouten, Patrick Onghena

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Many studies yield multivariate multiblock data, that is, multiple data blocks that all involve the same set of variables (e.g., the scores of different groups of subjects on the same set of variables). The question then arises whether the same processes underlie the different data blocks. To explore the structure of such multivariate multiblock data, component analysis can be very useful. Specifically, two approaches are often applied: principal component analysis (PCA) on each data block separately and different variants of simultaneous component analysis (SCA) on all data blocks simultaneously. The PCA approach yields a different loading matrix for each data block and is thus not useful for discovering structural similarities. The SCA approach may fail to yield insight into structural differences, since the obtained loading matrix is identical for all data blocks. We introduce a new generic modeling strategy, called clusterwise SCA, that comprises the separate PCA approach and SCA as special cases. The key idea behind clusterwise SCA is that the data blocks form a few clusters, where data blocks that belong to the same cluster are modeled with SCA and thus have the same structure, and different clusters have different underlying structures. In this article, we use the SCA variant that imposes equal average cross-products constraints (ECP). An algorithm for fitting clusterwise SCA ECP solutions is proposed and evaluated in a simulation study. Finally, the usefulness of clusterwise SCA is illustrated by empirical examples from eating disorder research and social psychology.
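
The key idea in the abstract (blocks in the same cluster share one loading matrix, and clusters differ in structure) can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes the blocks are already centered, it substitutes a plain truncated SVD of the stacked blocks for the within-cluster SCA step, and it does not enforce the ECP constraints. All function names and the toy data are hypothetical.

```python
import numpy as np


def shared_loadings(blocks, n_components):
    """Orthonormal loadings (J x Q) from an SVD of the vertically stacked blocks."""
    _, _, vt = np.linalg.svd(np.vstack(blocks), full_matrices=False)
    return vt[:n_components].T


def residual_ss(block, loadings):
    """Sum of squared residuals after projecting a block onto the loading subspace."""
    fitted = block @ loadings @ loadings.T
    return float(np.sum((block - fitted) ** 2))


def clusterwise_components(blocks, n_clusters, n_components,
                           n_starts=5, max_iter=50, seed=0):
    """Alternate between fitting per-cluster loadings and reassigning blocks."""
    rng = np.random.default_rng(seed)
    n_blocks = len(blocks)
    best_labels, best_loss = None, np.inf
    for _start in range(n_starts):
        # Balanced random start, so every cluster is non-empty initially.
        labels = rng.permutation(np.arange(n_blocks) % n_clusters)
        for _it in range(max_iter):
            # Step 1: one shared loading matrix per (non-empty) cluster.
            loadings = []
            for k in range(n_clusters):
                members = [blocks[i] for i in np.flatnonzero(labels == k)]
                loadings.append(shared_loadings(members, n_components)
                                if members else None)
            # Step 2: move each block to the cluster that fits it best.
            # A cluster that became empty gets infinite loss and stays empty
            # (a simplification made in this sketch).
            losses = np.array([
                [residual_ss(x, b) if b is not None else np.inf for b in loadings]
                for x in blocks
            ])
            new_labels = losses.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break
            labels = new_labels
        loss = float(losses.min(axis=1).sum())
        if loss < best_loss:
            best_labels, best_loss = labels.copy(), loss
    return best_labels, best_loss


# Toy data (made up for illustration): six blocks of 100 rows and 8 variables.
# The first three blocks load on variables 0-3, the last three on variables 4-7.
demo_rng = np.random.default_rng(1)
load_a = np.zeros((8, 2))
load_a[0:2, 0] = 1.0
load_a[2:4, 1] = 1.0
load_b = np.zeros((8, 2))
load_b[4:6, 0] = 1.0
load_b[6:8, 1] = 1.0
demo_blocks = [
    demo_rng.normal(size=(100, 2)) @ load.T + 0.1 * demo_rng.normal(size=(100, 8))
    for load in (load_a, load_a, load_a, load_b, load_b, load_b)
]
demo_labels, demo_loss = clusterwise_components(demo_blocks, n_clusters=2,
                                                n_components=2)
print(demo_labels, round(demo_loss, 2))
```

With one cluster the sketch reduces to a single component analysis of all blocks, and with as many clusters as blocks it reduces to a separate analysis per block, which mirrors the two special cases named in the abstract. Several random starts are used because alternating procedures of this kind can stop in a local optimum.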

Original language: English
Pages (from-to): 100-119
Number of pages: 20
Journal: Psychological Methods
Volume: 17
Issue number: 1
DOI: 10.1037/a0025385
Publication status: Published - Mar 2012
Externally published: Yes

Keywords

  • multigroup data
  • multilevel data
  • principal component analysis
  • simultaneous component analysis
  • clustering
  • PHYSICAL-ACTIVITY
  • LINEAR-REGRESSION
  • ANOREXIA-NERVOSA
  • EATING-DISORDERS
  • SOCIAL DILEMMAS
  • EMOTIONAL-REACTIONS
  • FINITE MIXTURES
  • LATENT CLASS
  • FACTOR MODEL
  • MONTE-CARLO

Cite this

De Roover, Kim ; Ceulemans, Eva ; Timmerman, Marieke E. ; Vansteelandt, Kristof ; Stouten, Jeroen ; Onghena, Patrick. / Clusterwise Simultaneous Component Analysis for Analyzing Structural Differences in Multivariate Multiblock Data. In: Psychological Methods. 2012 ; Vol. 17, No. 1. pp. 100-119.
@article{95091833c60d409bbf66d0235f982505,
title = "Clusterwise Simultaneous Component Analysis for Analyzing Structural Differences in Multivariate Multiblock Data",
keywords = "multigroup data, multilevel data, principal component analysis, simultaneous component analysis, clustering, PHYSICAL-ACTIVITY, LINEAR-REGRESSION, ANOREXIA-NERVOSA, EATING-DISORDERS, SOCIAL DILEMMAS, EMOTIONAL-REACTIONS, FINITE MIXTURES, LATENT CLASS, FACTOR MODEL, MONTE-CARLO",
author = "{De Roover}, Kim and Eva Ceulemans and Timmerman, {Marieke E.} and Kristof Vansteelandt and Jeroen Stouten and Patrick Onghena",
year = "2012",
month = "3",
doi = "10.1037/a0025385",
language = "English",
volume = "17",
pages = "100--119",
journal = "Psychological Methods",
issn = "1082-989X",
publisher = "AMER PSYCHOLOGICAL ASSOC",
number = "1",

}
