Detecting outlying variables in multigroup data: A comparison of different loading similarity coefficients

Sopiko Gvaladze*, Kim De Roover, Francis Tuerlinckx, Eva Ceulemans

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

2 Citations (Scopus)

Abstract

Multivariate multigroup data are collected in many fields of science, where the so-called groups pertain to, for instance, experimental groups or countries the participants are nested in. To summarize the main information in such data, principal component analysis (PCA) is highly popular. PCA reduces the variables to a few components that are linear combinations of the original variables. Researchers usually assume those components to be the same across the groups and aim to apply a simultaneous component analysis. To investigate whether this assumption is reasonable, one often analyzes the groups separately and computes a similarity index between the group-specific component loadings of the variables. In many cases, however, most variables have highly similar loadings across the groups, but a few variables, which we will call "outlying variables," behave differently, indicating that a simultaneous analysis is not warranted. In such cases, the outlying variables should be removed before proceeding with the simultaneous analysis. To do so, the variables are ranked according to their relative outlyingness. Although some procedures have been proposed that yield such an outlyingness ranking, they might not be optimal, because they all rely on the same choice of similarity coefficient without evaluating other alternatives. In this paper, we give an overview of other options and report extensive simulations in which we investigate how this choice affects the correctness of the outlyingness ranking. We also illustrate the added value of the outlying variable approach by means of sensometric data on different bread samples.
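As an illustration of the kind of loading similarity index the abstract refers to, the following minimal sketch computes Tucker's congruence coefficient (named in the keywords) between the loadings of one component in two groups. It then ranks the variables with a simple leave-one-variable-out heuristic. The heuristic, the function names, and the loading values are assumptions made for this example; they are not the procedures or data evaluated in the article.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def rank_outlying_variables(load_a, load_b):
    """Rank variable indices from most to least outlying.

    Hypothetical heuristic (not taken from the article): a variable is
    considered more outlying the more the congruence between the two
    groups increases when that variable is left out.
    """
    base = tucker_congruence(load_a, load_b)
    gains = []
    for j in range(len(load_a)):
        a = load_a[:j] + load_a[j + 1:]
        b = load_b[:j] + load_b[j + 1:]
        gains.append((tucker_congruence(a, b) - base, j))
    # Largest gain in congruence first.
    return [j for _, j in sorted(gains, reverse=True)]


# Made-up loadings of four variables on one component in two groups;
# the last variable changes sign in the second group.
group1 = [0.80, 0.70, 0.75, 0.60]
group2 = [0.82, 0.68, 0.77, -0.50]

ranking = rank_outlying_variables(group1, group2)
print(ranking[0])  # index of the variable flagged as most outlying
```

With several components, the group-specific loadings would typically first have to be rotated toward each other before any such comparison; that step is omitted here.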

Original language: English
Article number: e3233
Number of pages: 20
Journal: Journal of Chemometrics
Volume: 35
Issue number: 2
Publication status: Published - 2021

Keywords

  • component similarity
  • multivariate multigroup data
  • PCA
  • SCA
  • Tucker's congruence
  • SIMULTANEOUS COMPONENT ANALYSIS
  • CONGRUENCE COEFFICIENTS
  • ROTATION
  • MULTIBLOCK
  • MODELS
  • NUMBER
  • COMMON
