TY - JOUR
T1 - Scale length does matter
T2 - Recommendations for measurement invariance testing with categorical factor analysis and item response theory approaches
AU - D'Urso, E. Damiano
AU - De Roover, Kim
AU - Vermunt, Jeroen K.
AU - Tijmstra, Jesper
PY - 2022
Y1 - 2022
N2 - In social sciences, the study of group differences concerning latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. In order to compare these constructs across groups, one crucial requirement is that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of scale- and item-level approaches based on multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing MI with ordinal data. In general, the results of the simulation studies showed that MG-CCFA-based approaches outperformed MG-IRT-based approaches when testing MI at the scale level, whereas, at the item level, the best performing approach depends on the tested parameter (i.e., loadings or thresholds). That is, when testing loadings equivalence, the likelihood ratio test provided the best trade-off between true positive rate and false positive rate, whereas, when testing thresholds equivalence, the chi-square test outperformed the other testing strategies. In addition, the performance of MG-CCFA's fit measures, such as RMSEA and CFI, seemed to depend largely on the length of the scale, especially when MI was tested at the item level. General caution is recommended when using these measures, especially when MI is tested for each item individually.
AB - In social sciences, the study of group differences concerning latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. In order to compare these constructs across groups, one crucial requirement is that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of scale- and item-level approaches based on multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing MI with ordinal data. In general, the results of the simulation studies showed that MG-CCFA-based approaches outperformed MG-IRT-based approaches when testing MI at the scale level, whereas, at the item level, the best performing approach depends on the tested parameter (i.e., loadings or thresholds). That is, when testing loadings equivalence, the likelihood ratio test provided the best trade-off between true positive rate and false positive rate, whereas, when testing thresholds equivalence, the chi-square test outperformed the other testing strategies. In addition, the performance of MG-CCFA's fit measures, such as RMSEA and CFI, seemed to depend largely on the length of the scale, especially when MI was tested at the item level. General caution is recommended when using these measures, especially when MI is tested for each item individually.
KW - CFA (confirmatory factor analysis)
KW - CONFIRMATORY FACTOR-ANALYSIS
KW - Categorical data
KW - DIF (differential item functioning)
KW - IDENTIFICATION
KW - IRT (item response theory)
KW - LOGISTIC-REGRESSION
KW - MANTEL-HAENSZEL
KW - MODEL
KW - Measurement invariance
KW - OF-FIT INDEXES
KW - R PACKAGE
UR - http://www.scopus.com/inward/record.url?scp=85121337489&partnerID=8YFLogxK
U2 - 10.3758/s13428-021-01690-7
DO - 10.3758/s13428-021-01690-7
M3 - Article
C2 - 34910286
SN - 1554-351X
VL - 54
SP - 2114
EP - 2145
JO - Behavior Research Methods
JF - Behavior Research Methods
IS - 5
ER -