Unreliable heterogeneity: how measurement error obscures heterogeneity in meta-analyses in psychology

A. Olsson Collentine*, M. Bakker, J. Wicherts

*Corresponding author for this work

Research output: Working paper (Scientific)


Abstract

Measurement error (imperfect reliability) is present in any empirical effect size estimate and systematically attenuates observed effect sizes compared to true underlying effect sizes. Yet there exist broad concerns that proper measurement tends to be neglected in much of psychological research. We examined how measurement error in primary studies affects meta-analytic heterogeneity estimates using Monte Carlo simulations. Our results indicate that although measurement error in primary studies can both inflate and suppress heterogeneity, under most circumstances measurement error in primary studies leads to a severe underestimate of heterogeneity in meta-analysis. Our simulations showed expected heterogeneity to be underestimated by about 15% to 60% when considering a typical effect size around r = 0.2 and true heterogeneity levels that are common in the meta-analytic literature (τ > 0.1, in Pearson's r). The underestimate primarily depends on average reliability in primary studies (higher reliability leads to a smaller underestimate), but also worsens with smaller primary study sample sizes. We observed a positive bias in heterogeneity estimates due to measurement error only under specific and arguably uncommon circumstances of (1) actual zero heterogeneity, particularly when mean effect sizes are large, or (2) combinations of very small true heterogeneity, large variance in primary study reliabilities, large mean effect sizes, and a limited number of primary studies. Severe underestimates of heterogeneity due to measurement error may affect many meta-analyses in psychology and obscure true differences between studies that could be relevant for theory, practice, and future research efforts. Research on concrete guidance to applied meta-analysts is needed, as sophisticated methods for correcting measurement unreliability such as meta-analytic structural equation modeling (MASEM) are only applicable in exceptional cases and corrections based on classical test theory come with caveats and strong assumptions.
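
The attenuation mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' simulation code. It assumes the classical test theory attenuation formula, in which an observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities, and it further assumes (as a simplification) that every primary study has the same reliabilities and that sampling error is ignored. Under those assumptions both the mean effect and the between-study standard deviation shrink by the same factor. The function names and the example values (true heterogeneity 0.15, reliabilities of 0.8) are hypothetical choices for illustration only.

```python
import math


def attenuation_factor(rel_x: float, rel_y: float) -> float:
    """Classical test theory attenuation factor: sqrt(rel_x * rel_y)."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return math.sqrt(rel_x * rel_y)


def attenuate(true_mean_r: float, true_tau: float,
              rel_x: float, rel_y: float) -> tuple[float, float]:
    """Return the attenuated mean correlation and heterogeneity (SD).

    Assumes identical reliabilities in all primary studies and no
    sampling error, so both quantities scale by the same factor.
    """
    factor = attenuation_factor(rel_x, rel_y)
    return factor * true_mean_r, factor * true_tau


# Hypothetical example: mean r = 0.2, true heterogeneity = 0.15,
# reliability of 0.8 for both measures in every study.
true_tau = 0.15
observed_mean, observed_tau = attenuate(0.2, true_tau, 0.8, 0.8)
relative_underestimate = 1 - observed_tau / true_tau

print(f"observed mean r: {observed_mean:.3f}")
print(f"observed heterogeneity: {observed_tau:.3f}")
print(f"relative underestimate: {relative_underestimate:.0%}")
```

Because reliabilities are held constant here, the sketch can only show suppression of heterogeneity. It cannot reproduce the inflation the abstract reports when reliabilities vary across primary studies, nor the dependence on primary study sample size, both of which require a full simulation.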
Original language: English
Publisher: OSF Preprints
Number of pages: 18
Publication status: Published - 2023
