Item Response Theory with Covariates (IRT-C)

Assessing item recovery and differential item functioning for the three-parameter logistic model

L. Tay, Q. Huang, J.K. Vermunt

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

In large-scale testing, multigroup approaches are of limited use for assessing differential item functioning (DIF) across multiple variables, because DIF is examined for each variable separately. In contrast, the item response theory with covariates (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To assess the utility of the IRT-C procedure, we conducted a simulation study. Using SAT data for realistic parameters, uniform DIF on three covariates was simulated: gender (dichotomous), race/ethnicity (categorical), and income (continuous). Simulations were conducted across several conditions: two test lengths (14 items, 21 items), four sample sizes (5,000, 10,000, 20,000, 40,000), and two DIF effect sizes (medium, large). The IRT-C procedure accurately recovered the latent means and the three-parameter logistic model parameters with a substantial sample size of 20,000. Type I error rates were well controlled at the nominal rates across the sample sizes. Good power to detect DIF across all covariates (>.80) was observed when the sample size was 20,000 for the large DIF effect size and 40,000 for the medium DIF effect size. Practical implications for the use of the IRT-C procedure are discussed.
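
The kind of data the abstract describes can be illustrated with a minimal sketch. It is not the authors' simulation design. It assumes that uniform DIF can be represented as a covariate-dependent shift in item difficulty within a three-parameter logistic (3PL) item, and every name and value in it (`p_correct`, `simulate_item`, `gamma_gender`, `gamma_income`, the standard-normal ability and income distributions) is hypothetical.

```python
import math
import random


def p_correct(theta, a, b, c, dif_shift=0.0):
    """3PL probability of a correct response.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    dif_shift is an assumed way to express uniform DIF: it moves the
    item difficulty by the same amount at every ability level.
    """
    z = a * (theta - (b + dif_shift))
    return c + (1.0 - c) / (1.0 + math.exp(-z))


def simulate_item(n, a, b, c, gamma_gender, gamma_income, seed=0):
    """Simulate n responses to one item with uniform DIF on two covariates.

    gamma_gender and gamma_income are hypothetical DIF effects: the
    difficulty shift per unit of each covariate.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gender = rng.randint(0, 1)      # dichotomous covariate
        income = rng.gauss(0.0, 1.0)    # continuous covariate (standardized)
        theta = rng.gauss(0.0, 1.0)     # latent ability
        shift = gamma_gender * gender + gamma_income * income
        p = p_correct(theta, a, b, c, shift)
        rows.append((gender, income, int(rng.random() < p)))
    return rows


rows = simulate_item(100, a=1.2, b=0.0, c=0.2,
                     gamma_gender=0.5, gamma_income=0.0)
```

Because the shift enters the difficulty term, two respondents with the same ability but different covariate values have different success probabilities on the item, which is the pattern a DIF test looks for. The sketch leaves out covariate effects on the latent means and the categorical race/ethnicity covariate.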

Original language: English
Pages (from-to): 22-42
Number of pages: 21
Journal: Educational and Psychological Measurement
Volume: 76
Issue number: 1
DOI: 10.1177/0013164415579488
Publication status: Published - 2016

Keywords

  • differential item functioning
  • item response theory
  • simulation
  • covariates
  • MEASUREMENT EQUIVALENCE
  • DIF

Cite this

@article{43facdd3c83f4509a71e75e502efddaa,
title = "Item Response Theory with Covariates (IRT-C): Assessing item recovery and differential item functioning for the three-parameter logistic model",
abstract = "In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To assess the utility of the IRT-C procedure, we conducted a simulation study. Using SAT data for realistic parameters, uniform DIF on three covariates were simulated: gender (dichotomous), race/ethnicity (categorical), and income (continuous). Simulations were conducted across several conditions: two test lengths (14 items, 21 items), four sample sizes (5,000, 10,000, 20,000, 40,000), and two DIF effect sizes (medium, large). It was found that the IRT-C procedure could accurately recover the latent means and the three-parameter logistic model parameters well with a substantial sample size of 20,000. There was good control of Type I error rates to the nominal rates across the sample sizes. Good power to detect DIF across all covariates (>.80) was observed when the sample size was 20,000 for large DIF effect size and 40,000 for medium DIF effect size. Practical implications for the use of the IRT-C procedure are discussed.",
keywords = "differential item functioning, item response theory, simulation, covariates, MEASUREMENT EQUIVALENCE, DIF",
author = "L. Tay and Q. Huang and J.K. Vermunt",
year = "2016",
doi = "10.1177/0013164415579488",
language = "English",
volume = "76",
pages = "22--42",
journal = "Educational and Psychological Measurement",
issn = "0013-1644",
publisher = "Sage Publications, Inc.",
number = "1",

}

Item Response Theory with Covariates (IRT-C): Assessing item recovery and differential item functioning for the three-parameter logistic model. / Tay, L.; Huang, Q.; Vermunt, J.K.

In: Educational and Psychological Measurement, Vol. 76, No. 1, 2016, p. 22-42.

TY - JOUR

T1 - Item Response Theory with Covariates (IRT-C)

T2 - Assessing item recovery and differential item functioning for the three-parameter logistic model

AU - Tay, L.

AU - Huang, Q.

AU - Vermunt, J.K.

PY - 2016

Y1 - 2016

AB - In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To assess the utility of the IRT-C procedure, we conducted a simulation study. Using SAT data for realistic parameters, uniform DIF on three covariates were simulated: gender (dichotomous), race/ethnicity (categorical), and income (continuous). Simulations were conducted across several conditions: two test lengths (14 items, 21 items), four sample sizes (5,000, 10,000, 20,000, 40,000), and two DIF effect sizes (medium, large). It was found that the IRT-C procedure could accurately recover the latent means and the three-parameter logistic model parameters well with a substantial sample size of 20,000. There was good control of Type I error rates to the nominal rates across the sample sizes. Good power to detect DIF across all covariates (>.80) was observed when the sample size was 20,000 for large DIF effect size and 40,000 for medium DIF effect size. Practical implications for the use of the IRT-C procedure are discussed.

KW - differential item functioning

KW - item response theory

KW - simulation

KW - covariates

KW - MEASUREMENT EQUIVALENCE

KW - DIF

U2 - 10.1177/0013164415579488

DO - 10.1177/0013164415579488

M3 - Article

VL - 76

SP - 22

EP - 42

JO - Educational and Psychological Measurement

JF - Educational and Psychological Measurement

SN - 0013-1644

IS - 1

ER -