In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To assess the utility of the IRT-C procedure, we conducted a simulation study. Using SAT data for realistic parameters, uniform DIF on three covariates were simulated: gender (dichotomous), race/ethnicity (categorical), and income (continuous). Simulations were conducted across several conditions: two test lengths (14 items, 21 items), four sample sizes (5,000, 10,000, 20,000, 40,000), and two DIF effect sizes (medium, large). It was found that the IRT-C procedure could accurately recover the latent means and the three-parameter logistic model parameters well with a substantial sample size of 20,000. There was good control of Type I error rates to the nominal rates across the sample sizes. Good power to detect DIF across all covariates (>.80) was observed when the sample size was 20,000 for large DIF effect size and 40,000 for medium DIF effect size. Practical implications for the use of the IRT-C procedure are discussed.
- differential item functioning
- item response theory
- MEASUREMENT EQUIVALENCE