Bayesian multilevel latent class models for the multiple imputation of nested categorical data

Davide Vidotto*, Jeroen K. Vermunt, Katrijn van Deun

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

With this article, we propose using a Bayesian multilevel latent class (BMLC; or mixture) model for the multiple imputation of nested categorical data. Unlike recently developed methods that can only pick up associations between pairs of variables, the multilevel mixture model we propose is flexible enough to automatically deal with complex interactions in the joint distribution of the variables to be estimated. After formally introducing the model and showing how it can be implemented, we carry out a simulation study and a real-data study in order to assess its performance and compare it with the commonly used listwise deletion and an available R-routine. Results indicate that the BMLC model is able to recover unbiased parameter estimates of the analysis models considered in our studies, as well as to correctly reflect the uncertainty due to missing data, outperforming the competing methods.

Original language: English
Pages (from-to): 511-539
Journal: Journal of Educational and Behavioral Statistics
Volume: 43
Issue number: 5
DOI: 10.3102/1076998618769871
Publication status: Published - 2018

Keywords

  • Bayesian mixture models
  • latent class models
  • missing data
  • multilevel analysis
  • multiple imputation

Cite this

@article{c679a09cef9e4c128c041a762f0d4065,
title = "Bayesian multilevel latent class models for the multiple imputation of nested categorical data",
abstract = "With this article, we propose using a Bayesian multilevel latent class (BMLC; or mixture) model for the multiple imputation of nested categorical data. Unlike recently developed methods that can only pick up associations between pairs of variables, the multilevel mixture model we propose is flexible enough to automatically deal with complex interactions in the joint distribution of the variables to be estimated. After formally introducing the model and showing how it can be implemented, we carry out a simulation study and a real-data study in order to assess its performance and compare it with the commonly used listwise deletion and an available R-routine. Results indicate that the BMLC model is able to recover unbiased parameter estimates of the analysis models considered in our studies, as well as to correctly reflect the uncertainty due to missing data, outperforming the competing methods.",
keywords = "Bayesian mixture models, latent class models, missing data, multilevel analysis, multiple imputation",
author = "Vidotto, Davide and Vermunt, {Jeroen K.} and {van Deun}, Katrijn",
year = "2018",
doi = "10.3102/1076998618769871",
language = "English",
volume = "43",
pages = "511--539",
journal = "Journal of Educational and Behavioral Statistics",
issn = "1076-9986",
publisher = "Sage Publications, Inc.",
number = "5",
}

TY  - JOUR
T1  - Bayesian multilevel latent class models for the multiple imputation of nested categorical data
AU  - Vidotto, Davide
AU  - Vermunt, Jeroen K.
AU  - van Deun, Katrijn
PY  - 2018
Y1  - 2018
N2  - With this article, we propose using a Bayesian multilevel latent class (BMLC; or mixture) model for the multiple imputation of nested categorical data. Unlike recently developed methods that can only pick up associations between pairs of variables, the multilevel mixture model we propose is flexible enough to automatically deal with complex interactions in the joint distribution of the variables to be estimated. After formally introducing the model and showing how it can be implemented, we carry out a simulation study and a real-data study in order to assess its performance and compare it with the commonly used listwise deletion and an available R-routine. Results indicate that the BMLC model is able to recover unbiased parameter estimates of the analysis models considered in our studies, as well as to correctly reflect the uncertainty due to missing data, outperforming the competing methods.
AB  - With this article, we propose using a Bayesian multilevel latent class (BMLC; or mixture) model for the multiple imputation of nested categorical data. Unlike recently developed methods that can only pick up associations between pairs of variables, the multilevel mixture model we propose is flexible enough to automatically deal with complex interactions in the joint distribution of the variables to be estimated. After formally introducing the model and showing how it can be implemented, we carry out a simulation study and a real-data study in order to assess its performance and compare it with the commonly used listwise deletion and an available R-routine. Results indicate that the BMLC model is able to recover unbiased parameter estimates of the analysis models considered in our studies, as well as to correctly reflect the uncertainty due to missing data, outperforming the competing methods.
KW  - Bayesian mixture models
KW  - latent class models
KW  - missing data
KW  - multilevel analysis
KW  - multiple imputation
U2  - 10.3102/1076998618769871
DO  - 10.3102/1076998618769871
M3  - Article
VL  - 43
SP  - 511
EP  - 539
JO  - Journal of Educational and Behavioral Statistics
JF  - Journal of Educational and Behavioral Statistics
SN  - 1076-9986
IS  - 5
ER  -