Multiple imputation of missing categorical data using latent class models: State of art

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This paper provides an overview of recent proposals for using latent class models for the multiple imputation of missing categorical data in large-scale studies. While latent class (or finite mixture) modeling is mainly known as a clustering tool, it can also be used for density estimation, i.e., to get a good description of the lower- and higher-order associations among the variables in a dataset. For multiple imputation, the latter aspect is essential in order to be able to draw meaningful imputing values from the conditional distribution of the missing data given the observed data.
We explain the general logic underlying the use of latent class analysis for multiple imputation. Moreover, we present several variants developed within either a frequentist or a Bayesian framework, each of which overcomes certain limitations of the standard implementation. The different approaches are illustrated and compared using a real-data psychological assessment application.
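The general logic mentioned above can be sketched in a few lines: a latent class model treats the items as independent within each class, so an imputation is drawn by first sampling a class from its posterior given the observed items and then sampling each missing item from that class's distribution. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function name `impute_record`, the `-1` missing-value code, and the two-class parameter values are all hypothetical, and the parameters are taken as fixed. A full multiple-imputation procedure would also account for uncertainty in the parameters; that step is omitted here.

```python
import numpy as np

def impute_record(y, class_weights, cond_probs, rng):
    """Fill missing entries (coded -1, an assumed convention) of one record.

    class_weights: shape (K,), class proportions.
    cond_probs: list of J arrays, each of shape (K, C_j),
                holding P(y_j = c | class k).
    """
    y = np.asarray(y).copy()
    observed = [j for j, v in enumerate(y) if v >= 0]
    # Posterior class membership given only the observed items
    # (local independence: multiply item probabilities within each class).
    post = np.array(class_weights, dtype=float)
    for j in observed:
        post *= cond_probs[j][:, y[j]]
    post /= post.sum()
    # Draw a class, then draw each missing item from that class's distribution.
    k = rng.choice(len(post), p=post)
    for j in range(len(y)):
        if y[j] < 0:
            y[j] = rng.choice(cond_probs[j].shape[1], p=cond_probs[j][k])
    return y

# Hypothetical two-class model for three binary items (illustrative values).
class_weights = np.array([0.6, 0.4])
cond_probs = [
    np.array([[0.9, 0.1], [0.2, 0.8]]),
    np.array([[0.8, 0.2], [0.3, 0.7]]),
    np.array([[0.7, 0.3], [0.1, 0.9]]),
]
rng = np.random.default_rng(0)
record = [1, -1, 1]  # second item is missing
imputations = [impute_record(record, class_weights, cond_probs, rng)
               for _ in range(5)]
```

Each element of `imputations` is one completed copy of the record; the observed items are left unchanged and only the missing item varies across copies.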
Original language: English
Pages (from-to): 542-576
Journal: Psychological Test and Assessment Modeling
Volume: 57
Issue number: 4
Publication status: Published - 2015

Cite this

@article{cbe5c52841a9446eab7cff2cb89de327,
title = "Multiple imputation of missing categorical data using latent class models: State of art",
abstract = "This paper provides an overview of recent proposals for using latent class models for the multiple imputation of missing categorical data in large-scale studies. While latent class (or finite mixture) modeling is mainly known as a clustering tool, it can also be used for density estimation, i.e., to get a good description of the lower- and higher-order associations among the variables in a dataset. For multiple imputation, the latter aspect is essential in order to be able to draw meaningful imputing values from the conditional distribution of the missing data given the observed data. We explain the general logic underlying the use of latent class analysis for multiple imputation. Moreover, we present several variants developed within either a frequentist or a Bayesian framework, each of which overcomes certain limitations of the standard implementation. The different approaches are illustrated and compared using a real-data psychological assessment application.",
author = "D. Vidotto and M.C. Kaptein and J.K. Vermunt",
year = "2015",
language = "English",
volume = "57",
pages = "542--576",
journal = "Psychological Test and Assessment Modeling",
issn = "2190-0493",
number = "4",

}

Multiple imputation of missing categorical data using latent class models: State of art. / Vidotto, D.; Kaptein, M.C.; Vermunt, J.K.

In: Psychological Test and Assessment Modeling, Vol. 57, No. 4, 2015, p. 542-576.

TY - JOUR

T1 - Multiple imputation of missing categorical data using latent class models

T2 - State of art

AU - Vidotto, D.

AU - Kaptein, M.C.

AU - Vermunt, J.K.

PY - 2015

Y1 - 2015

N2 - This paper provides an overview of recent proposals for using latent class models for the multiple imputation of missing categorical data in large-scale studies. While latent class (or finite mixture) modeling is mainly known as a clustering tool, it can also be used for density estimation, i.e., to get a good description of the lower- and higher-order associations among the variables in a dataset. For multiple imputation, the latter aspect is essential in order to be able to draw meaningful imputing values from the conditional distribution of the missing data given the observed data. We explain the general logic underlying the use of latent class analysis for multiple imputation. Moreover, we present several variants developed within either a frequentist or a Bayesian framework, each of which overcomes certain limitations of the standard implementation. The different approaches are illustrated and compared using a real-data psychological assessment application.

AB - This paper provides an overview of recent proposals for using latent class models for the multiple imputation of missing categorical data in large-scale studies. While latent class (or finite mixture) modeling is mainly known as a clustering tool, it can also be used for density estimation, i.e., to get a good description of the lower- and higher-order associations among the variables in a dataset. For multiple imputation, the latter aspect is essential in order to be able to draw meaningful imputing values from the conditional distribution of the missing data given the observed data. We explain the general logic underlying the use of latent class analysis for multiple imputation. Moreover, we present several variants developed within either a frequentist or a Bayesian framework, each of which overcomes certain limitations of the standard implementation. The different approaches are illustrated and compared using a real-data psychological assessment application.

M3 - Article

VL - 57

SP - 542

EP - 576

JO - Psychological Test and Assessment Modeling

JF - Psychological Test and Assessment Modeling

SN - 2190-0493

IS - 4

ER -