Multiple imputation of longitudinal categorical data through Bayesian mixture latent Markov models

Davide Vidotto*, Jeroen Vermunt, Katrijn Van Deun

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Standard latent class modeling has recently been shown to provide a flexible tool for the multiple imputation (MI) of missing categorical covariates in cross-sectional studies. This article introduces an analogous tool for longitudinal studies: MI using Bayesian mixture latent Markov (BMLM) models. The BMLM model retains the benefits of latent class models: it respects the categorical measurement scale of the variables and preserves possibly complex relationships between variables within a measurement occasion. In addition, its Markov dependence structure captures lagged dependencies between adjacent time points, while its time-constant mixture structure captures dependencies across all time points and retrieves associations between time-varying and time-constant variables. The performance of the BMLM model for MI is evaluated in a simulation study and an empirical experiment, in which it is compared with complete case analysis and multivariate imputation by chained equations (MICE). Results show that the proposed method performs well in retrieving the parameters of the analysis model, whereas the competing methods provide correct estimates only for some aspects of the data.
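
The structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a toy example, assuming a single observed categorical item, known parameters, and known latent states, whereas the article works with a Bayesian model in which these quantities are estimated. The function names (`simulate_mlm`, `impute_given_states`) and all parameter names are hypothetical. The first function draws data from a mixture latent Markov structure (a time-constant class, a Markov chain of latent states, and categorical emissions); the second fills missing cells by drawing from the emission distribution of the latent state at that occasion.

```python
import numpy as np


def simulate_mlm(n, T, pi_w, init, trans, emit, rng):
    """Draw categorical panel data from a toy mixture latent Markov model.

    pi_w  : (L,)       weights of the time-constant mixture classes
    init  : (L, K)     initial latent-state probabilities, per class
    trans : (L, K, K)  latent-state transition probabilities, per class
    emit  : (K, C)     category probabilities of the observed item, per state
    (All names are illustrative assumptions, not taken from the article.)
    """
    L, K = init.shape
    C = emit.shape[1]
    w = rng.choice(L, size=n, p=pi_w)      # time-constant class of each unit
    s = np.empty((n, T), dtype=int)        # time-varying latent states
    y = np.empty((n, T), dtype=int)        # observed categories
    for i in range(n):
        for t in range(T):
            # First occasion uses the initial distribution; later occasions
            # depend only on the previous state (first-order Markov chain).
            p = init[w[i]] if t == 0 else trans[w[i], s[i, t - 1]]
            s[i, t] = rng.choice(K, p=p)
            y[i, t] = rng.choice(C, p=emit[s[i, t]])
    return w, s, y


def impute_given_states(y, missing, s, emit, rng):
    """Fill missing cells by sampling from the emission distribution of the
    latent state at that occasion (states are assumed known here)."""
    y_imp = y.copy()
    for i, t in zip(*np.nonzero(missing)):
        y_imp[i, t] = rng.choice(emit.shape[1], p=emit[s[i, t]])
    return y_imp
```

Repeating the imputation step several times yields multiple completed data sets. In a full Bayesian treatment, each repetition would be based on a fresh draw of the latent states and parameters rather than on fixed values as assumed in this sketch.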
Original language: English
Journal: Journal of Applied Statistics
DOI: 10.1080/02664763.2019.1692794
Publication status: Published - 2019

Fingerprint

Multiple Imputation
Nominal or categorical data
Longitudinal Data
Markov Model
Time Constant
Categorical
Latent Class
Latent Class Model
Longitudinal Study
Dependence Structure
Model Analysis
Covariates
Time-varying
Adjacent
Simulation Study
Categorical data
Modeling
Estimate

Keywords

  • Bayesian mixture latent Markov models
  • posterior distributions
  • longitudinal analysis
  • missing data
  • multiple imputation

Cite this

@article{dac7e3a0a7c44eeb830482e25f66cfdb,
title = "Multiple imputation of longitudinal categorical data through Bayesian mixture latent Markov models",
abstract = "Standard latent class modeling has recently been shown to provide a flexible tool for the multiple imputation (MI) of missing categorical covariates in cross-sectional studies. This article introduces an analogous tool for longitudinal studies: MI using Bayesian mixture Latent Markov (BMLM) models. Besides retaining the benefits of latent class models, i.e. respecting the (categorical) measurement scale of the variables and preserving possibly complex relationships between variables within a measurement occasion, the Markov dependence structure of the proposed BMLM model allows capturing lagged dependencies between adjacent time points, while the time-constant mixture structure allows capturing dependencies across all time points, as well as retrieving associations between time-varying and time-constant variables. The performance of the BMLM model for MI is evaluated by means of a simulation study and an empirical experiment, in which it is compared with complete case analysis and MICE. Results show good performance of the proposed method in retrieving the parameters of the analysis model. In contrast, competing methods could provide correct estimates only for some aspects of the data.",
keywords = "Bayesian mixture latent Markov models, posterior distributions, longitudinal analysis, missing data, multiple imputation",
author = "Davide Vidotto and Jeroen Vermunt and {Van Deun}, Katrijn",
year = "2019",
doi = "10.1080/02664763.2019.1692794",
language = "English",
journal = "Journal of Applied Statistics",
issn = "0266-4763",
publisher = "Routledge",
}

TY - JOUR

T1 - Multiple imputation of longitudinal categorical data through Bayesian mixture latent Markov models

AU - Vidotto, Davide

AU - Vermunt, Jeroen

AU - Van Deun, Katrijn

PY - 2019

Y1 - 2019

AB - Standard latent class modeling has recently been shown to provide a flexible tool for the multiple imputation (MI) of missing categorical covariates in cross-sectional studies. This article introduces an analogous tool for longitudinal studies: MI using Bayesian mixture Latent Markov (BMLM) models. Besides retaining the benefits of latent class models, i.e. respecting the (categorical) measurement scale of the variables and preserving possibly complex relationships between variables within a measurement occasion, the Markov dependence structure of the proposed BMLM model allows capturing lagged dependencies between adjacent time points, while the time-constant mixture structure allows capturing dependencies across all time points, as well as retrieving associations between time-varying and time-constant variables. The performance of the BMLM model for MI is evaluated by means of a simulation study and an empirical experiment, in which it is compared with complete case analysis and MICE. Results show good performance of the proposed method in retrieving the parameters of the analysis model. In contrast, competing methods could provide correct estimates only for some aspects of the data.

KW - Bayesian mixture latent Markov models

KW - posterior distributions

KW - longitudinal analysis

KW - missing data

KW - multiple imputation

UR - https://app-eu.readspeaker.com/cgi-bin/rsent?customerid=10118&lang=en_us&readclass=rs_readArea&url=https%3A%2F%2Fwww.tandfonline.com%2Fdoi%2Ffull%2F10.1080%2F02664763.2019.1692794

U2 - 10.1080/02664763.2019.1692794

DO - 10.1080/02664763.2019.1692794

M3 - Article

JO - Journal of Applied Statistics

JF - Journal of Applied Statistics

SN - 0266-4763

ER -