Abstract
Both registers and surveys can contain classification errors. These errors can be estimated using the information that becomes available when the sources are combined into a single dataset. We propose a new method based on latent class (LC) modelling that estimates the number of classification errors in the multiple sources and simultaneously takes impossible combinations with other variables into account. Furthermore, we use the latent class model to multiply impute a new variable, which enhances the quality of statistics based on the combined dataset. The performance of this method is investigated in a simulation study, which shows that whether the method can be applied depends on the entropy of the LC model and on the type of analysis a researcher plans to carry out. Finally, the method is applied to a combined dataset from Statistics Netherlands.
Original language | English |
---|---|
Pages (from-to) | 921–962 |
Journal | Journal of Official Statistics |
Volume | 33 |
Issue number | 4 |
Publication status | Published - 2017 |