Abstract
Both registers and surveys can contain classification errors. These errors can be estimated using the information that becomes available in a combined dataset. We propose a new method based on latent class (LC) modelling that estimates the number of classification errors in the multiple sources and simultaneously takes impossible combinations with other variables into account. Furthermore, we use the LC model to multiply impute a new variable, which enhances the quality of statistics based on the combined dataset. The performance of this method is investigated in a simulation study, which shows that whether the method can be applied depends on the entropy R² of the LC model and on the type of analysis a researcher is planning to do. Finally, the method is applied to a combined dataset from Statistics Netherlands.
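
The entropy R² mentioned in the abstract summarises how well an LC model separates units into classes. The abstract does not state the exact formula used, so the following is only a minimal sketch of one common normalised form, 1 - EN / (N log K), where EN is the summed entropy of the posterior class membership probabilities, N the number of units and K the number of classes. The function name `entropy_r2` and the input layout are assumptions made for illustration; some software normalises by the entropy of the marginal class proportions instead.

```python
import math


def entropy_r2(posteriors):
    """Normalised entropy R-squared for posterior class memberships.

    posteriors: list of rows, one per unit; each row holds that unit's
    posterior probabilities over the K latent classes (summing to 1).
    Hypothetical helper: the normalisation by N * log(K) is an assumption,
    not necessarily the definition used in the paper.
    """
    n_units = len(posteriors)
    n_classes = len(posteriors[0])
    if n_units == 0 or n_classes < 2:
        raise ValueError("need at least one unit and two classes")

    # Summed entropy of the posteriors; terms with p == 0 contribute 0.
    total_entropy = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:
                total_entropy -= p * math.log(p)

    # 1 means perfectly separated classes, 0 means uninformative posteriors.
    return 1.0 - total_entropy / (n_units * math.log(n_classes))
```

Values close to 1 indicate that units are assigned to classes with little uncertainty, while values close to 0 indicate that the posteriors carry almost no information about class membership.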
Original language | English |
---|---|
Publisher | Statistics Netherlands |
Number of pages | 24 |
Publication status | Published - 2016 |
Publication series
Name | CBS Discussion Paper |
---|---|
Publisher | Statistics Netherlands |
Keywords
- latent class models
- multiple imputation
- measurement errors
- multisource statistics