A common problem faced by statistical offices is that values are missing from collected data sets. The typical way to overcome this problem is to impute the missing data. Imputation is complicated by the fact that statistical data often have to satisfy certain edit rules, which for numerical data usually take the form of linear restrictions. Standard imputation methods generally do not take such edit restrictions into account. In the present article we describe two general approaches for imputing missing numerical data that do take the edit restrictions into account. The first approach imputes the missing values by means of an imputation method and afterwards adjusts the imputed values so that they satisfy the edit restrictions. The second approach imputes the missing data sequentially, using Fourier-Motzkin elimination to determine the interval of admissible values for each variable to be imputed. Neither approach is tied to a specific imputation model; both allow the user to specify one. To illustrate the two approaches, we assume that the data approximately follow a multivariate normal distribution. To assess the performance of the imputation approaches, an evaluation study is carried out.
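To make the interval step of the second approach concrete, the following is a minimal sketch of Fourier-Motzkin elimination applied to a small set of linear edit rules. It is an illustration only and not the authors' implementation: the function names `eliminate` and `interval`, the three variables, and the edit rules in `EDITS` are all hypothetical. Each edit is written as a pair `(a, b)` meaning `a·x <= b`, and an equality is encoded as two opposite inequalities.

```python
from fractions import Fraction
from itertools import product


def eliminate(rows, j):
    """One Fourier-Motzkin step: project variable j out of rows (a, b), a.x <= b."""
    pos = [r for r in rows if r[0][j] > 0]
    neg = [r for r in rows if r[0][j] < 0]
    out = [r for r in rows if r[0][j] == 0]
    for (ap, bp), (an, bn) in product(pos, neg):
        # Scale both rows so x_j has coefficient +1 and -1, then add them.
        sp, sn = 1 / ap[j], 1 / -an[j]
        a = [sp * p + sn * n for p, n in zip(ap, an)]
        out.append((a, sp * bp + sn * bn))
    return out


def interval(rows, k, known):
    """Admissible interval for x_k, given fixed values `known` (index -> value)."""
    n = len(rows[0][0])
    reduced = []
    for a, b in rows:
        a, b = [Fraction(c) for c in a], Fraction(b)
        # Substitute fixed values by moving them to the right-hand side.
        for i, v in known.items():
            b -= a[i] * v
            a[i] = Fraction(0)
        reduced.append((a, b))
    # Eliminate every other variable that is still free.
    for j in range(n):
        if j != k and j not in known:
            reduced = eliminate(reduced, j)
    lo, hi = float("-inf"), float("inf")
    for a, b in reduced:
        if a[k] > 0:
            hi = min(hi, b / a[k])
        elif a[k] < 0:
            lo = max(lo, b / a[k])  # dividing by a negative flips the inequality
        elif b < 0:
            raise ValueError("edit rules are infeasible")
    if lo > hi:
        raise ValueError("edit rules are infeasible")
    return lo, hi


# Hypothetical edits on (x0, x1, x2) = (wages, other costs, total costs).
EDITS = [
    ([1, 1, -1], 0),    # x0 + x1 <= x2
    ([-1, -1, 1], 0),   # x0 + x1 >= x2, so together x0 + x1 = x2
    ([-1, 0, 0], 0),    # x0 >= 0
    ([0, -1, 0], 0),    # x1 >= 0
    ([0, 5, -4], 0),    # x1 <= 0.8 * x2
]

lo, hi = interval(EDITS, 0, {2: 100})          # x2 observed, x0 and x1 missing
print(float(lo), float(hi))                    # 20.0 100.0
lo, hi = interval(EDITS, 1, {0: 30, 2: 100})   # after imputing x0 = 30
print(float(lo), float(hi))                    # 70.0 70.0
```

In a sequential scheme of this kind, a value for the first missing variable would be drawn from the chosen imputation model restricted to the first interval, and that value is then treated as known when the interval for the next variable is computed. The sketch uses exact rational arithmetic to avoid rounding issues; note that Fourier-Motzkin elimination can generate many redundant rows on larger problems, which this sketch does not prune.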
|Journal|SORT-Statistics and Operations Research Transactions|
|Publication status|Published - 2011|