Imputation of restricted data: Applications to business surveys

Cover, Imputation of restricted data, Caren Tempelman
© CBS
Dissertation on models developed to impute business data, many of which are subject to linear equality and inequality constraints.
This thesis focuses on the imputation of (economic) data that are subject to different types of linear restrictions. Several imputation procedures are developed and analysed in order to provide the imputer with a set of models that can be used for varying types of restriction structures and datasets. We develop an imputation method that uses the Dirichlet distribution to model the data. This method is convenient because of its flexibility, and it can impute data items that are non-negative and subject to one linear balance restriction. It cannot incorporate multiple balance restrictions, however. Therefore, we suggest the use of the multivariate singular normal distribution. It is found that the EM algorithm can be extended such that singular normal data can be managed as well. This imputation procedure is easy to implement, and its properties are well known.
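
To make the Dirichlet idea concrete, the following is a minimal sketch and not the procedure from the thesis itself. It assumes a record whose non-negative items must add up to a known total, and it assumes the Dirichlet parameters (`alpha`, hypothetical values here) have already been estimated from complete records. It relies on a standard property of the Dirichlet distribution: the renormalised shares of any subset of components are again Dirichlet distributed with the corresponding parameters. The function name `impute_dirichlet` is illustrative only.

```python
import numpy as np

def impute_dirichlet(record, total, alpha, rng):
    """Fill missing (NaN) items so that all items are >= 0 and sum to `total`.

    Assumption: `alpha` holds Dirichlet parameters estimated elsewhere.
    """
    x = np.asarray(record, dtype=float).copy()
    missing = np.isnan(x)
    if not missing.any():
        return x
    # Amount left to distribute after the observed items are accounted for.
    remainder = total - np.nansum(x)
    if remainder < 0:
        raise ValueError("observed items already exceed the total")
    if missing.sum() == 1:
        # A single missing item is fully determined by the balance restriction.
        x[missing] = remainder
        return x
    # Shares of the missing items follow a Dirichlet with the matching alphas.
    shares = rng.dirichlet(np.asarray(alpha, dtype=float)[missing])
    x[missing] = remainder * shares
    return x

rng = np.random.default_rng(0)
alpha = np.array([2.0, 5.0, 3.0])          # hypothetical parameter values
record = np.array([40.0, np.nan, np.nan])  # first item observed, two missing
imputed = impute_dirichlet(record, total=100.0, alpha=alpha, rng=rng)
```

Because the imputed shares are non-negative and sum to one, the completed record satisfies the balance restriction and the non-negativity restriction by construction. A second, independent balance restriction cannot be enforced this way, which matches the limitation noted above.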

As inequality restrictions are not incorporated in the singular normal model, there is still a need for a general-purpose method that can handle all sorts of balance and inequality restrictions. With this objective, the multivariate singular normal density is truncated to the region defined by the inequality restrictions. This truncated singular normal distribution involves high-dimensional integrals and consequently leads to complex modeling issues. In a completely different approach, the joint model is split into a sequence of univariate conditional distributions. These univariate conditional models are used to sequentially impute each variable. This approach can also incorporate balance and inequality restrictions simultaneously.
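
The sequential approach can be illustrated with a small sketch, under strong simplifying assumptions that are not taken from the thesis: one balance restriction (items sum to a known total), non-negativity as the only inequality restriction, and a univariate normal for each conditional model whose mean is a hypothetical fixed fraction (`fractions`) of the amount still to be allocated. Each missing item is drawn from its conditional distribution truncated to the currently feasible interval, and the last missing item is set by the balance restriction.

```python
import random
from statistics import NormalDist

def draw_truncated_normal(mean, sd, lower, upper, rng):
    """Draw from N(mean, sd) restricted to [lower, upper] via the inverse CDF."""
    if upper <= lower:
        return lower
    dist = NormalDist(mean, sd)
    u = rng.uniform(dist.cdf(lower), dist.cdf(upper))
    u = min(max(u, 1e-12), 1 - 1e-12)  # keep inv_cdf inside its open domain
    return min(max(dist.inv_cdf(u), lower), upper)

def impute_sequentially(record, total, fractions, rel_sd, rng):
    """Fill None entries one at a time so items are >= 0 and sum to `total`."""
    x = list(record)
    missing = [i for i, v in enumerate(x) if v is None]
    remainder = total - sum(v for v in x if v is not None)
    if remainder < 0:
        raise ValueError("observed items already exceed the total")
    for i in missing[:-1]:
        # Conditional mean depends on what earlier steps left to allocate.
        mean = fractions[i] * remainder
        x[i] = draw_truncated_normal(mean, rel_sd * remainder, 0.0, remainder, rng)
        remainder -= x[i]
    if missing:
        # The balance restriction determines the final missing item.
        x[missing[-1]] = remainder
    return x

rng = random.Random(1)
fractions = [0.3, 0.4, 0.5, 0.5]   # hypothetical conditional-mean fractions
result = impute_sequentially([30.0, None, None, None], 100.0, fractions, 0.2, rng)
```

Truncating each draw to the interval that is still feasible is what lets the inequality and balance restrictions hold at the same time; in a realistic setting each conditional model would be estimated from the data and conditioned on the observed and previously imputed variables.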

Tempelman, D. C. G. (2007). Imputation of restricted data: Applications to business surveys. Dissertation, University of Groningen.