Processing of erroneous and unsafe data

In this dissertation, we examine two different, but related, topics. The first topic is statistical data editing, which occurs during data processing and analysis. The goal of statistical data editing is to detect and correct incorrect data. To achieve this goal, the observed data are enriched through subject matter expertise and statistical analysis. In effect, we are trying to create more information than we have observed.
The second topic is statistical disclosure control, which occurs at the end of the statistical process. The goal of statistical disclosure control is to prevent sensitive information about individual respondents, or small groups of respondents, from being derived from published data. To achieve this goal, data are often removed, or the information in the data is reduced by adding noise or collapsing (recoding) variables. Here, in effect, we are trying to reduce the information in the data.
We first describe some well-known techniques for making the data editing process efficient. In later chapters we concentrate mainly on the so-called error localisation problem, that is, the problem of detecting incorrect data, for a mixture of categorical and continuous data. In the second part of this dissertation, we concentrate on statistical disclosure control. In a general overview of the field, we focus on a general approach to statistical disclosure control of microdata, that is, the data of individual respondents, and of social statistics microdata in particular. This approach is applied at various statistical institutes, including CBS, and is elaborated upon in subsequent chapters.
Waal, A. G. de (2003). Processing of erroneous and unsafe data. Dissertation, Erasmus University Rotterdam.