Balanced k-Nearest Neighbor Imputation
Date issued
May 22, 2016
In
Statistics
No
105
From page
11
To page
23
Subjects
missing data; nonresponse; sampling; balanced sampling; calibration; nearest neighbors
Abstract
To overcome the problem of item nonresponse, random imputation methods are often used because they tend to preserve the distribution of the imputed variable. Among the random imputation methods, the random hot-deck has the interesting property of imputing only observed values. A new random hot-deck imputation method is proposed. The key innovation of this method is that the selection of donors is treated as a sampling problem, which is solved using calibration and balanced sampling. This approach makes it possible to select donors such that, if the auxiliary variables were imputed, their estimated totals would not change. As a consequence, very accurate and stable estimates of totals can be obtained. Moreover, donors are selected in neighborhoods of recipients, so that the missing value of a recipient is replaced with an observed value of a similar unit. This second approach can greatly improve the quality of the estimates. Finally, these two approaches imply underlying models, and the method is resistant to model misspecification.
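As a rough illustration of the neighborhood idea only, the sketch below draws each donor at random from the k observed units closest to the recipient on a single auxiliary variable. It is not the method proposed in the article: the calibration and balanced sampling steps that keep the auxiliary totals unchanged are omitted, and the function name, the use of one auxiliary variable, and the absolute-difference distance are all assumptions made for the example.

```python
import random


def knn_random_hot_deck(x, y, k=3, seed=0):
    """Simplified random hot-deck within k nearest neighbours.

    x: auxiliary values, observed for every unit.
    y: variable of interest, with None marking item nonresponse.
    Hypothetical helper; the balancing and calibration constraints
    described in the abstract are NOT implemented here.
    """
    rng = random.Random(seed)
    # Donors are the units whose y value was observed.
    donors = [i for i, value in enumerate(y) if value is not None]
    if not donors:
        raise ValueError("at least one observed value is required")

    imputed = list(y)
    for j, value in enumerate(y):
        if value is None:
            # The k donors closest to recipient j on the auxiliary variable.
            neighbours = sorted(donors, key=lambda i: abs(x[i] - x[j]))[:k]
            # Impute an observed value drawn at random from that neighbourhood.
            imputed[j] = y[rng.choice(neighbours)]
    return imputed
```

Because every imputed value is copied from a donor, the result contains only values that were actually observed, which is the hot-deck property mentioned in the abstract. In the article's method the random draw would additionally be constrained so that the auxiliary totals are preserved.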
Publication type
journal article
