Predictive mean matching

Predictive mean matching (PMM)[1] is a widely used[2] statistical imputation method for missing values, first proposed by Donald B. Rubin in 1986[3] and R. J. A. Little in 1988.[4]

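It aims to reduce the bias that imputation introduces into a dataset by filling each missing entry with a real value drawn from the observed data.[5] For each observation with a missing value, the method builds a small set of candidate donors: complete observations whose predicted value of the target variable is closest to the predicted value for the incomplete observation. One donor is then drawn at random from this set, and its observed value replaces the missing one.[1]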
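The matching step can be illustrated with a minimal Python sketch. It assumes a single numeric target variable, an ordinary least squares prediction model, and a donor pool of size k = 5; the function name `pmm_impute`, its parameters, and the simulated income data are hypothetical choices made for illustration only. Full implementations typically also draw the regression coefficients at random to reflect parameter uncertainty, a step this simplified version omits.

```python
import numpy as np


def pmm_impute(X, y, k=5, rng=None):
    """Fill NaNs in y by a simplified predictive mean matching on predictors X."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    observed = ~missing

    # Design matrix with an intercept column (X may be 1-D or 2-D).
    design = np.column_stack([np.ones(len(y)), X])

    # Fit ordinary least squares on the complete cases only.
    beta, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)

    # Predicted means for donors (observed) and recipients (missing).
    pred_obs = design[observed] @ beta
    pred_mis = design[missing] @ beta
    y_obs = y[observed]

    k = min(k, len(y_obs))
    draws = np.empty(len(pred_mis))
    for i, p in enumerate(pred_mis):
        # The k donors whose predicted mean is closest to this recipient's.
        donors = np.argsort(np.abs(pred_obs - p))[:k]
        # Impute the observed value of one randomly chosen donor.
        draws[i] = y_obs[rng.choice(donors)]

    imputed = y.copy()
    imputed[missing] = draws
    return imputed


# Simulated example: a strictly positive, right-skewed "income" variable.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
income = np.exp(0.3 * x + rng.normal(0, 0.5, size=200))
income_missing = income.copy()
income_missing[rng.choice(200, size=40, replace=False)] = np.nan

completed = pmm_impute(x, income_missing, k=5, rng=1)
```

Because every imputed value is copied from an observed case, the completed variable cannot contain values outside the observed range, whereas imputing the linear prediction directly could produce them.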

Compared to other imputation methods, it usually imputes fewer implausible values (e.g. negative incomes) and takes heteroscedastic data into account more appropriately.[6]

References

  1. "3.4 Predictive mean matching". http://stefvanbuuren.name/fimd/sec-pmm.html. Retrieved 30 June 2019.
  2. "Web of Science [v.5.32] – All Databases Results". https://apps.webofknowledge.com/Search.do?product=UA&SID=C4t3fjbEtlzE2IBRqfF&search_mode=GeneralSearch&prID=ab90c05a-ae5f-4d37-9f59-45f422547688. Retrieved 30 June 2019. 
  3. Rubin, Donald B. (30 June 1986). "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations". Journal of Business & Economic Statistics 4 (1): 87–94. doi:10.2307/1391390. 
  4. Little, Roderick J. A. (30 June 1988). "Missing-Data Adjustments in Large Surveys". Journal of Business & Economic Statistics 6 (3): 287–296. doi:10.2307/1391878. 
  5. "Imputation by Predictive Mean Matching: Promise & Peril – Statistical Horizons". https://statisticalhorizons.com/predictive-mean-matching. Retrieved 30 June 2019. 
  6. "Predictive Mean Matching Imputation (Example in R)". https://statisticsglobe.com/predictive-mean-matching-imputation-method/.