Abstract
Missing data are a common problem in epidemiologic studies and are often addressed by omitting incomplete records or by multiple imputation. Although these methods can produce unbiased estimates of study associations, their validity becomes questionable when data are missing not at random (MNAR) and the missing data mechanism is nonignorable. This situation typically arises when the presence of missing values depends on characteristics of the measurement or recording process, which is common in surveys and in databases of electronic healthcare records. In this article, we discuss the relevance and implementation of Heckman selection models to impute variables that are missing not at random.
Original language | English |
---|---|
Pages (from-to) | 5-13 |
Number of pages | 9 |
Journal | International Journal of Epidemiology |
Volume | 52 |
Issue number | 1 |
Publication status | Published - 1 Feb 2023 |
Keywords
- Heckman selection model
- exclusion restriction variables
- selection bias
- missing data
- causal inference
- real world data