Hide/Show Apps

Bayesian semiparametric models for nonignorable missing datamechanisms in logistic regression

Öztürk, Olcay
In this thesis, Bayesian semiparametric models for the missing data mechanisms of nonignorably missing covariates in logistic regression are developed. In the missing data literature, fully parametric approach is used to model the nonignorable missing data mechanisms. In that approach, a probit or a logit link of the conditional probability of the covariate being missing is modeled as a linear combination of all variables including the missing covariate itself. However, nonignorably missing covariates may not be linearly related with the probit (or logit) of this conditional probability. In our study, the relationship between the probit of the probability of the covariate being missing and the missing covariate itself is modeled by using a penalized spline regression based semiparametric approach. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm to estimate the parameters is established. A WinBUGS code is constructed to sample from the full conditional posterior distributions of the parameters by using Gibbs sampling. Monte Carlo simulation experiments under different true missing data mechanisms are applied to compare the bias and efficiency properties of the resulting estimators with the ones from the fully parametric approach. These simulations show that estimators for logistic regression using semiparametric missing data models maintain better bias and efficiency properties than the ones using fully parametric missing data models when the true relationship between the missingness and the missing covariate has a nonlinear form. They are comparable when this relationship has a linear form.