A simulation study on the comparison of methods for the analysis of longitudinal count data

İnan, Gül
Because longitudinal count data (LCD) consist of repeated measurements on the same subjects and arise from a counting process, regression models for LCD must account for phenomena such as within-subject association and overdispersion. A common problem in longitudinal studies is missing data, which further complicates the analysis. Missingness can be handled with missing data techniques; however, the performance of these techniques depends on both the amount of missingness and the underlying missingness mechanism. Among the regression models for LCD, this thesis focuses on the Log-Log-Gamma marginalized multilevel model (Log-Log-Gamma MMM) and the random-intercept model. The performance of the two models is compared via a simulation study under three missing data mechanisms (missing completely at random, missing at random conditional on observed data, and missing not at random), two missingness percentages (10% and 20%), and four missing data techniques (complete case analysis, subject mean imputation, occasion mean imputation, and conditional mean imputation). The simulation study shows that, although the mean absolute error and mean square error values of the Log-Log-Gamma MMM are larger than those of the random-intercept model, both regression models yield similar results. The results confirm that the amount of missingness and the missingness mechanism strongly influence the performance of missing data techniques under both regression models. Furthermore, while occasion mean imputation generally performs worst, conditional mean imputation outperforms occasion and subject mean imputation and gives results comparable to those of complete case analysis.
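
To make three of the missing data techniques named above concrete, the following is a minimal sketch, not the thesis's own implementation. It assumes the usual textbook definitions: complete case analysis drops every subject with a missing value, subject mean imputation fills a gap with the mean of the same subject's observed counts, and occasion mean imputation fills it with the mean of the observed counts at the same occasion. The function names, the toy data, and the use of `None` for a missing value are illustrative assumptions. Conditional mean imputation is omitted because it requires a fitted regression model.

```python
from statistics import mean

# Rows are subjects, columns are measurement occasions; None marks a missing count.
# All names and data here are hypothetical and chosen only for illustration.

def complete_cases(data):
    """Complete case analysis: keep only subjects with no missing occasion."""
    return [row for row in data if all(v is not None for v in row)]

def subject_mean_impute(data):
    """Fill each missing value with the mean of that subject's observed values."""
    out = []
    for row in data:
        observed = [v for v in row if v is not None]
        # A subject with no observed value at all stays missing.
        fill = mean(observed) if observed else None
        out.append([fill if v is None else v for v in row])
    return out

def occasion_mean_impute(data):
    """Fill each missing value with the mean of the observed values at that occasion."""
    n_occasions = len(data[0])
    fills = []
    for j in range(n_occasions):
        observed = [row[j] for row in data if row[j] is not None]
        # An occasion with no observed value at all stays missing.
        fills.append(mean(observed) if observed else None)
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in data]

counts = [
    [2, 4, None],
    [1, None, 3],
    [0, 2, 1],
]

print(complete_cases(counts))
print(subject_mean_impute(counts))
print(occasion_mean_impute(counts))
```

Note that the imputed values are means and need not be integers, so an imputed data set is no longer strictly count-valued; how that is handled before model fitting is a design choice this sketch does not address.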


