APPLICATION OF CLASSICAL AND MACHINE LEARNING MODELS ON LONGITUDINAL DATA WITH BINARY RESPONSE

2023-9-11
Arslan, Rümeysa Rana
Alzheimer’s disease (AD) is a significant global health issue among older adults that affects both individuals and society. Because its symptoms are observed over time, the resulting data have a longitudinal structure. Both classical statistical models and machine learning algorithms can be used to analyze such datasets. This study consists of two parts. First, a real dataset is used to identify the features affecting dementia status and to compare the performance of the models. Second, a simulation study based on the real dataset, with varying numbers of subjects and an equal number of time points for each subject, is conducted to apply the models and compare their performance. Classical mixed models, their extended versions, and hybrid models, including Boruta, GEE, GLMM, HGLM, GLMMLasso, GPBoost, GLMMTree, and HRF, are used in both parts. For the real dataset, GPBoost learns and classifies dementia status well but overfits because of the small sample size, while tree-based algorithms predict dementia status efficiently when a new subject enters the study. In the simulation study, all methods give similar results, but HGLM, GPBoost, and GLMMLasso perform better regardless of the sample size and balance of the dataset.
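To make the simulation design concrete, the following is a minimal sketch of how balanced longitudinal data with a binary response could be generated. It assumes a random-intercept logistic model; the abstract does not state the thesis's actual data-generating process, so the function name `simulate_longitudinal_binary` and every parameter value here (`beta0`, `beta_time`, `sigma_b`) are hypothetical.

```python
import math
import random


def simulate_longitudinal_binary(n_subjects, n_timepoints, beta0=-1.0,
                                 beta_time=0.3, sigma_b=1.0, seed=0):
    """Simulate a balanced longitudinal dataset with a binary response.

    Assumption: a random-intercept logistic model. The coefficients are
    illustrative only and are not taken from the thesis.
    """
    rng = random.Random(seed)
    rows = []
    for subject in range(n_subjects):
        # A subject-specific random intercept induces correlation among
        # the repeated measurements of the same subject.
        b_i = rng.gauss(0.0, sigma_b)
        for t in range(n_timepoints):
            # Linear predictor on the logit scale.
            eta = beta0 + beta_time * t + b_i
            p = 1.0 / (1.0 + math.exp(-eta))
            # Draw the binary response from a Bernoulli(p) distribution.
            y = 1 if rng.random() < p else 0
            rows.append({"subject": subject, "time": t, "y": y})
    return rows


# Every subject contributes the same number of time points (balanced design).
rows = simulate_longitudinal_binary(n_subjects=50, n_timepoints=4)
```

Varying `n_subjects` while holding `n_timepoints` fixed mirrors the design described above: different numbers of subjects with an equal number of time points for each.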
Citation Formats
R. R. Arslan, “APPLICATION OF CLASSICAL AND MACHINE LEARNING MODELS ON LONGITUDINAL DATA WITH BINARY RESPONSE,” M.S. - Master of Science, Middle East Technical University, 2023.