IRREGULAR LONGITUDINAL DATA ANALYSIS WITH STATISTICAL AND MACHINE LEARNING METHODS IN ASTEROID DATASET
Date
2023-9-11
Author
Tanrıverdi, İrem
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Scientific research on asteroids began to gain recognition and importance in the 18th century. Records are kept of the characteristics of asteroids that entered Earth's orbit, and their hazard status is classified. Analyzing these records requires methods that account for the longitudinal structure of the data. Previous studies of Near-Earth Asteroid (NEA) data, however, used methods that ignore the dependency in the data. This thesis therefore applies various statistical and machine learning methods to NEA data to overcome these shortcomings. We analyze data on 751 asteroids observed at irregular time intervals, obtained from the National Aeronautics and Space Administration (NASA). We compare algorithms suitable for longitudinal data, such as the Generalized Linear Mixed Model (GLMM), marginal model, GLMM-Tree, Historical Random Forest, GPBoost, and Spline. The accuracies of the models range from 0.89 to 0.99. The GPBoost model has the highest performance, while the marginal model has the poorest. NEA data are then simulated with different numbers of subjects and regular time points. In the simulations, model performance increases as the number of subjects and the number of time points increase. The model with the highest performance is again GPBoost, while the model with the poorest performance for small sample sizes is GLMM-Tree.
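The abstract does not describe the simulation design in detail. As a rough illustration of what correlated binary observations on repeated subjects can look like, the sketch below draws data from a random-intercept logistic model, the basic data-generating form behind a GLMM. The function name `simulate_longitudinal`, the covariate `x`, the coefficient values, and the exponential gaps between visits are all hypothetical choices made for this example, not the thesis's procedure.

```python
import math
import random


def simulate_longitudinal(n_subjects, n_times, irregular=True, seed=0):
    """Simulate binary longitudinal data from a random-intercept logistic model.

    Returns a list of (subject_id, time, x, y) tuples.
    All numeric settings below are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    beta0, beta_time, beta_x = -1.0, 0.3, 1.5  # hypothetical fixed effects
    sigma_b = 1.0                              # hypothetical random-intercept SD
    rows = []
    for subject in range(n_subjects):
        # One intercept per subject: this shared term is what makes the
        # repeated observations of the same subject correlated.
        b = rng.gauss(0.0, sigma_b)
        if irregular:
            # Irregular design: visit times are cumulative random gaps.
            times, t = [], 0.0
            for _ in range(n_times):
                t += rng.expovariate(1.0)
                times.append(t)
        else:
            # Regular design: equally spaced visits 1, 2, ..., n_times.
            times = [float(k + 1) for k in range(n_times)]
        for t in times:
            x = rng.gauss(0.0, 1.0)
            eta = beta0 + beta_time * t + beta_x * x + b
            p = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
            y = 1 if rng.random() < p else 0
            rows.append((subject, t, x, y))
    return rows


if __name__ == "__main__":
    data = simulate_longitudinal(n_subjects=5, n_times=4)
    for row in data[:4]:
        print(row)
```

Varying `n_subjects` and `n_times` mimics the idea of studying how sample size affects model performance; a method that treats every row as independent ignores the shared subject-level term.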
Subject Keywords
Marginal Models, Machine Learning Algorithms, Irregular Time Points, Generalized Linear Mixed Models, A-Spline, P-Spline, Near Earth Asteroids, Astronomy, Decision Trees, Random Forest, GPBoost, Longitudinal data, Correlated data, Inverse Intensity Weighting
URI
https://hdl.handle.net/11511/105329
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation (IEEE)
İ. Tanrıverdi, “IRREGULAR LONGITUDINAL DATA ANALYSIS WITH STATISTICAL AND MACHINE LEARNING METHODS IN ASTEROID DATASET,” M.S. - Master of Science, Middle East Technical University, 2023.