Comparative performance analysis of variable selection methods in linear models: A full factorial design simulation study
Date
2024-7-30
Author
Bi, Mehmet
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Variable selection is an important preprocessing step in statistical modeling, aimed at improving model performance by identifying the most relevant variables. Despite the abundance of variable selection techniques, there remains a gap in understanding their comparative effectiveness across diverse datasets and conditions. Therefore, in this study we systematically evaluate a wide range of variable selection methods, covering the three main families (filter, wrapper, and embedded) through widely known representatives of each. By employing a full factorial design (64 scenarios), we examine the interactions between different factors and various dataset characteristics, such as sample size, number of variables, variable correlation, error, and outliers. This experimental framework allows for an in-depth assessment of each method's performance across multiple evaluation metrics, including accuracy and test and training error. The results reveal the strengths and limitations of each variable selection method, providing practical guidance for practitioners in choosing the most appropriate technique for their specific applications. Furthermore, the findings highlight the importance of context-dependent method selection: no single variable selection method universally outperforms the others across all scenarios. Among the methods studied, the results suggest Least Absolute Shrinkage and Selection Operator (LASSO), Forward Feature Selection, and Recursive Feature Elimination (RFE) as candidates, depending on the data characteristics. Overall, this study contributes to the field of statistics by offering a case-specific manual and a thorough statistical evaluation of variable selection methods, thereby aiding in the development of more efficient and accurate predictive models.
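To make the idea of a full factorial design concrete, the sketch below enumerates every combination of factor levels. It is only an illustration under stated assumptions: the abstract does not give the factor levels or the exact factor layout, so the two-level coding, the level labels, and the placeholder `unnamed_factor` are hypothetical. Six two-level factors is simply one layout that yields 64 scenarios.

```python
from itertools import product

# Hypothetical factors and levels. The abstract names the first five
# characteristics but not their levels; "unnamed_factor" is a placeholder
# added only so that six two-level factors give 2**6 = 64 scenarios.
FACTORS = {
    "sample_size": ("low", "high"),
    "n_variables": ("low", "high"),
    "correlation": ("low", "high"),
    "error": ("low", "high"),
    "outlier": ("absent", "present"),
    "unnamed_factor": ("low", "high"),
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


scenarios = full_factorial(FACTORS)
print(len(scenarios))  # 64
```

In a simulation study of this kind, each scenario dictionary would drive one data-generating configuration, and every variable selection method would be run on data drawn from every configuration so that factor interactions can be compared.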
Subject Keywords
Simulation, Variable selection, Classification accuracy, Model performance, Synthetic data
URI
https://hdl.handle.net/11511/110503
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation Formats
M. Bi, “Comparative performance analysis of variable selection methods in linear models: A full factorial design simulation study,” M.S. - Master of Science, Middle East Technical University, 2024.