A cluster tree based model selection approach for logistic regression classifier
Date
2018-01-01
Author
Tanju, Ozge
Kalaylıoğlu Akyıldız, Zeynep Işıl
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Model selection methods are important for identifying the best approximating model. To identify the most meaningful model, the purpose of the model should be clearly stated in advance. The focus of this paper is model selection when the modelling purpose is classification. We propose a new model selection approach designed for logistic regression when the main modelling purpose is classification. The method is based on the distance between two clustering trees. We also question and evaluate the performance of conventional model selection methods based on information-theoretic concepts in determining the best logistic regression classifier. An extensive simulation study is used to assess the finite-sample performance of the cluster tree based and the information-theoretic model selection methods. The simulations account for whether the true model is in the candidate set or not. Results show that the new approach is highly promising. Finally, the methods are applied to a real data set to select a binary model as a means of classifying subjects with respect to their risk of breast cancer.
Subject Keywords
Statistics, Probability and Uncertainty; Modelling and Simulation; Statistics and Probability; Applied Mathematics
URI
https://hdl.handle.net/11511/36456
Journal
Journal of Statistical Computation and Simulation
DOI
https://doi.org/10.1080/00949655.2018.1437442
Collections
Department of Statistics, Article
Citation (IEEE)
O. Tanju and Z. I. Kalaylıoğlu Akyıldız, “A cluster tree based model selection approach for logistic regression classifier,” Journal of Statistical Computation and Simulation, pp. 1394–1414, 2018. [Online]. Available: https://hdl.handle.net/11511/36456.