Classification of Imbalanced Credit Data Sets with Borrower-Specific Cost-Sensitive Algorithms
Download
Phd_Thesis_YaseminYK.pdf
Date
2023-06-02
Author
Yaman Kanmaz, Yasemin.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
In imbalanced credit data sets, unequal class distributions lead to two types of prediction errors that incur different costs: monetary losses for misclassified defaults and the opportunity cost of lost interest income for misclassified non-defaults. To address these issues, this study proposes a novel approach to cost-sensitive learning and imbalanced data classification in credit data sets, using new borrower (instance)-specific cost/risk parameters to address these two types of asymmetry. The main objective of this study is to create a weight that signals the risk level of each instance by revealing instance-embedded information, to strengthen ordinary algorithms with the generated weight, and to break the dominance of the majority class in the loss functions. The default probabilities of credit applicants provide valuable information about their risk level, and thus new instance-specific cost/risk parameters based on their default risk levels are proposed instead of class-specific ratios. Default probabilities are estimated on sampled sub-datasets; before this step, the optimal class ratio of the sub-datasets is analyzed with the Simulated Annealing stochastic process. To estimate the default probabilities, complex non-linear models such as logistic regression, deep learning-based Graph Neural Networks, and Graph Attention Networks are employed. Three cost/risk parameters are generated with the target of equalizing the class losses based on their class-based default risk level aggregations. The AdaBoost, XGBoost, and ANN algorithms are then modified to incorporate these new parameters, and empirical analyses are conducted using eight credit data sets. The success of the proposed algorithms is particularly evident on data sets with higher class ratios. The comparative analyses indicate that, at given Specificity values, the new cost-sensitive algorithms reduce the monetary loss by up to 33.7% in the data set with the highest class imbalance.
Subject Keywords
Instance-specific, Default probability, Logistic regression, Graph neural networks, Graph attention networks, Artificial neural networks, XGBoost, AdaBoost
URI
https://hdl.handle.net/11511/104424
Collections
Graduate School of Applied Mathematics, Thesis
Citation Formats
Y. Yaman Kanmaz, “Classification of Imbalanced Credit Data Sets with Borrower-Specific Cost-Sensitive Algorithms,” Ph.D. - Doctoral Program, Middle East Technical University, 2023.