Overlapping region-driven sample generation for imbalanced learning
Date
2026-1
Author
Öztürk, Buğra
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Classifying imbalanced datasets is one of the most challenging problems in machine learning. The difficulty stems from the insufficient representation of the minority class in the dataset, which degrades the performance of the classification algorithm, biasing it toward the majority class and suppressing the minority class. Although several oversampling algorithms have been introduced to address this issue, most of them overlook the intrinsic structure of the dataset, specifically the region where the classes overlap. The overlapping region is the primary factor that triggers the imbalanced learning problem. This thesis proposes an approach that prioritizes the features of the overlapping region during sample generation. The approach is further enhanced with clustering to discover local structures within the overlapping region. Based on this approach, 16 algorithms are introduced, grouped into four distinct sampling strategies and comprising 6 non-clustering-based and 10 clustering-based algorithms. These algorithms identify the overlapping region, assign weights to samples according to their location, and generate new samples using the introduced sampling strategies. The proposed algorithms are evaluated on 20 datasets using 6 performance metrics: overall accuracy, minority class accuracy, majority class accuracy, F1-score, G-mean, and Equitable Accuracy Score (EAS). The results show that the proposed algorithms outperform SMOTE, G-SMOTE, and Borderline-SMOTE. These findings validate that prioritizing the overlapping region in oversampling significantly improves classifier performance, both in accuracy and in the fairness of accuracy between the classes.
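The three steps named in the abstract (identify the overlapping region, weight samples by location, generate samples) can be illustrated with a minimal sketch. This is not any of the thesis's 16 algorithms: the weighting rule (fraction of majority-class points among the k nearest neighbours), the SMOTE-style linear interpolation, and all function names (`nearest`, `overlap_weights`, `oversample`) are assumptions chosen for illustration, and clustering is omitted.

```python
import math
import random


def nearest(points, candidates, query, k):
    """Return the k candidate indices closest to points[query], excluding query itself."""
    others = [i for i in candidates if i != query]
    others.sort(key=lambda i: math.dist(points[query], points[i]))
    return others[:k]


def overlap_weights(X, y, minority_label, k=3):
    """Assumed weighting rule: share of majority points among the k nearest neighbours.

    A minority sample surrounded by majority points (deep in the overlap) gets a
    weight near 1; one surrounded by its own class gets a weight near 0.
    """
    everyone = range(len(X))
    weights = {}
    for i in everyone:
        if y[i] == minority_label:
            nbrs = nearest(X, everyone, i, k)
            weights[i] = sum(y[j] != minority_label for j in nbrs) / len(nbrs)
    return weights


def oversample(X, y, minority_label, n_new, k=3, seed=0):
    """Generate n_new synthetic minority points, favouring seeds in the overlap.

    Requires at least two minority samples so that every seed has a partner.
    """
    rng = random.Random(seed)
    weights = overlap_weights(X, y, minority_label, k)
    minority = list(weights)
    probs = [weights[i] for i in minority]
    if sum(probs) == 0:  # no overlap detected: fall back to uniform sampling
        probs = [1.0] * len(minority)
    synthetic = []
    for _ in range(n_new):
        base = rng.choices(minority, weights=probs, k=1)[0]
        mate = rng.choice(nearest(X, minority, base, k))  # a nearby minority sample
        t = rng.random()
        # Linear interpolation between the seed and its partner (SMOTE-style).
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(X[base], X[mate])))
    return synthetic


# Toy data: class 1 is the minority; the point (1.2, 1.1) sits next to the majority.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (3, 3), (3.2, 3.1), (1.2, 1.1)]
y = [0, 0, 0, 0, 0, 1, 1, 1]
new_points = oversample(X, y, minority_label=1, n_new=5)
```

In this toy setup the minority point closest to the majority class receives the largest weight, so it is chosen as a seed more often than the two well-separated minority points.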
Subject Keywords
Machine learning, Imbalanced learning, Oversampling, Overlapping region, Clustering
URI
https://hdl.handle.net/11511/118460
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation (IEEE)
B. Öztürk, “Overlapping region-driven sample generation for imbalanced learning,” M.S. - Master of Science, Middle East Technical University, 2026.