Overlapping region-driven sample generation for imbalanced learning

2026-1
Öztürk, Buğra
Classifying imbalanced datasets is one of the most challenging problems in machine learning. The difficulty is caused by the insufficient representation of the minority class in the dataset, which degrades the performance of the classification algorithm by biasing it toward the majority class and suppressing the minority class. While several oversampling algorithms have been introduced to address this issue, most of them overlook the intrinsic structure of the dataset, specifically its overlapping region, where samples of the two classes are intermixed. The overlapping region is the primary factor that triggers the imbalanced learning problem. This thesis proposes an approach that prioritizes the features of the overlapping region in sample generation. The proposed approach is further enhanced by clustering to discover the local structures in the overlapping region. Based on this approach, 16 algorithms are introduced, organized into four distinct sampling strategies and comprising 6 non-clustering-based and 10 clustering-based algorithms. These algorithms identify the overlapping region, assign weights to samples depending on their location, and generate new samples using the introduced sampling strategies. The effectiveness of the proposed algorithms is evaluated on 20 datasets using 6 performance metrics: overall accuracy, minority class accuracy, majority class accuracy, F1-score, G-mean, and Equitable Accuracy Score (EAS). The results show that the proposed algorithms outperform SMOTE, G-SMOTE, and Borderline-SMOTE. These findings indicate that prioritizing the overlapping region in oversampling significantly improves classifier performance, both in overall accuracy and in the balance between the accuracies of the two classes.
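The abstract does not specify how the overlapping region is detected or how the weights are defined, so the following is only a minimal sketch of the general idea (identify overlap, weight samples by location, generate samples), not any of the 16 algorithms in the thesis. It assumes a k-nearest-neighbour test for overlap, a weight equal to the fraction of majority-class neighbours, and SMOTE-style linear interpolation; the function names `overlap_weights` and `oversample_overlap` are hypothetical.

```python
import numpy as np


def overlap_weights(X_min, X_maj, k=5):
    """Fraction of majority samples among each minority sample's k nearest neighbours.

    Assumption: a minority sample with a mixed neighbourhood lies in the overlap.
    """
    X_all = np.vstack([X_min, X_maj])
    is_maj = np.r_[np.zeros(len(X_min), dtype=bool), np.ones(len(X_maj), dtype=bool)]
    # Distances from every minority sample to every sample in the dataset.
    d = np.linalg.norm(X_min[:, None, :] - X_all[None, :, :], axis=2)
    idx = np.arange(len(X_min))
    d[idx, idx] = np.inf  # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]
    return is_maj[nn].mean(axis=1)


def oversample_overlap(X_min, X_maj, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples, favouring the overlapping region."""
    rng = np.random.default_rng(seed)
    w = overlap_weights(X_min, X_maj, k)
    # Assumption: weight 0 means a safe sample, weight 1 means likely noise;
    # only samples strictly in between are treated as overlapping.
    p = np.where((w > 0) & (w < 1), w, 0.0)
    if p.sum() == 0:  # no overlap detected: fall back to uniform sampling
        p = np.ones(len(X_min))
    p = p / p.sum()
    seeds = rng.choice(len(X_min), size=n_new, p=p)

    # SMOTE-style step: interpolate toward a random nearby minority sample.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)
    k_min = min(k, len(X_min) - 1)
    nn_min = np.argsort(d_min, axis=1)[:, :k_min]
    partners = nn_min[seeds, rng.integers(0, k_min, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[seeds] + gap * (X_min[partners] - X_min[seeds])
```

Under these assumptions, calling `oversample_overlap(X_min, X_maj, n_new=len(X_maj) - len(X_min))` would balance the two classes while concentrating the synthetic samples near the class boundary. A clustering-based variant, as the abstract describes, would presumably first group the overlapping samples and then interpolate within each cluster.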
Citation Formats
B. Öztürk, “Overlapping region-driven sample generation for imbalanced learning,” M.S. - Master of Science, Middle East Technical University, 2026.