A matheuristic for binary classification of data sets using hyperboxes

In this study, an optimization approach is proposed for the binary classification problem. A Mixed Integer Programming (MIP) model formulation is used to construct hyperboxes as classifiers, minimizing the number of misclassified and unclassified samples as well as overlapping of hyperboxes. The hyperboxes are determined by some lower and upper bounds on the feature values, and overlapping of these hyperboxes is allowed to keep a balance between misclassification and overfitting. A matheuristic, namely Iterative Classification procedure for Binary classes (ICB) is developed based on the MIP formulation. In each iteration of the ICB algorithm, a fixed number of hyperboxes are generated using the MIP model, and then a trimming algorithm is used to adjust the hyperboxes in a way to eliminate the misclassified samples. Some trimmed hyperboxes and sample assignments are then fixed, reducing the unclassified sample size left for the next iteration. ICB controls the number of hyperboxes in a greedy manner, but provides an overall hyperbox configuration with no misclassification by the end of the training phase. For the test phase, distance-based heuristic algorithms are also developed to classify the uncovered and overlap samples that are not classified by the hyperboxes.
Citation Formats
D. Akbulut, C. İyigün, and N. E. Özdemirel, “A matheuristic for binary classification of data sets using hyperboxes,” presented at the EURO 2018 - 29th European Conference on Operational Research, July 8-11 2018, Valencia, Spain, 2018, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/86765.