BenchMetrics: a systematic benchmarking method for binary classification performance metrics

Canbek, Gurol
Taşkaya Temizel, Tuğba
This paper proposes BenchMetrics, a systematic benchmarking method for analyzing and comparing the robustness of binary classification performance metrics that are based on the confusion matrix of a crisp classifier. BenchMetrics introduces new concepts such as meta-metrics (metrics about metrics) and metric space, and has been tested on fifteen well-known metrics, including balanced accuracy, normalized mutual information, Cohen's Kappa, and Matthews correlation coefficient (MCC), along with two metrics recently proposed in the literature: optimized precision and index of balanced accuracy. The method formally presents a pseudo-universal metric space in which all permutations of confusion matrix elements yielding the same sample size are calculated. It evaluates the metrics and metric spaces in a two-stage benchmark based on eighteen newly proposed criteria, and finally ranks the metrics by aggregating the criteria results. The first stage, a mathematical evaluation, analyzes the metrics' equations, specific confusion matrix variations, and the corresponding metric spaces. The second stage, which includes seven novel meta-metrics, evaluates the robustness aspects of the metric spaces. We interpret each benchmarking result and comparatively assess the effectiveness of BenchMetrics against the limited comparison studies in the literature. The results demonstrate that widely used metrics have significant robustness issues, and that MCC is the most robust metric and is the one recommended for binary classification performance evaluation.
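The idea of a metric space over a fixed sample size can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the space consists of every assignment of non-negative counts to TP, FP, FN, and TN that sums to the sample size, and the function names (`confusion_matrices`, `metric_space`, `mcc`, `accuracy`) are hypothetical. MCC is returned as NaN where its denominator is zero, which is one possible convention among several.

```python
import math


def confusion_matrices(n):
    """Yield every (tp, fp, fn, tn) of non-negative integers summing to n."""
    for tp in range(n + 1):
        for fp in range(n - tp + 1):
            for fn in range(n - tp - fp + 1):
                yield tp, fp, fn, n - tp - fp - fn


def accuracy(tp, fp, fn, tn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; NaN when the denominator is zero."""
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return math.nan  # assumption: treat the undefined case as NaN
    return (tp * tn - fp * fn) / math.sqrt(denom)


def metric_space(metric, n):
    """Evaluate one metric on every confusion matrix of sample size n."""
    return [metric(*cm) for cm in confusion_matrices(n)]


if __name__ == "__main__":
    n = 10
    acc_space = metric_space(accuracy, n)
    mcc_space = metric_space(mcc, n)
    undefined = sum(math.isnan(v) for v in mcc_space)
    print(f"matrices for n={n}: {len(acc_space)}")
    print(f"MCC undefined on {undefined} of them")
    # One imbalanced case where the two metrics disagree in sign of merit.
    print(accuracy(8, 1, 1, 0), mcc(8, 1, 1, 0))
```

For a sample size of 10 the enumeration yields 286 confusion matrices. The last line shows a case with nine positives and one negative where accuracy is 0.8 while MCC is negative (-1/9), the kind of disagreement a robustness benchmark of this sort is meant to expose.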


Similarity matrix framework for data from union of subspaces
Aldroubi, Akram; Sekmen, Ali; Koku, Ahmet Buğra; Cakmak, Ahmet Faruk (2018-09-01)
This paper presents a framework for finding similarity matrices for the segmentation of data $W = [w_1 \cdots w_N] \subset \mathbb{R}^D$ drawn from a union $U = \bigcup_{i=1}^{M} S_i$ of independent subspaces $\{S_i\}_{i=1}^{M}$ of dimensions $\{d_i\}_{i=1}^{M}$. It is shown that any factorization $W = BP$, where the columns of $B$ form a basis for the data $W$ and also come from $U$, can be used to produce a similarity matrix $\Xi_W$. In other words, $\Xi_W(i, j) \neq 0$ when the columns $w_i$ and $w_j$ of $W$ come from the same subspace, ...
Waterfall region analysis for iterative decoding
Yılmaz, Ali Özgür (2004-12-01)
Finite length analysis of iterative decoders can be done by using probabilistic models based on EXIT charts. The validity of these models will be investigated by checking the performance of iterative decoding under various scenarios.
Continuous dimensionality characterization of image structures
Felsberg, Michael; Kalkan, Sinan; Kruger, Norbert (Elsevier BV, 2009-05-04)
Intrinsic dimensionality is a concept introduced by statistics and later used in image processing to measure the dimensionality of a data set. In this paper, we introduce a continuous representation of the intrinsic dimension of an image patch in terms of its local spectrum or, equivalently, its gradient field. By making use of a cone structure and barycentric co-ordinates, we can associate three confidences to the three different ideal cases of intrinsic dimensions corresponding to homogeneous image patche...
Cooperative terrain based navigation and coverage identification using consensus
Kasebzadeh, Parinaz; Fritsche, Carsten; Özkan, Emre; Gunnarsson, Fredrik; Gustafsson, Fredrik (Institute of Electrical and Electronics Engineers Inc., 2015-07-06)
This paper presents a distributed online method for joint state and parameter estimation in a Jump Markov NonLinear System based on a distributed recursive Expectation Maximization algorithm. State inference is enabled via the use of Rao-Blackwellized Particle Filter and, for the parameter estimation, the E-step is performed independently at each sensor with the calculation of local sufficient statistics. An average consensus algorithm is used to diffuse local sufficient statistics to neighbors and approxim...
Optimising a nonlinear utility function in multi-objective integer programming
Ozlen, Melih; Azizoğlu, Meral; Burton, Benjamin A. (2013-05-01)
In this paper we develop an algorithm to optimise a nonlinear utility function of multiple objectives over the integer efficient set. Our approach is based on identifying and updating bounds on the individual objectives as well as the optimal utility value. This is done using already known solutions, linear programming relaxations, utility function inversion, and integer programming. We develop a general optimisation algorithm for use with k objectives, and we illustrate our approach using a tri-objective i...
Citation Formats
G. Canbek, T. Taşkaya Temizel, and Ş. Sağıroğlu, “BenchMetrics: a systematic benchmarking method for binary classification performance metrics,” Neural Computing & Applications, 2021.