Multi-perspective analysis and systematic benchmarking for binary-classification performance evaluation instruments

2019
Canbek, Gürol
This thesis proposes novel methods to analyze and benchmark binary-classification performance evaluation instruments. It addresses critical problems found in the literature, clarifies terminology, and distinguishes instruments as measures, metrics, and, for the first time, a new category called indicators. The multi-perspective analysis introduces novel concepts such as canonical form, geometry, duality, complementation, dependency, and leveling, each with a formal definition, as well as two new basic instruments. An indicator named Accuracy Barrier is also proposed and tested by re-evaluating the performance of surveyed machine-learning classifications. An exploratory table is designed to represent all of these concepts for over 50 instruments, and the table’s real use cases, such as domain-specific metrics reporting, are demonstrated. Furthermore, this thesis proposes a systematic three-stage benchmarking method that assesses the robustness of metrics using new concepts such as metametrics (metrics about metrics) and metric-space. Benchmarking 13 metrics reveals significant issues, especially in the conventional metrics accuracy, F1, and normalized mutual information, and identifies the Matthews Correlation Coefficient as the most robust metric. The benchmarking method is evaluated against the literature. Additionally, this thesis formally demonstrates the publication and confirmation biases that arise from reporting non-robust metrics. Finally, it gives recommendations on precise and concise performance evaluation, comparison, and reporting. The developed software library, analysis/benchmarking platform, visualization and calculator/dashboard tools, and datasets have also been released online. This research is expected to re-establish and facilitate the classification performance evaluation domain and to contribute to responsible open research by encouraging the use of the most robust and objective instruments.
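As a rough illustration of why the choice of metric matters, the sketch below computes accuracy, F1, and the Matthews Correlation Coefficient from a single confusion matrix using their standard textbook formulas. It is not the thesis's benchmarking method or its released library; the function name `confusion_metrics`, the example counts, and the convention of returning 0.0 when a denominator is zero are all assumptions made for this example.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return accuracy, F1, and MCC for one binary confusion matrix.

    Hypothetical helper for illustration only. A zero denominator is
    mapped to 0.0, which is a common convention, not a universal rule.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Assumed scenario: 95 positives, 5 negatives, and a classifier that
# labels every sample as positive.
always_positive = confusion_metrics(tp=95, fp=5, fn=0, tn=0)
for name, value in always_positive.items():
    print(f"{name}: {value:.3f}")
```

In this assumed scenario, accuracy and F1 both come out high even though the classifier never identifies a negative sample, while MCC is 0.0 under the convention above. This is one simple instance of the kind of disagreement between metrics that the abstract refers to.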

Suggestions

Designing energy-efficient high-precision multi-pass turning processes via robust optimization and artificial intelligence
Khalilpourazari, Soheyl; Khalilpourazary, Saman; ÇİFTÇİOĞLU, AYBİKE ÖZYÜKSEL; Weber, Gerhard Wilhelm (Springer Science and Business Media LLC, 2020-09-01)
This paper suggests a novel robust formulation designed for optimizing the parameters of the turning process in an uncertain environment for the first time. The aim is to achieve the lowest energy consumption and highest precision. With this aim, the current paper considers uncertain parameters, objective functions, and constraints in the offered mathematical model. We proposed several uncertain models and validated the results in real-world case studies. In addition, several artificial intelligence-based s...
Multipath Characteristics of Frequency Diverse Arrays Over a Ground Plane
Cetintepe, Cagri; Demir, Şimşek (Institute of Electrical and Electronics Engineers (IEEE), 2014-07-01)
This paper presents a theoretical framework for an analytical investigation of multipath characteristics of frequency diverse arrays (FDAs), a task which is attempted for the first time in the open literature. In particular, transmitted field expressions are formulated for an FDA over a perfectly conducting ground plane, first in a general analytical form, and these expressions are later simplified under reasonable assumptions. The developed formulation is then applied to a uniform, linear, continuous-wave opera...
An experimental study for simulation based assessment of information system design performance
Ayyildiz, Bulent; Akman, Ibrahim; Arifoğlu, Ali (2007-07-04)
This paper presents an experimental study for evaluating the decision support value of queueing network (QN) based simulation models for information system design performance. For illustration, queueing network simulation models have been extracted corresponding to three annotated design alternatives of a selected case study. The design alternatives are produced using logical requirements of the selected system. The performance of each alternative is then predicted using quantifiable parameters considering...
Stable controller design for T-S fuzzy systems based on Lie algebras
Banks, SP; Gurkan, E; Erkmen, İsmet (Elsevier BV, 2005-12-01)
In this paper, we study the stability of fuzzy control systems of Takagi-Sugeno (T-S) type based on the classical theory of Lie algebras. T-S fuzzy systems are used to model nonlinear systems as a set of rules with consequents of the type x(t) = A(l)x(t) + B(l)u(t). We conduct the stability analysis of such T-S fuzzy models using the Lie algebra LA generated by the A(l) matrices of these subsystems for each rule in the rule base. We first develop our approach of stability analysis for a commuting algebra ...
Improving reinforcement learning by using sequence trees
Girgin, Sertan; Polat, Faruk; Alhajj, Reda (Springer Science and Business Media LLC, 2010-12-01)
This paper proposes a novel approach to discover options in the form of stochastic conditionally terminating sequences; it shows how such sequences can be integrated into the reinforcement learning framework to improve the learning performance. The method utilizes stored histories of possible optimal policies and constructs a specialized tree structure during the learning process. The constructed tree facilitates the process of identifying frequently used action sequences together with states that are visit...
Citation Formats
G. Canbek, “Multi-perspective analysis and systematic benchmarking for binary-classification performance evaluation instruments,” Thesis (Ph.D.) -- Graduate School of Natural and Applied Sciences. Information Systems., Middle East Technical University, 2019.