Multi-perspective analysis and systematic benchmarking for binary-classification performance evaluation instruments
Download
index.pdf
Date
2019
Author
Canbek, Gürol
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats
275 views, 175 downloads
This thesis proposes novel methods to analyze and benchmark binary-classification performance evaluation instruments. It addresses critical problems found in the literature, clarifies the terminology, and, for the first time, distinguishes instruments as measures, metrics, and a new category, indicators. The multi-perspective analysis introduces novel concepts such as canonical form, geometry, duality, complementation, dependency, and leveling with formal definitions, as well as two new basic instruments. An indicator named Accuracy Barrier is also proposed and tested by re-evaluating the performances of surveyed machine-learning classifications. An exploratory table is designed to represent all the concepts for over 50 instruments. The table’s real use cases, such as domain-specific metrics reporting, are demonstrated. Furthermore, this thesis proposes a systematic benchmarking method comprising 3 stages to assess metrics’ robustness over new concepts such as metametrics (metrics about metrics) and metric-space. Benchmarking 13 metrics reveals significant issues, especially in the conventional metrics accuracy, F1, and normalized mutual information, and identifies Matthews Correlation Coefficient as the most robust metric. The benchmarking method is evaluated against the literature. Additionally, this thesis formally demonstrates publication and confirmation biases due to reporting non-robust metrics. Finally, this thesis gives recommendations on precise and concise performance evaluation, comparison, and reporting. The developed software library, analysis/benchmarking platform, visualization and calculator/dashboard tools, and datasets were also released online. This research is expected to re-establish and facilitate the classification performance evaluation domain, as well as to contribute towards responsible open research in performance evaluation by encouraging the use of the most robust and objective instruments.
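As a rough illustration of why the choice of metric matters, the sketch below computes accuracy, F1, and the Matthews Correlation Coefficient from the four confusion-matrix counts using their standard textbook definitions. It is not the thesis's released library or benchmarking method; the function name binary_metrics and the example counts are hypothetical, chosen only to show an imbalanced case.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return accuracy, F1, and MCC for one binary confusion matrix.

    Hypothetical helper for illustration only; zero denominators are
    mapped to 0.0 here, which is a common convention but an assumption.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC uses all four cells, so it also reflects the negative class.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical imbalanced test set: 95 positives, 5 negatives.
# The classifier finds most positives but misses 4 of the 5 negatives.
scores = binary_metrics(tp=90, fp=4, fn=5, tn=1)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.91 and F1 is about 0.95, while MCC is only about 0.14, because it penalizes the poor performance on the small negative class. This is one simple example of the kind of disagreement between metrics that the abstract refers to, not a reproduction of the thesis's results.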
Subject Keywords
Information technology
Keywords: Binary-classification, performance evaluation, performance metrics, machine learning, artificial intelligence
URI
http://etd.lib.metu.edu.tr/upload/12623895/index.pdf
https://hdl.handle.net/11511/45061
Collections
Graduate School of Informatics, Thesis
Suggestions
PToPI: A Comprehensive Review, Analysis, and Knowledge Representation of Binary Classification Performance Measures/Metrics
Canbek, Gürol; Taşkaya Temizel, Tuğba; SAĞIROĞLU, ŞEREF (2023-1-01)
Although few performance evaluation instruments have been used conventionally in different machine learning-based classification problem domains, there are numerous ones defined in the literature. This study reviews and describes performance instruments via formally defined novel concepts and clarifies the terminology. The study first highlights the issues in performance evaluation via a survey of 78 mobile-malware classification studies and reviews terminology. Based on three research questions, it propose...
Designing energy-efficient high-precision multi-pass turning processes via robust optimization and artificial intelligence
Khalilpourazari, Soheyl; Khalilpourazary, Saman; ÇİFTÇİOĞLU, AYBİKE ÖZYÜKSEL; Weber, Gerhard Wilhelm (Springer Science and Business Media LLC, 2020-09-01)
This paper suggests a novel robust formulation designed for optimizing the parameters of the turning process in an uncertain environment for the first time. The aim is to achieve the lowest energy consumption and highest precision. With this aim, the current paper considers uncertain parameters, objective functions, and constraints in the offered mathematical model. We proposed several uncertain models and validated the results in real-world case studies. In addition, several artificial intelligence-based s...
Multipath Characteristics of Frequency Diverse Arrays Over a Ground Plane
Cetintepe, Cagri; Demir, Şimşek (Institute of Electrical and Electronics Engineers (IEEE), 2014-07-01)
This paper presents a theoretical framework for an analytical investigation of multipath characteristics of frequency diverse arrays (FDAs), a task which is attempted for the first time in the open literature. In particular, transmitted field expressions are formulated for an FDA over a perfectly conducting ground plane first in a general analytical form, and these expressions are later simplified under reasonable assumptions. The developed formulation is then applied to a uniform, linear, continuous-wave opera...
Stable controller design for T-S fuzzy systems based on Lie algebras
Banks, SP; Gurkan, E; Erkmen, İsmet (Elsevier BV, 2005-12-01)
In this paper, we study the stability of fuzzy control systems of Takagi-Sugeno (T-S) type based on the classical theory of Lie algebras. T-S fuzzy systems are used to model nonlinear systems as a set of rules with consequents of the type x(t) = A(l)x(t) + B(l)u(t). We conduct the stability analysis of such T-S fuzzy models using the Lie algebra LA generated by the A(l) matrices of these subsystems for each rule in the rule base. We first develop our approach of stability analysis for a commuting algebra ...
An experimental study for simulation based assessment of information system design performance
Ayyildiz, Bulent; Akman, Ibrahim; Arifoğlu, Ali (2007-07-04)
This paper presents an experimental study for evaluating the decision support value of queueing network (QN) based simulation models for information system design performance. For illustration, queueing network simulation models have been extracted corresponding to three annotated design alternatives of a selected case study. The design alternatives are produced using logical requirements of the selected system. The performance of each alternative is then predicted using quantifiable parameters considering...
Citation Formats
IEEE
G. Canbek, “Multi-perspective analysis and systematic benchmarking for binary-classification performance evaluation instruments,” Thesis (Ph.D.) -- Graduate School of Natural and Applied Sciences. Information Systems., Middle East Technical University, 2019.