Independently weighted value difference metric

Date

2017-10-01

Author

Ortakaya, Ahmet Fatih

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Most difference metrics used in categorical classification algorithms ignore the dependence structure among attributes. Some even rely on strong attribute-independence assumptions that are unrealistic for many real-world datasets. In addition, these metrics do not account for how important each attribute is for the class variable. This paper proposes a new difference metric, the Independently Weighted Value Difference Metric (IWVDM), which includes an embedded Incremental Feature Selection (IFS) phase. The proposed metric does not require attribute independence, and it weights each attribute according to the information it carries about the class variable. IWVDM is compared with the Overlap Metric (OM), the Value Difference Metric (VDM), and the Frequency Difference Metric (FDM) in a series of experiments on 30 UCI benchmark datasets. The experimental results show that IWVDM outperforms all three metrics.
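For orientation, two of the baseline metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration using the commonly cited textbook forms of OM and VDM, not code from the paper, and it does not implement IWVDM or FDM, whose definitions are not given on this page. The function names, the exponent parameter `q`, and the toy data are assumptions made for the example.

```python
from collections import Counter, defaultdict


def overlap_distance(x, y):
    """Overlap Metric: number of attributes on which x and y disagree."""
    return sum(a != b for a, b in zip(x, y))


def fit_vdm(rows, labels):
    """Estimate P(class | attribute index, value) from training data."""
    counts = defaultdict(Counter)  # (attribute index, value) -> class counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(i, value)][label] += 1
    classes = sorted(set(labels))
    probs = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        probs[key] = {c: counter[c] / total for c in classes}
    return probs, classes


def vdm_distance(x, y, probs, classes, q=1):
    """Textbook VDM: per-attribute differences of class-conditional
    probabilities, summed over attributes (each attribute is treated
    separately, which is the independence assumption at issue)."""
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        pa = probs.get((i, a), {})  # unseen values fall back to probability 0
        pb = probs.get((i, b), {})
        total += sum(abs(pa.get(c, 0.0) - pb.get(c, 0.0)) ** q for c in classes)
    return total


# Toy data (hypothetical): colour determines the class, size carries no information.
rows = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
labels = ["yes", "yes", "no", "no"]
probs, classes = fit_vdm(rows, labels)

print(overlap_distance(("red", "small"), ("red", "large")))            # 1
print(vdm_distance(("red", "small"), ("red", "large"), probs, classes))  # 0.0
print(vdm_distance(("red", "small"), ("blue", "large"), probs, classes)) # 2.0
```

In this toy case OM counts the uninformative size attribute as a full mismatch, while VDM assigns it zero distance because both sizes have the same class distribution. Neither baseline applies an explicit per-attribute weight, which is the gap the abstract says IWVDM addresses.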

Subject Keywords

Categorical classification, Independently weighted value difference metric, Incremental feature selection, Attribute weighting, Attribute independence

URI

https://hdl.handle.net/11511/63408

Journal

Pattern Recognition Letters

DOI

https://doi.org/10.1016/j.patrec.2017.07.009

Collections

Department of Statistics, Article

Citation Formats

A. F. Ortakaya, “Independently weighted value difference metric,” Pattern Recognition Letters, pp. 61–68, 2017. [Online]. Available: https://hdl.handle.net/11511/63408.