Reducing Features to Improve Link Prediction Performance in Location Based Social Networks, Non-Monotonically Selected Subset from Feature Clusters

2019-01-01
Bayrak, Ahmet Engin
Polat, Faruk
In most cases, the feature sets available to machine learning algorithms require feature engineering to pick the subset that gives optimal performance. During our link prediction research, we observed the same challenge with the features of Location Based Social Networks (LBSNs). We applied multiple reduction approaches to avoid performance issues caused by redundancy and relevance interactions between features. One of these was a custom two-step method: it starts by clustering features based on a proposed interaction-related similarity measure and ends by non-monotonically selecting the optimal feature subset from those clusters. In this study, we applied well-known generic feature reduction algorithms alongside our custom method for LBSNs to evaluate its novelty and verify its contributions. Results from multiple data groups show that our custom feature reduction approach yields larger and more stable improvements in link prediction effectiveness than the other approaches.
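The two-step idea described above (cluster the features, then select a subset across the clusters) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: absolute Pearson correlation stands in for the interaction-related similarity measure, which the abstract does not define; in-sample R^2 of a least-squares fit stands in for link prediction performance; "non-monotonic" is read as a floating search in which the subset may both grow and shrink; and at most one feature is taken from each cluster. The function names `cluster_features`, `score`, and `select_from_clusters` are hypothetical.

```python
import numpy as np


def cluster_features(X, threshold=0.9):
    """Greedily group the columns of X by absolute Pearson correlation.

    ASSUMPTION: correlation is a stand-in for the paper's
    interaction-related similarity measure.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    clusters = []
    for j in range(X.shape[1]):
        for members in clusters:
            # Compare against the cluster's first member only.
            if corr[j, members[0]] >= threshold:
                members.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def score(X, y, subset):
    """In-sample R^2 of a least-squares fit (placeholder quality metric)."""
    if not subset:
        return 0.0
    A = np.column_stack([X[:, subset], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(1.0 - (y - A @ coef).var() / y.var())


def select_from_clusters(X, y, clusters, min_gain=1e-3):
    """Floating search: the subset may grow (forward) and shrink (backward).

    ASSUMPTION: at most one feature is taken from each cluster.
    """
    selected, best = [], 0.0
    changed = True
    while changed:
        changed = False
        # Forward step: consider features from clusters not yet represented.
        candidates = [f for members in clusters
                      if not set(members) & set(selected) for f in members]
        if candidates:
            s, f = max((score(X, y, selected + [f]), f) for f in candidates)
            if s - best > min_gain:
                selected, best, changed = selected + [f], s, True
        # Backward step: drop a feature if removing it costs almost nothing.
        for f in list(selected):
            rest = [g for g in selected if g != f]
            if rest:
                s = score(X, y, rest)
                if best - s < min_gain / 2:
                    selected, best, changed = rest, s, True
    return sorted(selected), best
```

A call such as `select_from_clusters(X, y, cluster_features(X))` returns the chosen column indices together with their score. For an actual link prediction task, the placeholder `score` would be replaced by a cross-validated metric of the chosen classifier.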

Citation Formats
A. E. Bayrak and F. Polat, “Reducing Features to Improve Link Prediction Performance in Location Based Social Networks, Non-Monotonically Selected Subset from Feature Clusters,” 2019, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/40730.