Centroid based Tree-Structured Data Clustering Using Vertex/Edge Overlap and Graph Edit Distance

Date

2020-06-01

Author

DİNLER, DERYA
Tural, Mustafa Kemal
Özdemirel, Nur Evin

Metadata

Show full item record

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Item Usage Stats

167
views

0
downloads

We consider a clustering problem in which the data objects are rooted m-ary trees with known node correspondence. We assume that the nodes of the trees are unweighted, but the edges can be unweighted or weighted. We measure the similarity and distance between two trees using vertex/edge overlap (VEO) and graph edit distance (GED), respectively. For both measures, we first study the problem of finding a centroid tree of a given cluster of trees in both the unweighted and weighted edge cases. We compute the optimal centroid tree of a given cluster for all measures except the weighted VEO for which a heuristic is developed. We then propose k-means based algorithms that repeat cluster assignment and centroid update steps until convergence. The initial centroid trees are constructed based on the properties of the data. The assignment steps utilize unweighted or weighted versions of VEO or GED to assign each tree to the most similar centroid tree. In the update steps, each centroid tree is updated by considering the trees assigned to it. The proposed algorithms are compared with the traditional k-modes and k-means on randomly generated datasets and shown to be more effective and robust (to outliers) in separating trees into clusters. We also apply our algorithms on a real world brain artery data and show that the previously observed age and sex effects on brain artery structures can be revealed better by means of clustering with our algorithms than the traditional k-modes and k-means.

Subject Keywords

Management Science and Operations Research, General Decision Sciences

URI

https://hdl.handle.net/11511/42792

Journal

ANNALS OF OPERATIONS RESEARCH

DOI

https://doi.org/10.1007/s10479-019-03505-7

Collections

Department of Industrial Engineering, Article

Suggestions

OpenMETU
Core

An approximation for kanban controlled assembly systems TOPAN, Engin; Avşar, Zeynep Müge (Springer Science and Business Media LLC, 2011-01-01) An approximation is proposed to evaluate the steady-state performance of kanban controlled two-stage assembly systems. The development of the approximation is as follows. The considered continuous-time Markov chain is aggregated keeping the model exact, and this aggregate model is approximated replacing some state-dependent transition rates with constant rates. The approximate aggregate model is, then, decomposed into submodels and a product-form steady-state distribution is obtained for each submodel. Fina...
Anti-periodic solutions for state-dependent impulsive recurrent neural networks with time-varying and continuously distributed delays Sayli, Mustafa; YILMAZ, ENES (Springer Science and Business Media LLC, 2017-11-01) In this paper, we address a new model of neural networks related to the impulsive phenomena which is called state-dependent impulsive recurrent neural networks with time-varying and continuously distributed delays. We investigate sufficient conditions on the existence and uniqueness of exponentially stable anti-periodic solution for these neural networks by employing method of coincide degree theory and an appropriate Lyapunov function. Moreover, we present an illustrative example to show the effectiveness ...
An interactive algorithm to find the most preferred solution of multi-objective integer programs LOKMAN, BANU; Köksalan, Mustafa Murat; Korhonen, Pekka J.; Wallenius, Jyrki (Springer Science and Business Media LLC, 2016-10-01) In this paper, we develop an interactive algorithm that finds the most preferred solution of a decision maker (DM) for multi-objective integer programming problems. We assume that the DM's preferences are consistent with a quasiconcave value function unknown to us. Based on the properties of quasiconcave value functions and pairwise preference information obtained from the DM, we generate constraints to restrict the implied inferior regions. The algorithm continues iteratively and guarantees to find the mos...
Neural network calibrated stochastic processes: forecasting financial assets Giebel, Stefan; Rainer, Martin (Springer Science and Business Media LLC, 2013-03-01) If a given dynamical process contains an inherently unpredictable component, it may be modeled as a stochastic process. Typical examples from financial markets are the dynamics of prices (e.g. prices of stocks or commodities) or fundamental rates (exchange rates etc.). The unknown future value of the corresponding stochastic process is usually estimated as the expected value under a suitable measure, which may be determined from distribution of past (historical) values. The predictive power of this estimati...
Stochastic differential games for optimal investment problems in a Markov regime-switching jump-diffusion market Savku, E.; Weber, Gerhard Wilhelm (Springer Science and Business Media LLC, 2020-08-01) We apply dynamic programming principle to discuss two optimal investment problems by using zero-sum and nonzero-sum stochastic game approaches in a continuous-time Markov regime-switching environment within the frame work of behavioral finance. We represent different states of an economy and, consequently, investors' floating levels of psychological reactions by aD-state Markov chain. The first application is a zero-sum game between an investor and the market, and the second one formulates a nonzero-sum sto...

Citation Formats

D. DİNLER, M. K. Tural, and N. E. Özdemirel, “Centroid based Tree-Structured Data Clustering Using Vertex/Edge Overlap and Graph Edit Distance,” ANNALS OF OPERATIONS RESEARCH, pp. 85–122, 2020, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/42792.