A clustering method for web data with multi-type interrelated components

2007-05-08
Bolelli, Levent
Ertekin Bolelli, Şeyda
Zhou, Ding
Giles, C Lee
Traditional clustering algorithms work on "flat" data, assuming that data instances can only be represented by a set of homogeneous and uniform features. Much real-world data, however, is heterogeneous in nature, comprising multiple types of interrelated components. We present a clustering algorithm, K-SVMeans, that integrates the well-known K-Means clustering with the highly popular Support Vector Machines (SVM) in order to utilize the richness of data. Our experimental results on authorship analysis of scientific publications show that K-SVMeans achieves better clustering performance than homogeneous data clustering.

Suggestions

A methodology of swarm intelligence application in clustering based on neighborhood construction
İnkaya, Tülin; Kayalıgil, Sinan; Özdemirel, Nur Evin; Department of Industrial Engineering (2011)
In this dissertation, we consider the clustering problem in data sets with an unknown number of clusters having arbitrary shapes and intracluster and intercluster density variations. We introduce a clustering methodology composed of three methods that ensure extraction of local density and connectivity properties, data set reduction, and clustering. The first method constructs a unique neighborhood for each data point using the connectivity and density relations among the points based upon the graph the...
A new anisotropic perfectly matched layer medium for mesh truncation in finite difference time domain analysis
Tong, MS; Chen, YC; Kuzuoğlu, Mustafa; Mittra, R (1999-09-01)
In this paper an unsplit anisotropic perfectly matched layer (PML) medium, previously utilized in the context of finite element analysis, is implemented in the finite difference time domain (FDTD) algorithm. The FDTD anisotropic PML is easy to implement in the existing FDTD codes, and is well suited for truncating inhomogeneous and layered media without special treatment required in the conventional PML approach. A further advantage of the present approach is improved performance at lower frequencies. The a...
A Probabilistic approach to sparse multi scale phase based stereo
ULUSOY PARNAS, İLKAY; Halıcı, Uğur; HANCOCK, EDWIN (2004-04-30)
In this study, a multi-scale phase based sparse disparity algorithm and a probabilistic model for matching are proposed. The disparity algorithm and the probabilistic approach are verified on various stereo image pairs.
A Proposed Methodology for Evaluating HDR False Color Maps
Akyüz, Ahmet Oğuz (Association for Computing Machinery (ACM), 2016-08-01)
Color mapping, which involves assigning colors to the individual elements of an underlying data distribution, is a commonly used method for data visualization. Although color maps are used in many disciplines and for a variety of tasks, in this study we focus on their use for visualizing luminance maps. Specifically, we ask how best to visualize a luminance distribution encoded in a high-dynamic-range (HDR) image using false colors such that the resulting visualization is the most ...
A Graph-Based Concept Discovery Method for n-Ary Relations
Abay, Nazmiye Ceren; MUTLU, ALEV; Karagöz, Pınar (2015-09-04)
Concept discovery is a multi-relational data mining task for inducing definitions of a specific relation in terms of other relations in the data set. Such learning tasks usually have to deal with large search spaces and hence have efficiency and scalability issues. In this paper, we present a hybrid approach that combines association rule mining methods and graph-based approaches to cope with these issues. The proposed method inputs the data in relational format, converts it into a graph representation, and...
Citation Formats
L. Bolelli, Ş. Ertekin Bolelli, D. Zhou, and C. L. Giles, “A clustering method for web data with multi-type interrelated components,” 2007, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/69643.