K-SVMeans: A hybrid clustering algorithm for multi-type interrelated datasets
Date
2007-01-01
Author
Bolelli, Levent
Ertekin Bolelli, Şeyda
Zhou, Ding
Giles, C. Lee
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Identifying distinct clusters of documents in text collections has traditionally been addressed under the assumption that data instances can only be represented by homogeneous and uniform features. Many real-world datasets, on the other hand, comprise multiple types of heterogeneous, interrelated components, such as web pages and hyperlinks, or online scientific publications with their authors and publication venues, to name a few. In this paper, we present K-SVMeans, a clustering algorithm for multi-type interrelated datasets that integrates the well-known K-Means clustering with the highly popular Support Vector Machines. Experimental results on authorship analysis of two real-world web-based datasets show that K-SVMeans can successfully discover topical clusters of documents and achieve better clustering solutions than homogeneous data clustering.
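The abstract does not spell out how K-Means and the SVM interact, so the following is only an illustrative sketch of one way the two could be coupled, not the paper's algorithm. It assumes each document has exactly one author, that authors carry their own feature vectors (the second data type), and that a fixed penalty is added to the K-Means distance whenever a per-cluster linear SVM over the author features disagrees with placing a document in that cluster. The names `hybrid_cluster`, `train_linear_svm`, `penalty`, and the toy data are all hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    # Pegasos-style stochastic subgradient descent on the regularized
    # hinge loss; labels y are +1 / -1, no bias term (assumption).
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            violated = y[i] * dot(w, X[i]) < 1
            w = [(1 - eta * lam) * wj for wj in w]
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def hybrid_cluster(docs, doc_author, author_feats, k, iters=10, penalty=100.0):
    n = len(docs)
    # Deterministic start: evenly spaced documents serve as initial centroids.
    centroids = [list(docs[i * n // k]) for i in range(k)]
    assign = [min(range(k), key=lambda c: sq_dist(d, centroids[c])) for d in docs]
    for _ in range(iters):
        # 1. Label each author with the cluster holding most of their documents.
        counts = [[0] * k for _ in author_feats]
        for a, c in zip(doc_author, assign):
            counts[a][c] += 1
        author_cluster = [row.index(max(row)) for row in counts]
        # 2. Train one one-vs-rest linear SVM per cluster on the author features.
        svms = [
            train_linear_svm(author_feats,
                             [1 if ac == c else -1 for ac in author_cluster])
            for c in range(k)
        ]
        # 3. Reassign documents: K-Means distance plus a penalty when the
        #    cluster's SVM does not score the document's author positively.
        def cost(i, c):
            disagrees = dot(svms[c], author_feats[doc_author[i]]) <= 0
            return sq_dist(docs[i], centroids[c]) + (penalty if disagrees else 0.0)
        new_assign = [min(range(k), key=lambda c: cost(i, c)) for i in range(n)]
        # 4. Recompute centroids; an empty cluster keeps its previous centroid.
        for c in range(k):
            members = [docs[i] for i in range(n) if new_assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_assign == assign:
            break
        assign = new_assign
    return assign

# Toy data (hypothetical): two tight groups plus one borderline document
# at [6, 6] that is written by the author of the first group.
docs = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [6, 6]]
doc_author = [0, 0, 0, 1, 1, 1, 0]
author_feats = [[1.0, 0.0], [0.0, 1.0]]

with_svm = hybrid_cluster(docs, doc_author, author_feats, k=2)
without_svm = hybrid_cluster(docs, doc_author, author_feats, k=2, penalty=0.0)
print(with_svm, without_svm)
```

On this toy input, setting `penalty=0.0` reduces the procedure to plain K-Means and the borderline document follows its nearest centroid, while the default penalty pulls it into the cluster of its author's other documents. The penalty size and the majority-vote labeling of authors are design choices of this sketch and would need tuning on real data.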
Subject Keywords
Clustering algorithms, Support vector machines, Web pages, Web sites, Support vector machine classification, Supervised learning, Computer science, Data engineering, Educational institutions, Information services
URI
https://hdl.handle.net/11511/45595
DOI
https://doi.org/10.1109/wi.2007.61
Collections
Department of Computer Engineering, Conference / Seminar
Suggestions
K-way partitioning of signed bipartite graphs
Ömeroğlu, Nurettin Burak; Toroslu, İsmail Hakkı; Department of Computer Engineering (2012)
Clustering is the process in which data is differentiated and classified according to some criteria. As a result of the partitioning process, data is grouped into clusters for a specific purpose. In a social network, clustering of people is one of the most popular problems. Therefore, we mainly concentrated on finding an efficient algorithm for this problem. In our study, data is made up of two types of entities (e.g., people, groups vs. political issues, religious beliefs) and distinct from most previous works, sig...
An access structure for similarity-based fuzzy databases
Yazıcı, Adnan (Elsevier BV, 1999-04-01)
A significant effort has been made in representing imprecise information in database models by using fuzzy set theory. However, research on access structures that handle fuzzy querying effectively is still at an immature stage. Fuzzy querying involves more complex processing than ordinary querying does. Additionally, fuzzy conditions may select a larger number of tuples than crisp ones. It is obvious that the need for fast response time becomes very important...
Cluster based model diagnostic for logistic regression
Tanju, Özge; Kalaylıoğlu Akyıldız, Zeynep Işıl; Department of Statistics (2016)
Model selection methods are commonly used to identify the best approximation that explains the data. Existing methods are generally based on information theory, such as the Akaike Information Criterion (AIC), corrected Akaike Information Criterion (AICc), Consistent Akaike Information Criterion (CAIC), and Bayesian Information Criterion (BIC). These criteria do not depend on any modeling purpose. In this thesis, we propose a new method for logistic regression model selection where the modeling purpose is c...
Multisource region attention network for fine-grained object recognition in remote sensing imagery
Sümbül, Gencer; Cinbiş, Ramazan Gökberk; Aksoy, Selim (Institute of Electrical and Electronics Engineers (IEEE), 2019-07)
Fine-grained object recognition concerns the identification of the type of an object among a large number of closely related subcategories. Multisource data analysis that aims to leverage the complementary spectral, spatial, and structural information embedded in different sources is a promising direction toward solving the fine-grained recognition problem that involves low between-class variance, small training set sizes for rare classes, and class imbalance. However, the common assumption of coregistered ...
A clustering method for web data with multi-type interrelated components
Bolelli, Levent; Ertekin Bolelli, Şeyda; Zhou, Ding; Giles, C Lee (2007-05-08)
Traditional clustering algorithms work on "flat" data, making the assumption that the data instances can only be represented by a set of homogeneous and uniform features. Many real-world datasets, however, are heterogeneous in nature, comprising multiple types of interrelated components. We present a clustering algorithm, K-SVMeans, that integrates the well-known K-Means clustering with the highly popular Support Vector Machines (SVM) in order to utilize the richness of data. Our experimental results on author...
Citation Formats
IEEE
L. Bolelli, Ş. Ertekin Bolelli, D. Zhou, and C. L. Giles, “K-SVMeans: A hybrid clustering algorithm for multi-type interrelated datasets,” 2007. [Online]. Available: https://hdl.handle.net/11511/45595.