Online embedding and clustering of evolving data streams
Date
2022-07-01
Author
Zubaroglu, Alaettin
Atalay, Mehmet Volkan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The number of connected devices is steadily increasing, and this trend is expected to continue in the near future. Connected devices continuously generate data streams, which may often be high dimensional and contain concept drift. Clustering is one of the most suitable methods for real-time data stream processing, since it can be applied with less prior information about the data. Data embedding also makes it possible to visualize high-dimensional data and may simplify the clustering process. Several data stream clustering algorithms exist in the literature; however, no data stream embedding method exists. Uniform Manifold Approximation and Projection (UMAP) is a data embedding algorithm that is suitable for stationary (stable) data streams, but it cannot adapt to concept drift. In this study, we describe a novel method, EmCStream, that applies UMAP to evolving (nonstationary) data streams, detects and adapts to concept drift, and clusters the embedded data instances using a distance-based or partitioning-based clustering algorithm. We have evaluated EmCStream against state-of-the-art stream clustering algorithms using both synthetic and real data streams containing concept drift. EmCStream outperforms DenStream and CluStream in terms of clustering quality on both synthetic and real evolving data streams.
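The general idea of embedding a stream window by window, clustering the embedded instances with a partitioning-based algorithm, and re-fitting when drift is suspected can be illustrated with a minimal sketch. This is not the authors' EmCStream implementation. Every name below (`cluster_stream`, `embed`, `drift_factor`) is hypothetical, the embedding defaults to the identity function where the paper uses UMAP, k-means with farthest-first initialization stands in for the clustering step, and the drift rule (mean distance to the nearest centroid growing beyond a multiple of its value at the last fit) is an assumption made only for illustration.

```python
import math


def nearest(p, centroids):
    """Return (index, distance) of the centroid closest to point p."""
    dists = [math.dist(p, c) for c in centroids]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]


def mean_error(points, centroids):
    """Mean distance from each point to its nearest centroid."""
    return sum(nearest(p, centroids)[1] for p in points) / len(points)


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)[0]].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def cluster_stream(windows, k, embed=lambda w: w, drift_factor=3.0):
    """Cluster a stream window by window; re-fit when drift is suspected.

    embed: hypothetical hook mapping a window to low-dimensional points
           (identity here; the paper uses UMAP for this step).
    drift_factor: assumed threshold, not a value taken from the paper.
    Returns one (labels, drift_flag) pair per window.
    """
    centroids, baseline, results = None, None, []
    for window in windows:
        emb = embed(window)
        drift = (
            centroids is not None
            and mean_error(emb, centroids) > drift_factor * baseline
        )
        if centroids is None or drift:
            # First window, or suspected drift: fit a fresh model.
            centroids = kmeans(emb, k)
            baseline = mean_error(emb, centroids)
        labels = [nearest(p, centroids)[0] for p in emb]
        results.append((labels, drift))
    return results
```

Calling `cluster_stream(windows, k=2)` on a list of windows, each a list of coordinate tuples, returns the cluster labels for every window together with a flag indicating whether that window triggered a re-fit. Passing a real embedding function as `embed` would bring the sketch closer to the setting described in the abstract.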
Subject Keywords
data streams, drift adaptation, drift detection, evolving data streams, stream clustering
URI
https://hdl.handle.net/11511/100171
Journal
STATISTICAL ANALYSIS AND DATA MINING
DOI
https://doi.org/10.1002/sam.11590
Collections
Department of Computer Engineering, Article
Citation Formats
IEEE
A. Zubaroglu and M. V. Atalay, “Online embedding and clustering of evolving data streams,” STATISTICAL ANALYSIS AND DATA MINING, 2022. [Online]. Available: https://hdl.handle.net/11511/100171.