Developing novel methods on data stream classification and clustering for accuracy improvement
Date
2024-9
Author
Maden, Engin
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Streaming data from sources such as social media, telecommunication networks, and credit card processing accumulate and grow enormously, so extracting valuable information from such big data environments has become increasingly important. Data streams have specific characteristics: continuous flow, high volume, rapid arrival, and changing distribution. These characteristics constrain how streams can be processed: resources and time are limited, and the data can be scanned only once. Data stream mining addresses this setting with streaming versions of traditional data mining operations such as clustering and classification. This study addresses data stream classification and short text stream clustering, a specific area of data stream clustering. Enhancements and novel methods are proposed, and their performance is compared with state-of-the-art methods. For data stream classification, the proposed methods are m-kNN (Mean Extended kNN) and the CSWB (Combined Sliding Window Based) classifier, which combines m-kNN and MC-NN (Micro Cluster Nearest Neighbour). Two further versions of CSWB are also presented, CSWB-e and CSWB-e2, in which the m-kNN classifier is combined with K* (K-Star) and C4.5, and with K* (K-Star) and Naive Bayes, respectively. For short text stream clustering, a method named T-GSC (A Two Level Graph Based Short Text Stream Clusterer) is proposed. A survey of current methods in short text stream clustering is also presented, classifying them by clustering approach.
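To make the stream setting in the abstract concrete, here is a minimal sketch of a generic sliding-window kNN classifier evaluated in a single pass (each instance is predicted first, then used for training). It is not the thesis's m-kNN or CSWB method, whose details the abstract does not give; the names `SlidingWindowKNN` and `prequential_accuracy`, the Euclidean distance, and the default parameters are all assumptions chosen for illustration.

```python
from collections import Counter, deque
import math


class SlidingWindowKNN:
    """Generic kNN over the most recent labelled instances (illustrative only)."""

    def __init__(self, k=3, window_size=100):
        self.k = k
        # A bounded deque keeps memory fixed: the oldest instance is
        # evicted automatically once the window is full.
        self.window = deque(maxlen=window_size)

    def predict(self, x):
        if not self.window:
            return None  # nothing has been seen yet
        # Rank stored instances by Euclidean distance to x and keep the k nearest.
        nearest = sorted(self.window, key=lambda item: math.dist(item[0], x))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

    def learn(self, x, y):
        self.window.append((tuple(x), y))


def prequential_accuracy(stream, model):
    """Single pass over the stream: predict each instance, then train on it."""
    correct = total = 0
    for x, y in stream:
        pred = model.predict(x)
        if pred is not None:
            total += 1
            correct += int(pred == y)
        model.learn(x, y)
    return correct / total if total else 0.0
```

Because the window is bounded, instances from an outdated distribution eventually drop out, which is one simple way a classifier can follow the change of distribution mentioned above while respecting the scan-once and limited-memory constraints.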
Subject Keywords
Data stream classification, Sliding window, Hybrid classifier, Short text stream clustering, Word relation network
URI
https://hdl.handle.net/11511/111446
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation
E. Maden, “Developing novel methods on data stream classification and clustering for accuracy improvement,” Ph.D. - Doctoral Program, Middle East Technical University, 2024.