Density-Based and parameterless clustering of embedded data streams

2021-09-09
Poyraz, Özlem
With the accelerating digitalization of the world, the volume of high-speed data being produced is growing rapidly, and it is difficult to store such a data stream and process it as a whole. This creates the need to process the data as soon as it arrives, without storing the stream. In most cases, no prior information about the data is available. Additionally, the characteristics of a data stream may change over time; this phenomenon is called concept drift. Since clustering does not require ground-truth labels, it is well suited to data streams. Clustering algorithms for data streams should read the data only once, work in real time, and adapt to concept drift. The Density-Based and Parameterless Clustering of Embedded Data Streams (DBPCES) algorithm developed in this study embeds data streams into two dimensions and clusters them with a parameterless density-based clustering algorithm. To embed the data stream into two dimensions, the UMAP algorithm was adapted to handle data streams and concept drift. For clustering, the DBSCAN algorithm was applied to the embedded data points. The DBSCAN parameters were estimated with a heuristic so that the data stream can be clustered without requiring any data-dependent parameters from the user. DBPCES was run on synthetic and real data streams that differ in true cluster count, dimensionality, and concept drift rate. Its performance was compared with DenStream and with the implementation of Zubaroğlu and Atalay. The adjusted Rand index, purity, and the silhouette coefficient were used as evaluation metrics, and execution times were also compared. Although DBPCES was not as fast as DenStream, it achieved results similar to those of the other algorithms.
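Two of the evaluation metrics named in the abstract, purity and the adjusted Rand index, can be computed from label counts alone. The following is a minimal sketch using the standard textbook definitions, not the thesis's own evaluation code; the function names `purity` and `adjusted_rand_index` and the toy label lists are illustrative assumptions. The silhouette coefficient is left out because it also needs the point coordinates.

```python
from collections import Counter
from math import comb


def _check(true_labels, pred_labels):
    # Both label sequences must describe the same, non-trivial set of points.
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) < 2:
        raise ValueError("at least two points are required")


def purity(true_labels, pred_labels):
    """Fraction of points that fall in the majority true class of their cluster."""
    _check(true_labels, pred_labels)
    pair_counts = Counter(zip(pred_labels, true_labels))
    majority = {}
    for (cluster, _true_class), count in pair_counts.items():
        majority[cluster] = max(majority.get(cluster, 0), count)
    return sum(majority.values()) / len(true_labels)


def adjusted_rand_index(true_labels, pred_labels):
    """Rand index corrected for chance (standard pair-counting formula)."""
    _check(true_labels, pred_labels)
    n = len(true_labels)
    # Pairs of points that share both a true class and a predicted cluster.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    # Pairs that share a true class, and pairs that share a predicted cluster.
    sum_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case: both partitions are a single cluster or all singletons.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy labels, made up for illustration only.
    truth = [0, 0, 0, 1, 1, 1]
    found = [0, 0, 1, 1, 2, 2]
    print("purity:", purity(truth, found))
    print("ARI:", adjusted_rand_index(truth, found))
```

Purity rewards over-segmentation (it reaches 1 when every point is its own cluster), which is one reason it is usually reported together with a chance-corrected measure such as the adjusted Rand index.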

Citation Formats
Ö. Poyraz, “Density-Based and parameterless clustering of embedded data streams,” M.S. - Master of Science, Middle East Technical University, 2021.