Data Imputation Through the Identification of Local Anomalies

2015-10
Ozkan, Huseyin
Pelvan, Ozgun Soner
Kozat, Suleyman S.
We introduce a comprehensive statistical framework, in a model-free setting, for the complete treatment of localized data corruptions caused by severe noise sources, e.g., an occluder in a visual recording. Within this framework, we propose: 1) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions in a given suspicious data instance and 2) a maximum a posteriori estimator to impute the corrupted data. As a generalization of the Euclidean distance, we also propose a novel distance measure, which is based on the ranked deviations among the data attributes and is empirically shown to be superior in separating the corruptions. Our algorithm first splits the suspicious instance into parts through a binary partitioning tree in the space of data attributes and iteratively tests those parts to detect local anomalies using the nominal statistics extracted from an uncorrupted (clean) reference data set. Once each part is labeled as anomalous or normal, the binary patterns over this tree that characterize corruptions are identified and the affected attributes are imputed. Under a certain conditional independence structure assumed for the binary patterns, we analytically show that the false alarm rate of the introduced algorithm in detecting the corruptions is independent of the data and can be set directly without any parameter tuning. The proposed framework is tested on several well-known machine learning data sets with synthetically generated corruptions and is experimentally shown to produce remarkable improvements in classification performance with strong corruption separation capabilities. Our experiments also indicate that the proposed algorithms outperform the typical approaches and are robust to varying training phase conditions.
IEEE Transactions on Neural Networks and Learning Systems
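
The pipeline described in the abstract (partition the attributes with a binary tree, test each part against clean reference data, then impute the flagged attributes) can be illustrated with a minimal sketch. This is not the authors' implementation, and every function name below is hypothetical. It makes three simplifying assumptions: plain Euclidean distance to the nearest clean instance replaces the paper's ranked-deviation distance, a per-part empirical quantile of leave-one-out scores replaces the analytically derived false alarm rate, and copying values from the nearest clean instance replaces the maximum a posteriori estimator.

```python
import math


def tree_parts(indices, min_size):
    """List every node of a binary partitioning tree over attribute indices."""
    parts = [list(indices)]
    if len(indices) >= 2 * min_size:
        mid = len(indices) // 2
        parts += tree_parts(indices[:mid], min_size)
        parts += tree_parts(indices[mid:], min_size)
    return parts


def part_score(x, reference, part):
    """Euclidean distance from x to its nearest reference instance,
    restricted to the attributes in `part` (assumed anomaly score)."""
    return min(
        math.sqrt(sum((x[j] - r[j]) ** 2 for j in part)) for r in reference
    )


def calibrate(reference, indices, min_size, false_alarm_rate):
    """Assumed calibration: per-part threshold taken as the empirical
    (1 - rate) quantile of leave-one-out scores on the clean data."""
    thresholds = {}
    n = len(reference)
    for part in tree_parts(indices, min_size):
        scores = sorted(
            part_score(reference[i], reference[:i] + reference[i + 1:], part)
            for i in range(n)
        )
        k = min(n - 1, max(0, math.ceil((1 - false_alarm_rate) * n) - 1))
        thresholds[tuple(part)] = scores[k]
    return thresholds


def localize(x, reference, indices, thresholds, min_size):
    """Top-down search: descend only into parts flagged as anomalous and
    return the attribute indices judged to be corrupted."""
    if part_score(x, reference, indices) <= thresholds[tuple(indices)]:
        return []
    if len(indices) < 2 * min_size:
        return list(indices)
    mid = len(indices) // 2
    left = localize(x, reference, indices[:mid], thresholds, min_size)
    right = localize(x, reference, indices[mid:], thresholds, min_size)
    # If neither child is flagged, attribute the anomaly to the whole part.
    return (left + right) or list(indices)


def impute(x, reference, corrupted):
    """Stand-in for the MAP estimator: copy the corrupted attributes from
    the reference instance nearest to x on the uncorrupted attributes."""
    bad = set(corrupted)
    clean = [j for j in range(len(x)) if j not in bad]
    if not bad or not clean:
        return list(x)
    nearest = min(
        reference, key=lambda r: sum((x[j] - r[j]) ** 2 for j in clean)
    )
    return [nearest[j] if j in bad else x[j] for j in range(len(x))]
```

In this sketch, `calibrate` is run once on the clean reference set, after which `localize` and `impute` are applied to each suspicious instance. The target false alarm rate enters only through the quantile used for the thresholds, which mirrors the abstract's claim that the rate can be set directly, though here it holds only empirically and per part.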

Suggestions

Abnormal Crowd Behavior Detection Using Novel Optical Flow-Based Features
Direkoglu, Cem; Sah, Melike; O'Connor, Noel E. (2017-09-01)
In this paper, we propose novel optical flow-based features for abnormal crowd behavior detection. The proposed feature is mainly based on the angle difference computed between the optical flow vectors in the current frame and in the previous frame at each pixel location. The angle difference information is also combined with the optical flow magnitude to produce new, effective and direction-invariant event features. A one-class SVM is utilized to learn normal crowd behavior. If a test sample deviates si...
Streaming Event Detection in Microblogs: Balancing Accuracy and Performance
SAHIN, OZLEM CEREN; Karagöz, Pınar; TATBUL, NESIME (2019-06-14)
In this work, we model the problem of online event detection in microblogs as a stateful stream processing problem and offer a novel solution that balances result accuracy and performance. Our new approach builds on two state-of-the-art algorithms. The first algorithm is based on identifying bursty keywords inside blocks of blog messages. The second one involves clustering blog messages based on the similarity of their contents. To combine the computational simplicity of the keyword-based algorithm with the sem...
Experimental design in the presence of covariates
Avcıoğlu, M Didem; Tiku, Moti Lal; Şenoğlu, Birdal; Department of Statistics (2003)
Experimental design methods have a broad range of application areas. Usually, the basic goal of experimental design methods is to compare the effects of controllable experimental factors on the response and to focus on the one that is most effective. However, in many experimental situations, responses are affected not only by the controllable experimental factors but also by uncontrollable variates, usually called covariates. The main aim in these models is to relate the response to both the control...
3D perceptual soundfield reconstruction via sound field extrapolation
Erdem, Eg; Hacıhabiboğlu, Hüseyin; Department of Multimedia Informatics (2020)
Perceptual sound field reconstruction (PSR) is a spatial audio recording and reproduction method based on the application of stereophonic panning laws in microphone array design. PSR allows rendering a perceptually veridical and stable auditory perspective in the horizontal plane of the listener, and involves recording using near-coincident microphone arrays. This thesis extends the two-dimensional PSR concept to three dimensions and allows reconstructing an arbitrary sound field based on measurements with a...
ONLINE ANOMALY DETECTION WITH CONSTANT FALSE ALARM RATE
Ozkan, Huseyin; Ozkan, Fatih; Delibalta, Ibrahim; KOZAT, SÜLEYMAN SERDAR (2015-09-20)
We propose a computationally highly scalable online anomaly detection algorithm for time series, which achieves, with no parameter tuning, a specified false alarm rate while minimizing the miss rate. The proposed algorithm operates sequentially on fast streaming temporal data, extracts the nominal attributes under possibly varying Markov statistics and then declares an anomaly when the observations are statistically sufficiently deviant. Regardless of whether the source is stationary or non-stationary, o...
Citation Formats
H. Ozkan, O. S. Pelvan, and S. S. Kozat, “Data Imputation Through the Identification of Local Anomalies,” IEEE Transactions on Neural Networks and Learning Systems, pp. 2381–2395, 2015, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/52340.