A Framework to Detect Disguised Missing Data
Date
2011-01-01
Author
Belen, Rahime
Taşkaya Temizel, Tuğba
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Many manually populated, very large databases suffer from data quality problems such as missing or inaccurate data and duplicate entries. A recently recognized data quality problem is disguised missing data, which arises when no explicit code for missing data, such as NA (Not Available), is provided and a legitimate data value is used instead. The presence of these values can severely affect the outcome of data mining tasks: association mining algorithms may produce biased or inaccurate association rules, and clustering techniques may produce invalid clusters. Detecting and eliminating these values is necessary but burdensome to carry out manually. This chapter first explains methods for detecting disguised missing values by visual inspection. The authors then describe methods for detecting these values automatically. Finally, a framework to detect disguised missing data is proposed and demonstrated on spatial and categorical data sets.
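To make the idea of a disguised missing value concrete, the sketch below flags values in a single column that occur far more often than a typical value, for example a default date entered whenever the real one was unknown. This is only an illustrative frequency heuristic and not the framework proposed in the chapter; the function name `flag_suspicious_values`, the `factor` and `min_count` thresholds, and the sample data are all assumptions made for this example.

```python
from collections import Counter
from statistics import median


def flag_suspicious_values(values, factor=3.0, min_count=3):
    """Return values that are unusually frequent in a column.

    Hypothetical heuristic: a value is flagged when it occurs at least
    `min_count` times and at least `factor` times as often as the median
    value in the column. Flagged values are only candidates for disguised
    missing data and still need review by someone who knows the data.
    """
    counts = Counter(values)
    if len(counts) < 2:
        # With a single distinct value there is nothing to compare against.
        return []
    typical = median(counts.values())
    return sorted(
        value
        for value, count in counts.items()
        if count >= min_count and count >= factor * typical
    )


# Invented sample: one default date stands in for unknown birth dates.
birth_dates = ["1970-01-01"] * 6 + [
    "1984-03-12",
    "1991-07-30",
    "1979-11-02",
    "1988-05-21",
]
suspects = flag_suspicious_values(birth_dates)
print(suspects)
```

A frequency check like this cannot tell a disguised missing value apart from a value that is genuinely common, so its output is best treated as a shortlist for inspection rather than a list of values to delete.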
URI
https://hdl.handle.net/11511/70046
Journal
KNOWLEDGE DISCOVERY PRACTICES AND EMERGING APPLICATIONS OF DATA MINING: TRENDS AND NEW DOMAINS
DOI
https://doi.org/10.4018/978-1-60960-067-9.ch001
Collections
Graduate School of Informatics, Article
Suggestions
An ilp-based concept discovery system for multi-relational data mining
Kavurucu, Yusuf; Karagöz, Pınar; Department of Computer Engineering (2009)
Multi Relational Data Mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. However, as patterns involve multiple relations, the search space of possible hypotheses becomes intractably complex. In order to cope with this problem, several relational knowledge discovery systems have been developed employing various search strategies, heuristics and language pattern limitations. In this thesis, Induct...
An Improved graph mining tool and its application to object detection in remote sensing
Aktaş, Ümit Ruşen; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2013)
In many graph-based data mining tools, the use of numeric values as attributes in graphs is very limited. Most algorithms require pre-processing of the attributes, which often involves discretization into bins and embedding group names in the input graph(s). In this thesis, we tackle this problem by utilizing all attributes as is, and directly incorporating them into the pattern mining process. In order to implement our method, we modify an existing graph-based knowledge discovery algorithm, SUBDUE, by addi...
A Content-Boosted Collaborative Filtering Approach for Movie Recommendation Based on Local and Global Similarity and Missing Data Prediction
Özbal, Gözde; Karaman, Hilal; Alpaslan, Ferda Nur (Oxford University Press (OUP), 2011-09-01)
Most traditional recommender systems lack accuracy in the case where data used in the recommendation process is sparse. This study addresses the sparsity problem and aims to get rid of it by means of a content-boosted collaborative filtering approach applied to a web-based movie recommendation system. The main motivation is to investigate whether further success can be obtained by combining 'local and global user similarity' and 'effective missing data prediction' approaches, which were previously introduce...
A Methodology to develop process ontology from organizational guidelines written in natural language
Gürbüz, Özge; Demirörs, Onur; Department of Information Systems (2017)
Integrating ontologies with process modeling improves data representations and makes it easier to query, store and reuse processes at the semantics level. Therefore, in recent years, this topic has become increasingly popular. The studies in the literature have proposed methods for the integration process either to relate domain ontologies to process models or to transform process models to process ontologies. Another way to establish the integration between ontologies and process models is to develop proce...
A Content Boosted Collaborative Filtering Approach for Movie Recommendation Based on Local Global Similarity and Missing Data Prediction
Özbal, Gözde; Kahraman, Hilal; Alpaslan, Ferda Nur (2010-09-22)
Many recommender systems lack accuracy when the data used throughout the recommendation process is sparse. Our study addresses this limitation by means of a content boosted collaborative filtering approach applied to the task of movie recommendation. We combine two different approaches previously proven to be successful individually and improve on them by processing the content information of movies, as confirmed by our empirical evaluation results.
Citation Formats
IEEE
R. Belen and T. Taşkaya Temizel, “A Framework to Detect Disguised Missing Data,” KNOWLEDGE DISCOVERY PRACTICES AND EMERGING APPLICATIONS OF DATA MINING: TRENDS AND NEW DOMAINS, pp. 1–22, 2011, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/70046.