An ILP-based concept discovery system for multi-relational data mining

2009
Kavurucu, Yusuf
Multi-relational data mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency to store data in relational databases. However, as patterns involve multiple relations, the search space of possible hypotheses becomes intractably complex. To cope with this problem, several relational knowledge discovery systems have been developed, employing various search strategies, heuristics and pattern language limitations. In this thesis, Inductive Logic Programming (ILP) based concept discovery is studied, and two systems based on a hybrid methodology employing ILP and APRIORI, namely Confidence-based Concept Discovery and Concept Rule Induction System, are proposed. In both systems, the main aim is to relax the strong declarative biases and user-defined specifications. Moreover, the new method works directly on relational databases. The traditional definition of confidence from the relational database perspective is also modified to express the Closed World Assumption in first-order logic, and a new confidence-based pruning method based on this improved definition is applied in the APRIORI lattice. A new hypothesis evaluation criterion is used to express the quality of patterns in the search space. In Concept Rule Induction System, the quality of the constructed rules is further improved by using an improved generalization method. Finally, a set of experiments is conducted on real-world problems to compare the performance of the proposed method with that of similar systems in terms of support and confidence.
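As a rough illustration of the support and confidence measures mentioned in the abstract, the following minimal Python sketch evaluates two candidate rules over a toy relational dataset. The relations, the rule, the helper names (`body_bindings`, `support`, `confidence`, `keep_refinement`) and the pruning check are all hypothetical and are not taken from the thesis. The sketch uses the generic textbook notions of support and confidence, not the modified confidence definition that the thesis proposes.

```python
# Hypothetical toy relations (illustrative only, not from the thesis).
# parent(P, C): P is a parent of C.
parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}
female = {"ann", "mary", "eve"}

# Target concept instances daughter(X, Y). Under the Closed World
# Assumption, any daughter pair not listed here is treated as false.
daughter = {("mary", "ann"), ("eve", "tom")}


def body_bindings(use_female):
    """Return the (X, Y) pairs satisfying parent(Y, X) [and female(X)]."""
    return {(child, par) for (par, child) in parent
            if not use_female or child in female}


def support(covered):
    """Fraction of target instances covered by the rule body."""
    return len(covered & daughter) / len(daughter)


def confidence(covered):
    """Fraction of body bindings whose head is a true target instance."""
    return len(covered & daughter) / len(covered) if covered else 0.0


def keep_refinement(parent_conf, child_conf):
    """Assumed pruning check: keep a refinement only if confidence improves."""
    return child_conf > parent_conf


# daughter(X, Y) :- parent(Y, X)
general = body_bindings(use_female=False)
# daughter(X, Y) :- parent(Y, X), female(X)
refined = body_bindings(use_female=True)

print(support(general), confidence(general))
print(support(refined), confidence(refined))
print(keep_refinement(confidence(general), confidence(refined)))
```

In this toy setting the more general rule also covers pairs that are not daughters, so adding the `female(X)` literal raises confidence while leaving support unchanged. A confidence-based check of this kind is one plausible way a level-wise, APRIORI-style search could discard refinements that do not improve on their parent rule.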

Suggestions

Confidence-based concept discovery in relational databases
Kavurucu, Yusuf; Karagöz, Pınar; Toroslu, İsmail Hakkı (2009-11-16)
Multi-relational data mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. Several relational knowledge discovery systems have been developed employing various search strategies, heuristics, language pattern limitations and hypothesis evaluation criteria, in order to cope with intractably large search space and to be able to generate high-quality patterns. In this work, we improve an ILP-based con...
Improving the scalability of ILP-based multi-relational concept discovery system through parallelization
Mutlu, Ayşe Ceyda; Karagöz, Pınar; Kavurucu, Yusuf (2012-03-01)
Due to the increase in the amount of relational data that is being collected and the limitations of propositional problem definition in relational domains, multi-relational data mining has arisen to be able to extract patterns from relational data. In order to cope with intractably large search space and still be able to generate high-quality patterns, ILP-based multi-relational data mining and concept discovery systems employ several search strategies and pattern limitations. Another direction to cope w...
ILP-based concept discovery in multi-relational data mining
Kavurucu, Yusuf; Karagöz, Pınar; Toroslu, İsmail Hakkı (Elsevier BV, 2009-11-01)
Multi-relational data mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. Several relational knowledge discovery systems have been developed employing various search strategies, heuristics, language pattern limitations and hypothesis evaluation criteria, in order to cope with intractably large search space and to be able to generate high-quality patterns. In this work, an ILP-based concept discov...
A new hybrid multi-relational data mining technique
Toprak, Seda Dağlar; Toroslu, İ. Hakkı; Department of Computer Engineering (2005)
Multi-relational learning has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. As patterns involve multiple relations, the search space of possible hypotheses becomes intractably complex. Many relational knowledge discovery systems have been developed employing various search strategies, search heuristics and pattern language limitations in order to cope with the complexity of hypothesis space. In this w...
A New WAP-tree based sequential pattern mining algorithm for faster pattern extraction
Önal, Kezban Dilek; Şenkul, Pınar; Department of Computer Engineering (2012)
Sequential pattern mining constitutes a basis for solution of problems in various domains like bio-informatics and web usage mining. Research on this field continues seeking faster algorithms. WAP-Tree based algorithms that emerged from web usage mining literature have shown a remarkable performance on single-item sequence databases. In this study, we investigated application of WAP-Tree based mining to multi-item sequential pattern mining and we designed an extension of WAP-Tree data structure for multi-it...
Citation Formats
Y. Kavurucu, “An ILP-based concept discovery system for multi-relational data mining,” Ph.D. - Doctoral Program, Middle East Technical University, 2009.