Hide/Show Apps

Cosine Similarity-Based Pruning for Concept Discovery

In this work we focus on improving the time efficiency of Inductive Logic Programming (ILP)-based concept discovery systems. Such systems have scalability issues mainly due to the evaluation of large search spaces. Evaluation of the search space cosists translating candidate concept descriptor into SQL queries, which involve a number of equijoins on several tables, and running them against the dataset. We aim to improve time efficiency of such systems by reducing the number of queries executed on a DBMS. To this aim, we utilize cosine similarity to measure the similarity of arguments that go through equijoins and prune those with 0 similarity. The proposed method is implemented as an extension to an existing ILP-based concept discovery system called Tabular Cris w-EF and experimental results show that the poposed method reduces the number of queries executed around 15 %.