Cosine Similarity-Based Pruning for Concept Discovery



31st International Symposium on Computer and Information Sciences (ISCIS), Krakow, Poland, 27 - 28 October 2016, vol.659, pp.90-96

  • Publication Type: Conference Paper / Full Text
  • Volume: 659
  • Doi Number: 10.1007/978-3-319-47217-1_10
  • City: Krakow
  • Country: Poland
  • Page Numbers: pp.90-96
  • Middle East Technical University Affiliated: Yes


In this work, we focus on improving the time efficiency of Inductive Logic Programming (ILP)-based concept discovery systems. Such systems have scalability issues, mainly due to the evaluation of large search spaces. Evaluating the search space consists of translating candidate concept descriptors into SQL queries, which involve a number of equijoins on several tables, and running them against the dataset. We aim to improve the time efficiency of such systems by reducing the number of queries executed on a DBMS. To this end, we use cosine similarity to measure the similarity of arguments that go through equijoins and prune those with zero similarity. The proposed method is implemented as an extension to an existing ILP-based concept discovery system called Tabular Cris w-EF, and experimental results show that the proposed method reduces the number of queries executed by around 15%.
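
The pruning idea can be illustrated with a minimal sketch. The abstract does not say how the argument vectors are built, so the sketch assumes each argument (column) is represented by the frequency vector of its values; the function names `cosine_similarity` and `can_prune` are hypothetical and are not taken from Tabular Cris w-EF. Under that assumption, two columns have cosine similarity 0 exactly when they share no value, in which case an equijoin on them returns no rows and the corresponding query need not be sent to the DBMS.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(col_a, col_b):
    """Cosine similarity of the value-frequency vectors of two columns.

    Assumption: each column is modelled as a bag of values; the paper's
    actual vector construction may differ.
    """
    freq_a, freq_b = Counter(col_a), Counter(col_b)
    # Only values present in both columns contribute to the dot product.
    dot = sum(count * freq_b[value] for value, count in freq_a.items())
    norm_a = sqrt(sum(c * c for c in freq_a.values()))
    norm_b = sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty column cannot match anything
    return dot / (norm_a * norm_b)


def can_prune(col_a, col_b):
    """True if an equijoin on these two columns is guaranteed to be empty."""
    return cosine_similarity(col_a, col_b) == 0.0


# Hypothetical argument columns from two relations.
person_names = ["ann", "bob"]
course_names = ["math", "physics"]
advisor_names = ["bob", "carl"]

print(can_prune(person_names, course_names))   # no shared value: prune
print(can_prune(person_names, advisor_names))  # shares "bob": keep the query
```

Because frequency counts are non-negative, the dot product is zero only when the two columns are disjoint, so this check never discards a query that could return rows.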