Cosine Similarity-Based Pruning for Concept Discovery



DOĞAN A., MUTLU A., KARAGÖZ P.

31st International Symposium on Computer and Information Sciences (ISCIS), Krakow, Poland, 27-28 October 2016, vol.659, pp.90-96

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume: 659
  • DOI: 10.1007/978-3-319-47217-1_10
  • City of Publication: Krakow
  • Country of Publication: Poland
  • Pages: pp.90-96
  • Affiliated with Middle East Technical University: Yes

Abstract

In this work, we focus on improving the time efficiency of Inductive Logic Programming (ILP)-based concept discovery systems. Such systems have scalability issues, mainly due to the evaluation of large search spaces. Evaluating the search space consists of translating candidate concept descriptors into SQL queries, which involve a number of equijoins on several tables, and running them against the dataset. We aim to improve the time efficiency of such systems by reducing the number of queries executed on a DBMS. To this end, we use cosine similarity to measure the similarity of arguments that go through equijoins and prune those with zero similarity. The proposed method is implemented as an extension to an existing ILP-based concept discovery system called Tabular Cris w-EF, and experimental results show that the proposed method reduces the number of queries executed by around 15%.
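
The pruning idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the argument vectors are built, so this sketch assumes each argument (table column) is represented as a value-frequency vector over its distinct values. Under that assumption, a cosine similarity of 0 means the two columns share no value, so an equijoin on them would return no rows and the corresponding query need not be sent to the DBMS. The function names cosine_similarity and can_prune are hypothetical and are not taken from the paper or from Tabular Cris w-EF.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(col_a, col_b):
    """Cosine similarity of two columns, each viewed as a value-frequency vector.

    Assumption: vectors are built from value counts; the paper may use
    a different representation.
    """
    freq_a, freq_b = Counter(col_a), Counter(col_b)
    # Dot product over values of col_a; values missing from col_b count as 0.
    dot = sum(count * freq_b[value] for value, count in freq_a.items())
    norm_a = sqrt(sum(c * c for c in freq_a.values()))
    norm_b = sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty column cannot match anything.
        return 0.0
    return dot / (norm_a * norm_b)


def can_prune(col_a, col_b):
    """True when an equijoin on the two columns is guaranteed to be empty."""
    return cosine_similarity(col_a, col_b) == 0.0


# Hypothetical argument columns from two relations.
person_ids = ["ann", "bob", "bob"]
city_names = ["krakow", "ankara"]
manager_ids = ["bob", "eve"]

print(can_prune(person_ids, city_names))   # no shared value: skip the query
print(can_prune(person_ids, manager_ids))  # "bob" is shared: run the query
```

Because all counts are non-negative, the dot product is zero exactly when the two columns have no value in common, which is why the zero-similarity check is a safe pruning condition in this sketch.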