Improving the scalability of ILP-based multi-relational concept discovery system through parallelization


MUTLU A. C., Senkul P., Kavurucu Y.

KNOWLEDGE-BASED SYSTEMS, vol.27, pp.352-368, 2012 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 27
  • Publication Date: 2012
  • Doi Number: 10.1016/j.knosys.2011.11.001
  • Journal Name: KNOWLEDGE-BASED SYSTEMS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.352-368
  • Middle East Technical University Affiliated: Yes

Abstract

Due to the increase in the amount of relational data being collected and the limitations of propositional problem definition in relational domains, multi-relational data mining has arisen to extract patterns from relational data. In order to cope with the intractably large search space and still generate high-quality patterns, ILP-based multi-relational data mining and concept discovery systems employ several search strategies and pattern limitations. Another way to cope with the large search space is parallelization. Parallel data mining can improve time efficiency and scalability without further limiting the language patterns. In this work, we describe a method for concept discovery with parallelization on an ILP-based concept discovery system. The non-parallel algorithm, namely Concept Rule Induction System (CRIS), is modified in such a way that the parts that involve a high amount of query processing, which causes a bottleneck, are reorganized in a data-parallel way. The resulting algorithm is called Parallel CRIS (pCRIS). A set of experiments is conducted in order to evaluate the performance of the proposed method. (C) 2011 Elsevier B.V. All rights reserved.
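The data-parallel idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the pCRIS implementation: the toy facts, the candidate rules, the simplified coverage-based support and confidence scores, and the names `evaluate` and `evaluate_parallel` are all assumptions made for illustration. The sketch only shows the general pattern of distributing independent per-candidate evaluation queries across workers and collecting the results.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy background facts (hypothetical data, not taken from the paper).
PARENT = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}
FEMALE = {"ann", "mary", "eve"}
# Positive examples of the target concept daughter(X, Y).
DAUGHTER = {("mary", "ann"), ("eve", "tom")}

# Candidate rule bodies for daughter(X, Y), written as Python predicates.
CANDIDATES = {
    "parent(Y,X)": lambda x, y: (y, x) in PARENT,
    "female(X)": lambda x, y: x in FEMALE,
    "parent(Y,X), female(X)": lambda x, y: (y, x) in PARENT and x in FEMALE,
}

PEOPLE = sorted({p for pair in PARENT for p in pair})


def evaluate(name):
    """Stand-in for the evaluation queries issued for one candidate rule."""
    body = CANDIDATES[name]
    # All (x, y) bindings that satisfy the rule body.
    covered = {(x, y) for x in PEOPLE for y in PEOPLE if x != y and body(x, y)}
    hits = covered & DAUGHTER
    # Simplified scores (assumed definitions, for illustration only).
    support = len(hits) / len(DAUGHTER)
    confidence = len(hits) / len(covered) if covered else 0.0
    return name, support, confidence


def evaluate_parallel(names, workers=2):
    """Evaluate independent candidates on a pool of workers, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(evaluate, names))


if __name__ == "__main__":
    for name, support, confidence in evaluate_parallel(list(CANDIDATES)):
        print(f"{name}: support={support:.2f}, confidence={confidence:.2f}")
```

Threads are used here only to keep the sketch self-contained. In a setting where each evaluation is a database query, the workers would typically be separate processes or machines, and the way the work is partitioned may differ from what is shown.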