Subtree selection in kernels for graph classification


TAN M., POLAT F., Alhajj R.

INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, vol.8, no.3, pp.294-310, 2013 (Peer-Reviewed Journal)

  • Publication Type: Article
  • Volume: 8 Issue: 3
  • Publication Date: 2013
  • Doi Number: 10.1504/ijdmb.2013.056080
  • Journal Name: INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS
  • Journal Indexes: Science Citation Index Expanded, Scopus
  • Page Numbers: pp.294-310

Abstract

Classification of structured data is essential for a wide range of problems in bioinformatics and cheminformatics. One such problem is the in silico prediction of small-molecule properties such as toxicity, mutagenicity and activity. In this paper, we propose a new feature selection method for graph kernels that use the subtrees of graphs as their feature sets. For this purpose, we introduce a masking procedure that amounts to feature selection. We present experiments on several data sets, together with a comparison of our method against some frequent-subgraph-based approaches.
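
To illustrate why masking features of a kernel amounts to feature selection, here is a minimal sketch. It is not the authors' procedure: it assumes each graph has already been reduced to a bag of subtree-pattern counts, it uses a plain dot-product kernel, and the pattern labels, the `FeatureCounts` type and the `masked_kernel` function are hypothetical names chosen for the example.

```python
from typing import Dict, Set

# Assumed representation: a graph is a bag of subtree patterns,
# stored as a mapping from a pattern label to its occurrence count.
FeatureCounts = Dict[str, int]


def masked_kernel(x: FeatureCounts, y: FeatureCounts, mask: Set[str]) -> int:
    """Dot-product kernel restricted to the subtree patterns kept by `mask`.

    Patterns outside `mask` contribute nothing, so choosing the mask
    is the same as choosing which features the kernel may use.
    """
    return sum(
        count * y[pattern]
        for pattern, count in x.items()
        if pattern in mask and pattern in y
    )


# Two toy "molecules" with made-up subtree pattern labels.
g1: FeatureCounts = {"C-C": 2, "C-O": 1, "C(-N)(-O)": 1}
g2: FeatureCounts = {"C-C": 1, "C-O": 3, "N-H": 2}

all_patterns = set(g1) | set(g2)

# Unmasked kernel: every shared pattern contributes (2*1 + 1*3).
full_value = masked_kernel(g1, g2, all_patterns)

# Masked kernel: only the "C-O" pattern is kept (1*3).
selected_value = masked_kernel(g1, g2, {"C-O"})
```

In this sketch the mask is supplied by hand; how the mask is actually chosen is the subject of the paper and is not reproduced here.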