ROLEX-SP: Rules of lexical syntactic patterns for free text categorization


Al Zamil M. G. H., Can A.

KNOWLEDGE-BASED SYSTEMS, vol.24, pp.58-65, 2011 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 24
  • Publication Date: 2011
  • DOI: 10.1016/j.knosys.2010.07.005
  • Journal Name: KNOWLEDGE-BASED SYSTEMS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.58-65
  • Middle East Technical University Affiliated: Yes

Abstract

Due to the rapid growth of free text documents available in digital form, efficient techniques of automatic categorization are of great importance. In this paper, we present an efficient rule-based method for categorizing free text documents. The contributions of this research are the formation of lexical syntactic patterns as basic classification features, a categorization framework that addresses the problem of classifying free text with minimal label description, and an efficient learning algorithm in terms of time complexity and F-measure. The framework of ROLEX-SP concentrates on capturing the correct classes of text as well as reducing classification errors.