Squeezing the ensemble pruning: Faster and more accurate categorization for news portals


Toraman Ç., Can F.

34th European Conference on Information Retrieval, ECIR 2012, Barcelona, Spain, 1 - 5 April 2012, vol.7224 LNCS, pp.508-511

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number: 7224 LNCS
  • DOI Number: 10.1007/978-3-642-28997-2_52
  • City of Publication: Barcelona
  • Country of Publication: Spain
  • Pages: pp.508-511
  • Keywords: Ensemble pruning, news portal, text categorization
  • Affiliated with Middle East Technical University: No

Abstract

Recent studies show that ensemble pruning works as effectively as a traditional ensemble of classifiers (EoC). In this study, we analyze how ensemble pruning can improve text categorization efficiency in time-critical real-life applications such as news portals. The two most crucial phases of text categorization are training classifiers and assigning labels to new documents; the latter matters more for the efficiency of such applications. We conduct experiments on ensemble pruning-based news article categorization to measure its accuracy and time cost. The results show that our heuristics reduce the time cost of the second phase. We can also balance accuracy against time cost, improving both with appropriate pruning degrees. © 2012 Springer-Verlag Berlin Heidelberg.
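
The general idea of ensemble pruning can be illustrated with a minimal sketch. It is not the authors' heuristics, which the abstract does not specify. It assumes a simple accuracy-ranking strategy: score each ensemble member on held-out validation documents, keep the best `keep` members, and label new documents by majority vote over the survivors only. All names here (`prune_ensemble`, `vote`, the toy keyword classifiers, and the sample documents) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Assumption: a classifier is any function mapping a document to a label.
Classifier = Callable[[str], str]


def accuracy(clf: Classifier, docs: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of validation documents the classifier labels correctly."""
    correct = sum(1 for doc, label in zip(docs, labels) if clf(doc) == label)
    return correct / len(docs)


def prune_ensemble(ensemble: List[Classifier], docs: Sequence[str],
                   labels: Sequence[str], keep: int) -> List[Classifier]:
    """Rank members by validation accuracy and keep the top `keep`.

    `keep` plays the role of a pruning degree: a smaller value means
    fewer classifier calls per new document in the labeling phase.
    """
    ranked = sorted(ensemble, key=lambda clf: accuracy(clf, docs, labels),
                    reverse=True)
    return ranked[:keep]


def vote(ensemble: List[Classifier], doc: str) -> str:
    """Majority vote; ties go to the label that was voted for first."""
    votes = Counter(clf(doc) for clf in ensemble)
    return votes.most_common(1)[0][0]


# Toy keyword classifiers standing in for trained models (illustrative only).
def clf_always_sports(doc: str) -> str:
    return "sports"


def clf_match(doc: str) -> str:
    return "sports" if "match" in doc else "economy"


def clf_team(doc: str) -> str:
    return "sports" if "team" in doc else "economy"


def clf_stocks(doc: str) -> str:
    return "economy" if "stocks" in doc else "sports"


val_docs = ["team wins match", "stocks fall sharply",
            "coach praises team", "market stocks rally"]
val_labels = ["sports", "economy", "sports", "economy"]

full = [clf_always_sports, clf_match, clf_team, clf_stocks]
pruned = prune_ensemble(full, val_docs, val_labels, keep=2)
label = vote(pruned, "team loses match")
```

In this sketch, labeling a new document costs two classifier calls instead of four, which is the kind of second-phase saving the abstract refers to. How the pruning degree affects accuracy depends on the data and on the pruning heuristic actually used.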