Text Classification in the Turkish Marketing Domain for Context Sensitive Ad Distribution

Engin M., Can T.

24th International Symposium on Computer and Information Sciences, Güzelyurt, Cyprus (Kktc), 14 - 16 September 2009, pp.105-110

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/iscis.2009.5291861
  • City: Güzelyurt
  • Country: Cyprus (Kktc)
  • Page Numbers: pp.105-110
  • Keywords: Text Classification, Data Mining, Machine Learning, Artificial Intelligence, Information Retrieval, World Wide Web
  • Middle East Technical University Affiliated: Yes


In this paper, we construct and compare several feature extraction approaches to improve the classification of Turkish web documents in the marketing domain. Our feature extraction techniques draw on characteristics of the Turkish language, the structure of web documents, and online content in the marketing domain. We build datasets in different feature spaces and apply several Support Vector Machine (SVM) configurations to them. We conduct the study with the performance needs of practical context-sensitive systems in mind. Our results show that linear kernel classifiers achieve the best performance, in terms of both accuracy and speed, on text documents represented as keyword root features.
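
As a rough illustration of the kind of pipeline the abstract describes (root-like features fed to a linear classifier), the following minimal sketch may help. It is not the authors' method: the fixed-length prefix truncation is only an assumed stand-in for real Turkish root extraction, the two-category toy corpus is invented, and the names `root_features`, `train_linear_svm`, and `predict` are hypothetical. The classifier is a plain subgradient-descent linear SVM on the hinge loss, with the bias term omitted for brevity.

```python
from collections import Counter

def root_features(text, prefix_len=5):
    """Bag of crude 'roots': each token truncated to a fixed-length prefix.

    Assumption: truncation stands in for a real morphological analyzer.
    Note that str.lower() is not Turkish-aware (I/ı, İ/i), so the toy
    inputs below are already lowercase.
    """
    return Counter(tok[:prefix_len] for tok in text.lower().split())

def train_linear_svm(samples, labels, epochs=20, eta=0.1, lam=0.01):
    """Linear SVM without bias, trained by subgradient descent on hinge loss."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * sum(w.get(f, 0.0) * v for f, v in x.items())
            # L2 regularization: shrink every weight slightly.
            for f in w:
                w[f] *= 1.0 - eta * lam
            # Hinge-loss subgradient step only when the margin is violated.
            if margin < 1.0:
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w

def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(w.get(f, 0.0) * v for f, v in x.items())
    return 1 if score >= 0 else -1

# Invented toy corpus: +1 = automotive ads, -1 = phone ads.
docs = [
    ("araba arabalar motor motoru lastik", 1),
    ("arabanın motoru ve lastikleri", 1),
    ("telefon telefonlar ekran ekranı şarj", -1),
    ("telefonun ekranı kırıldı", -1),
]
samples = [root_features(text) for text, _ in docs]
labels = [label for _, label in docs]
weights = train_linear_svm(samples, labels)

print(predict(weights, root_features("arabaların lastiği")))
print(predict(weights, root_features("telefonların ekranları")))
```

The point of the sketch is that inflected forms such as "arabalar" and "arabanın" collapse onto one feature, which keeps the feature space small, and that a linear model needs only a sparse dot product at prediction time. Both properties are plausible reasons for the speed advantage reported above, though the paper itself should be consulted for the actual feature definitions and SVM configurations.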