Machine learning algorithms for accurate flow-based network traffic classification: Evaluation and comparison


Soysal M., Schmidt Ş. E.

PERFORMANCE EVALUATION, vol.67, no.6, pp.451-467, 2010 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 67 Issue: 6
  • Publication Date: 2010
  • DOI: 10.1016/j.peva.2010.01.001
  • Journal Name: PERFORMANCE EVALUATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.451-467
  • Keywords: Traffic classification, Privacy-preserving classification, Supervised machine learning, Data set composition, Comparison
  • Middle East Technical University Affiliated: Yes

Abstract

The task of network management and monitoring relies on an accurate characterization of network traffic generated by different applications and network protocols. We employ three supervised machine learning (ML) algorithms, Bayesian Networks, Decision Trees and Multilayer Perceptrons, for the flow-based classification of six different types of Internet traffic, including peer-to-peer (P2P) and content delivery (Akamai) traffic. The dependency of the traffic classification performance on the amount and composition of training data is investigated, followed by experiments showing that ML algorithms such as Bayesian Networks and Decision Trees are suitable for high-speed Internet traffic flow classification and are robust with respect to applications that dynamically change their source ports. Finally, the importance of correctly classified training instances is highlighted by an experiment conducted with wrongly labeled training data. © 2010 Elsevier B.V. All rights reserved.