Software Module Classification for Commercial Bug Reports


Ozturk C. E., Yilmaz E. H., Koksal O., Koc A.

2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023, Rhodes Island, Greece, 4 - 10 June 2023, (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/icasspw59220.2023.10193706
  • City of Publication: Rhodes Island
  • Country of Publication: Greece
  • Keywords: bug triaging, machine learning, natural language processing, software bug report classification, software engineering
  • Affiliated with Middle East Technical University: No

Abstract

In this work, we curate and investigate a dataset named Turkish Software Report - Module Classification (TSRMC), which consists of commercial software bug reports from a single company. Large-scale software projects receive too many bug reports to triage by hand, so automated bug classification is required. We analyze and report the statistical features and classification difficulty of the dataset. We then apply several methods from the text classification literature to assign each bug report in TSRMC to a suitable software module. These methods include traditional machine learning (ML) methods, such as support vector machines (SVMs) and logistic regression; sequential deep learning (DL) models, such as gated recurrent units (GRUs) and convolutional neural networks (CNNs); and pre-trained language models (PLMs) based on Bidirectional Encoder Representations from Transformers (BERT). Our work is one of the first efforts in the automated bug report classification literature to focus on commercial bugs and to use bilingual (Turkish and English) texts.
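
To make the module classification task concrete, the following is a minimal sketch of one of the traditional ML baselines named above, multinomial logistic regression over bag-of-words counts, written in plain Python. It is not the authors' implementation. The example reports, the module names (`auth`, `billing`, `reporting`), the class name `SoftmaxBagOfWords`, and the hyperparameters are all hypothetical and chosen only for illustration; none of them come from the TSRMC dataset.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical bilingual (Turkish + English) bug reports with made-up
# module labels. These are illustrative only and are NOT from TSRMC.
TRAIN = [
    ("login ekranı açılmıyor, authentication error", "auth"),
    ("şifre sıfırlama maili gelmiyor, password reset fails", "auth"),
    ("fatura toplamı yanlış hesaplanıyor, invoice total wrong", "billing"),
    ("ödeme sayfasında payment timeout hatası", "billing"),
    ("rapor ekranı çok yavaş, report export crashes", "reporting"),
    ("grafik raporda eksik veri, chart shows no data", "reporting"),
]


def tokenize(text):
    """Lowercase and split on non-word characters (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def softmax(scores):
    """Turn a dict of class scores into a dict of class probabilities."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


class SoftmaxBagOfWords:
    """Multinomial logistic regression on token counts, trained with SGD."""

    def __init__(self, classes):
        self.classes = sorted(classes)
        self.weights = {c: defaultdict(float) for c in self.classes}
        self.bias = {c: 0.0 for c in self.classes}

    def _scores(self, counts):
        # Tokens never seen in training contribute 0 to every class.
        return {
            c: self.bias[c]
            + sum(self.weights[c].get(t, 0.0) * n for t, n in counts.items())
            for c in self.classes
        }

    def fit(self, examples, epochs=100, lr=0.2):
        for _ in range(epochs):
            for text, label in examples:
                counts = Counter(tokenize(text))
                probs = softmax(self._scores(counts))
                for c in self.classes:
                    # Gradient of the log-likelihood w.r.t. the class score.
                    grad = (1.0 if c == label else 0.0) - probs[c]
                    self.bias[c] += lr * grad
                    for t, n in counts.items():
                        self.weights[c][t] += lr * grad * n
        return self

    def predict(self, text):
        scores = self._scores(Counter(tokenize(text)))
        return max(self.classes, key=scores.get)


model = SoftmaxBagOfWords({label for _, label in TRAIN}).fit(TRAIN)
print(model.predict("password reset error on login"))
print(model.predict("invoice payment wrong"))
```

A realistic baseline would typically use TF-IDF weighting, regularization, and a held-out test split; those are left out here to keep the sketch short and dependency-free.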