A Comparative Study of Contemporary Learning Paradigms in Bug Report Priority Detection


Yilmaz E. H., TOROSLU İ. H., Koksal O.

IEEE Access, 2024 (SCI-Expanded) identifier

  • Yayın Türü: Makale / Tam Makale
  • Basım Tarihi: 2024
  • Doi Numarası: 10.1109/access.2024.3451125
  • Dergi Adı: IEEE Access
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Anahtar Kelimeler: Bidirectional control, bug triaging, Computer bugs, contrastive learning, Contrastive learning, Encoding, machine learning, natural language processing, Natural language processing, Software, software bug report classification, software engineering, Task analysis
  • Orta Doğu Teknik Üniversitesi Adresli: Evet

Özet

The increasing complexity of software development demands efficient automated bug report priority classification, and recent advancements in deep learning hold promise. This paper presents a comparative study of contemporary learning paradigms, including BERT, vector databases, large language models (LLMs), and a simple novel learning paradigm, contrastive learning for BERT. Utilizing datasets from bug reports, movie reviews, and app reviews, we evaluate and compare the performance of each approach. We find that transformer encoder-only models outperform in classification tasks measured by the precision, recall, and F1 score transformer decoder-only models despite an order of magnitude gap between the number of parameters. The novel use of contrastive learning for BERT demonstrates promising results in capturing subtle nuances in text data. This work highlights the potential of advanced NLP techniques for automated bug report priority classification and underscores the importance of considering multiple factors when developing models for this task. The paper’s main contributions are a comprehensive evaluation of various learning paradigms, such as vector databases and LLMs, an introduction of contrastive learning for BERT, an exploration of applicability to other text classification tasks, and a contrastive learning procedure that exploits ordinal information between classes.