IEEE Access, 2024 (SCI-Expanded)
The increasing complexity of software development demands efficient automated bug report priority classification, and recent advances in deep learning hold promise for this task. This paper presents a comparative study of contemporary learning paradigms, including BERT, vector databases, large language models (LLMs), and a simple novel learning paradigm: contrastive learning for BERT. Using datasets of bug reports, movie reviews, and app reviews, we evaluate and compare the performance of each approach. We find that transformer encoder-only models outperform transformer decoder-only models in classification tasks, as measured by precision, recall, and F1 score, despite an order-of-magnitude gap in parameter counts. The novel use of contrastive learning for BERT shows promising results in capturing subtle nuances in text data. This work highlights the potential of advanced NLP techniques for automated bug report priority classification and underscores the importance of considering multiple factors when developing models for this task. The paper's main contributions are a comprehensive evaluation of various learning paradigms, such as vector databases and LLMs; the introduction of contrastive learning for BERT; an exploration of applicability to other text classification tasks; and a contrastive learning procedure that exploits ordinal information between classes.
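The abstract's final contribution, a contrastive objective that exploits ordinal information between priority classes, can be illustrated with a minimal sketch. This is an assumed form, not the paper's exact procedure: same-class embedding pairs are pulled together, while pairs from different priority classes are pushed apart by a margin that grows with the ordinal distance between their labels (e.g. P1 vs. P5 is separated more than P1 vs. P2). The function name, margin scaling, and toy embeddings are all hypothetical.

```python
# Hypothetical sketch of an ordinal-aware contrastive loss (assumed form,
# not the paper's exact procedure): the repulsion margin for a pair scales
# with the ordinal gap between its priority labels.
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ordinal_contrastive_loss(embeddings, labels, base_margin=1.0):
    """Mean pairwise contrastive loss with an ordinal-scaled margin.

    embeddings: list of embedding vectors (lists of floats)
    labels:     list of integer priority levels (e.g. 1..5)
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        gap = abs(labels[i] - labels[j])  # ordinal distance between classes
        if gap == 0:
            total += d ** 2  # attract: same priority class
        else:
            # repel: margin grows with ordinal distance, so distant
            # priorities are pushed further apart than adjacent ones
            total += max(0.0, base_margin * gap - d) ** 2
        n_pairs += 1
    return total / n_pairs

# Toy usage: two P1 reports close together, one P5 report far away
# incurs almost no loss; mixing them up is penalized heavily.
low = ordinal_contrastive_loss([[0.0], [0.1], [5.0]], [1, 1, 5])
high = ordinal_contrastive_loss([[0.0], [2.0], [0.5]], [1, 1, 5])
```

In practice such a loss would operate on BERT sentence embeddings within a training loop; the sketch above only captures how ordinal label distance can shape the pairwise objective.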