Identification and Categorization of Defects in Construction Specifications Utilizing Natural Language Processing


Madenli O., ATASOY ÖZCAN G., DİKMEN TOKER İ.

Journal of Construction Engineering and Management, vol.152, no.5, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 152 Issue: 5
  • Publication Date: 2026
  • DOI: 10.1061/jcemd4.coeng-17750
  • Journal Name: Journal of Construction Engineering and Management
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, ICONDA Bibliographic, INSPEC, Public Affairs Index
  • Keywords: Construction specification, Defect, Deficiency, Generative pretrained transformer (GPT), Machine learning (ML), Natural language processing (NLP), Text classification
  • Middle East Technical University Affiliated: Yes

Abstract

Defective specification statements lead not only to faulty outcomes but also to disputes among project stakeholders, claims on project budget and time, project disruptions, and even litigation. Identifying defects in the technical sections of construction specifications is challenging. This research aims to develop a structured defect framework and to implement supervised natural language processing methods for identifying and categorizing defects in specifications. The dataset includes 175 specifications related to 21 different architectural works collected from 16 construction projects. Eight machine learning (ML) models, ranging from shallow to transformer-based, were trained and tested with combinations of different text representation techniques. Subsequently, a study using ChatGPT-4o, a generative AI (GenAI) tool, was conducted. The pretrained RoBERTa model performed best at recognizing defects in construction specifications, with a macro F1 score of 91.2% and an accuracy of 98%. This research offers a data-driven methodology with practical tools to enhance the quality of specifications and decrease disputes by reducing defective specification statements during design, bidding, and preconstruction.
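
The abstract reports both a macro F1 score and an accuracy figure. As a minimal sketch of how these two metrics can diverge, the Python snippet below computes both on a small set of labels. The labels, class names ("ok", "defect"), and function names are hypothetical and chosen only for illustration; they are not the study's data, categories, or code, and the snippet says nothing about how the authors computed their results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical, imbalanced toy labels: 18 non-defective statements, 2 defective.
y_true = ["ok"] * 18 + ["defect"] * 2
# The hypothetical classifier misses one of the two defective statements.
y_pred = ["ok"] * 18 + ["ok", "defect"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

Because macro F1 weights every class equally, a single missed item in a rare class lowers it much more than it lowers accuracy, which is why the two figures are usually reported together for text classification tasks.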