Identification and Categorization of Defects in Construction Specifications Utilizing Natural Language Processing


Madenli O., ATASOY ÖZCAN G., DİKMEN TOKER İ.

Journal of Construction Engineering and Management, vol.152, no.5, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article
  • Volume: 152 Issue: 5
  • Publication Date: 2026
  • Doi Number: 10.1061/jcemd4.coeng-17750
  • Journal Name: Journal of Construction Engineering and Management
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, ICONDA Bibliographic, INSPEC, Public Affairs Index
  • Keywords: Construction specification, Defect, Deficiency, Generative pretrained transformer (GPT), Machine learning (ML), Natural language processing (NLP), Text classification
  • Middle East Technical University Affiliated: Yes

Abstract

Defective specification statements cause not only a faulty outcome but also disputes among project stakeholders, claims for project budget and time, project disruptions, and even litigation. Identifying defects in technical sections of construction specifications is challenging. This research aims to develop a structured defect framework and implement supervised natural language processing methods for identifying and categorizing defects in specifications. The dataset includes 175 specifications related to 21 different architectural works collected from 16 construction projects. Eight machine learning (ML) models, ranging from shallow to transformer-based, were trained and tested with combinations of different text representation techniques. Subsequently, a study using ChatGPT-4o, a GenAI tool, was conducted. The pretrained RoBERTa model achieved the best performance in recognizing defects in construction specifications, with a macro F1 score of 91.2% and an accuracy of 98%. This research offers a data-driven methodology with practical tools to enhance the quality of specifications and decrease disputes by reducing defective specification statements during design, bidding, and preconstruction.
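The abstract reports both a macro F1 score and an accuracy figure, and the two can differ noticeably when defective statements are much rarer than non-defective ones. The following minimal sketch illustrates how the two metrics are computed; the labels `ok` and `defect` and the toy predictions are hypothetical and are not taken from the paper's dataset or defect framework.

```python
# Illustrative only: hypothetical labels and predictions, not the paper's data.

def accuracy(y_true, y_pred):
    """Fraction of statements whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced sample: 18 sound statements, 2 defective ones, one defect missed.
y_true = ["ok"] * 18 + ["defect"] * 2
y_pred = ["ok"] * 18 + ["defect", "ok"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

In this toy sample a single missed defect barely moves accuracy but pulls macro F1 down substantially, which is one reason both figures are commonly reported together for imbalanced text classification tasks.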