Identification and categorization of defects in construction specifications by utilizing NLP

2025-01-08
Madenli, Özgür
Construction specifications are crucial parts of design and contract documents. Defective specification statements can cause not only a faulty outcome but also disputes among project stakeholders, claims on project budget and time, project disruptions, and even litigation. Identifying defects in the technical sections of construction specifications is challenging because of the extensive document volume, limited resources, and reliance on the experience of technical staff. Natural Language Processing (NLP) can help by analyzing language, uncovering patterns, providing insights, and overcoming the limitations of manual methods. This research aims to develop a structured framework and implement supervised NLP methods for identifying and categorizing defects in specifications. The dataset includes 175 specifications related to 21 architectural works, collected from 16 construction projects. A total of 15,569 statements were extracted and manually labeled in four defect categories, and eight machine learning (ML) models, ranging from shallow to transformer-based, were trained and tested with combinations of different text representation techniques. Subsequently, a study with ChatGPT was conducted. The research concluded that the pre-trained RoBERTa model outperformed the other models in recognizing defects in construction specifications, with a macro F1 score of 91.2% and an accuracy of 98%. In contrast, the performance of ChatGPT was considerably lower than that of the domain-specific trained ML models. This research offers a data-driven, automated methodology that gives construction stakeholders practical tools to improve the quality of specifications and to decrease disputes by reducing deficiencies during design, bidding, and pre-construction.
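The abstract reports both a macro F1 score and an accuracy, and the two can differ noticeably when some categories are rare. The following is a minimal illustrative sketch of how the two metrics are computed, not the evaluation code used in the thesis. The label names (`no_defect`, `defect_a`, `defect_b`) and the toy predictions are hypothetical and chosen only to show the effect of class imbalance.

```python
def accuracy(y_true, y_pred):
    """Fraction of statements whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so rare classes count equally."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical, made-up labels: one dominant class and two rare classes.
y_true = ["no_defect"] * 90 + ["defect_a"] * 6 + ["defect_b"] * 4
y_pred = (
    ["no_defect"] * 90
    + ["defect_a"] * 3 + ["no_defect"] * 3   # half of defect_a missed
    + ["defect_b"] * 2 + ["no_defect"] * 2   # half of defect_b missed
)

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {mf1:.3f}")
```

In this toy case accuracy stays high because the dominant class is predicted well, while macro F1 drops because each rare class contributes equally to the average. This is one common reason to report macro F1 alongside accuracy for imbalanced label sets.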
Citation Formats
Ö. Madenli, “Identification and categorization of defects in construction specifications by utilizing NLP,” Ph.D. - Doctoral Program, Middle East Technical University, 2025.