Title:
Towards transparent object detection models for construction sites: explainable AI and error classification.
Authors:
Kim, Junghoon¹ (pable01@snu.ac.kr); Gong, Yue² (yue-36.gong@polyu.edu.hk); Chi, Seokho¹ (shchi@snu.ac.kr); Kim, Jung In³ (jikim07@kookmin.ac.kr); Seo, JoonOh¹˒² (joonoh.seo@polyu.edu.hk)
Source:
Advanced Engineering Informatics, Apr 2026, Vol. 71, Part A.
Database:
Academic Search Index

*Further Information*

*Highlights*

• Grad-CAM-based XAI framework proposed for error classification in construction object detection.
• Five systematic error types identified: abnormal viewpoint, small size, occlusion, background, and lighting.
• Grad-CAM maps transformed into quantitative features enabling automated failure diagnosis.
• Classification accuracy reached 94%, significantly outperforming IoU-based error analysis.
• Error-specific augmentation improved detection reliability across synthetic and real-world datasets.

*Abstract*

Construction site monitoring is essential for ensuring projects are executed as planned and for achieving goals in productivity, safety, and quality. However, traditional manual monitoring methods are time-consuming, error-prone, and lack scalability. Deep learning-based object detection offers a promising alternative, but its "black-box" nature hinders understanding of detection failures. This study proposes a Grad-CAM-based explainable AI framework to diagnose and classify detection errors systematically. The framework consists of three main processes: (1) defining major types of detection errors, (2) collecting failed images for each error type, and (3) developing a machine learning-based classification model using Grad-CAM features and detection metrics. Unlike previous approaches that relied on qualitative interpretations, this study converts Grad-CAM heatmaps into quantitative features (e.g., GT influence ratio, activation-to-box distance, cluster counts), enabling automated error classification. Errors were categorized into abnormal viewpoint, small size, occlusion, complex background, and lighting variation, achieving 94% classification accuracy on synthetic data, 85% on real images, and 88% on AI-generated data. This framework enhances transparency and interpretability while supporting model optimization and adaptive deployment for real-world construction site applications. [ABSTRACT FROM AUTHOR]
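The abstract names three quantitative features derived from Grad-CAM heatmaps (GT influence ratio, activation-to-box distance, cluster counts) but, as an abstract, gives no formulas. The sketch below is a minimal, hypothetical illustration of how such features could be computed from a heatmap and a ground-truth box; the exact definitions used by the paper may differ, and the function name, threshold, and normalizations here are assumptions for illustration only.

```python
import numpy as np

def gradcam_features(heatmap, gt_box, thresh=0.5):
    """Turn a Grad-CAM heatmap into scalar features.

    Hypothetical definitions modeled on the abstract, not the
    paper's actual formulas.

    heatmap: 2-D array of activations in [0, 1]
    gt_box:  (x0, y0, x1, y1) ground-truth box in pixel coords
    """
    h, w = heatmap.shape
    x0, y0, x1, y1 = gt_box

    # GT influence ratio: activation mass inside the GT box vs. total mass
    total = heatmap.sum()
    inside = heatmap[y0:y1, x0:x1].sum()
    gt_ratio = inside / total if total > 0 else 0.0

    # Activation-to-box distance: activation centroid to box centre,
    # normalised by the image diagonal
    ys, xs = np.mgrid[0:h, 0:w]
    cx = (heatmap * xs).sum() / total
    cy = (heatmap * ys).sum() / total
    bx, by = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    dist = np.hypot(cx - bx, cy - by) / np.hypot(h, w)

    # Cluster count: connected components of the thresholded map
    # (simple 4-connectivity flood fill, no external dependencies)
    mask = heatmap >= thresh
    seen = np.zeros_like(mask, dtype=bool)
    clusters = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                clusters += 1
                stack = [(i, j)]
                while stack:
                    a, b = stack.pop()
                    if 0 <= a < h and 0 <= b < w and mask[a, b] and not seen[a, b]:
                        seen[a, b] = True
                        stack += [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]

    return {"gt_ratio": gt_ratio, "act_box_dist": dist, "clusters": clusters}
```

Feature vectors like this one would then feed the machine-learning classifier the abstract describes, mapping each failed detection to one of the five error types.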