Abstract

This paper presents a systematic method for creating universally applicable synthetic training image datasets for computer vision-based detection of construction objects. Synthetic images, created by inserting a virtual object of interest into a real site image, minimize the time and effort required to collect and annotate training images. Synthetic images can also be easily customized for a target construction site by accounting for the context of the site (e.g., different background scenes, camera positions, and angles) and the possible variability of the target objects in the images (e.g., different sizes, locations, rotation angles, and postures). The automated approach proposed in this study creates the synthetic images systematically using the Unity game engine, in which context- and variability-related parameters can be controlled. The proposed method was validated by training a deep learning-based object detection algorithm [i.e., a faster region-based convolutional neural network (Faster R-CNN) model] with synthetic images and testing it on real images from earthwork construction sites to detect an excavator. The CNN models trained with synthetic images achieved an average precision of more than 90%; in particular, the model trained with synthetic images outperformed the one trained with real site images. The detection results also showed that context customization and variability randomization improve the capability to capture the high irregularity of a construction object's appearance in images. These findings demonstrate the feasibility and practicality of using synthetic images for vision-based approaches in the construction domain. Ultimately, the proposed approach offers an alternative way to build comprehensive image datasets for construction entities, thereby facilitating vision-based studies on construction.
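To make the idea of variability randomization concrete, the following minimal Python sketch pastes an object cut-out onto a background image at a random size and location and returns the bounding box that would serve as the automatic annotation. It is a simplified 2D illustration only, not the Unity-based pipeline used in this study, which renders a virtual 3D object. The function name `composite_object`, the `scale_range` parameter, and the dummy arrays standing in for a site photo and an excavator cut-out are all assumptions introduced here.

```python
import numpy as np


def composite_object(background, obj_rgba, rng, scale_range=(0.5, 1.5)):
    """Paste an RGBA cut-out onto a background at a random scale and position.

    Hypothetical helper for illustration. Returns the synthetic image and
    its bounding box as (x_min, y_min, x_max, y_max).
    """
    bg_h, bg_w = background.shape[:2]
    obj_h, obj_w = obj_rgba.shape[:2]

    # Randomize the object size, capped so the object always fits the frame.
    max_scale = min(bg_h / obj_h, bg_w / obj_w)
    scale = min(rng.uniform(*scale_range), max_scale)
    new_h = min(bg_h, max(1, int(obj_h * scale)))
    new_w = min(bg_w, max(1, int(obj_w * scale)))

    # Nearest-neighbour resize by index sampling (keeps the sketch NumPy-only).
    rows = np.arange(new_h) * obj_h // new_h
    cols = np.arange(new_w) * obj_w // new_w
    resized = obj_rgba[rows][:, cols]

    # Randomize the object location within the frame.
    x0 = int(rng.integers(0, bg_w - new_w + 1))
    y0 = int(rng.integers(0, bg_h - new_h + 1))

    # Alpha-blend the object over the chosen background region.
    alpha = resized[..., 3:4].astype(np.float32) / 255.0
    out = background.copy()
    region = out[y0:y0 + new_h, x0:x0 + new_w].astype(np.float32)
    blended = alpha * resized[..., :3] + (1.0 - alpha) * region
    out[y0:y0 + new_h, x0:x0 + new_w] = blended.astype(background.dtype)

    return out, (x0, y0, x0 + new_w, y0 + new_h)


# Dummy data: a black "site image" and a fully opaque white "object".
rng = np.random.default_rng(0)
background = np.zeros((240, 320, 3), dtype=np.uint8)
obj = np.full((40, 60, 4), 255, dtype=np.uint8)
image, bbox = composite_object(background, obj, rng)
```

Because the object is placed programmatically, the bounding box is known exactly, so no manual annotation is needed. Rotation, posture, and camera-angle variation would require a 3D renderer such as the game engine described above and are outside the scope of this sketch.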