Abstract

Contact-driven accidents involving actuated excavators have led to a significant number of fatalities in the construction industry. The revolving, articulated mechanical arm of an excavator poses a major risk of contact-driven accidents for workers in its proximity. Detecting the 3D pose of excavator arms is thus essential to prevent contact-driven accidents near excavators. Previous works have attempted to estimate 3D excavator poses using sensor-based or computer vision-based methods. However, existing methods require extensive preparation work, such as attaching physical sensors, calibrating stereo cameras, or collecting 3D training data. As a result, they cannot be easily integrated into the current construction workflow and are seldom applied in real-world situations. The authors propose a projection-based 3D pose optimization method that uses excavator kinematic constraints to infer 3D excavator poses from monocular image sequences with no dependency on 3D training data. The proposed method first extracts the 2D excavator pose from images using a keypoint region-based convolutional neural network. Then, the 2D pose is reconstructed into 3D by enforcing the rigid excavator kinematic constraints (e.g., arm length) and minimizing the 2D reprojection error of the excavator pose. Tests using a 1:14 miniature excavator model showed a 3D position error of 7.3 cm (or 1.03 m when scaled up to real-world dimensions) for keypoints on the excavator pose, demonstrating the capability of the proposed method to estimate 3D excavator poses from monocular images. The proximity-measuring capacity of the estimated 3D pose was also evaluated, achieving a mean absolute distance error of 4.7 cm (or 0.66 m scaled). The proposed method thus estimates 3D excavator poses using only a monocular camera and without relying on 3D training data.
The estimated 3D excavator pose enables safety managers to monitor potential contact-driven accidents near excavators and alert workers to unsafe situations, promoting safer working environments for construction workers near excavators.

Practical Applications

The authors present a monocular vision-based method for 3D excavator pose estimation, which can serve as the groundwork for monitoring contact-driven accidents near excavators. The method allows safety managers to monitor the 3D pose of an excavator using a single camera (e.g., smartphone, site surveillance camera, or action camera). Unlike previous methods, the proposed method does not require construction professionals to conduct challenging preparation work such as setting up multiple stereo cameras or collecting custom 3D excavator training datasets. For instance, for an excavator 21 m away, the proposed method can estimate the excavator’s 3D keypoint positions with an expected 3D position error of 1.03 m. Paired with other onsite information (e.g., utility line or worker location) collected from existing drawings or using other vision-based methods, the proposed method can measure the proximity between the excavator arm and surrounding objects with an expected error of 0.66 m, enabling proactive safety interventions (e.g., proximity alerts). Such safety interventions can make it safer for workers to conduct construction work near excavators. Though originally devised for excavators, the proposed method can be further adapted to monitor the 3D pose of other construction equipment such as backhoes or crawler cranes, enabling a wider range of safety applications for monitoring contact-driven accidents.
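The projection-based optimization summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation, which the text above does not spell out. It assumes a simple pinhole camera with known intrinsics, an arm modeled as a chain of keypoints with known link lengths, and a cost that adds the squared 2D reprojection error to a squared link-length penalty. All names (`project`, `pose_cost`, `weight`) and all numeric values are hypothetical, and the solver that would minimize the cost is omitted.

```python
import math

def project(point, focal, cx, cy):
    """Pinhole projection of a camera-frame 3D point (x, y, z) to pixels."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)

def pose_cost(pose_3d, keypoints_2d, link_lengths, focal, cx, cy, weight=1.0):
    """Squared reprojection error plus a penalty on link-length violations.

    pose_3d:      candidate (x, y, z) keypoints along the arm, camera frame
    keypoints_2d: detected (u, v) pixel keypoints, in the same order
    link_lengths: known distances between consecutive keypoints
    weight:       assumed trade-off between the two terms (hypothetical)
    """
    reprojection = 0.0
    for point, (u, v) in zip(pose_3d, keypoints_2d):
        pu, pv = project(point, focal, cx, cy)
        reprojection += (pu - u) ** 2 + (pv - v) ** 2

    kinematic = 0.0
    for a, b, length in zip(pose_3d, pose_3d[1:], link_lengths):
        kinematic += (math.dist(a, b) - length) ** 2

    return reprojection + weight * kinematic

# Illustrative values only (assumed intrinsics and a made-up three-keypoint arm).
FOCAL, CX, CY = 1000.0, 640.0, 360.0
true_pose = [(0.0, 0.0, 1.5), (0.3, -0.2, 1.5), (0.6, 0.0, 1.6)]
lengths = [math.dist(a, b) for a, b in zip(true_pose, true_pose[1:])]
detections = [project(p, FOCAL, CX, CY) for p in true_pose]

# Scaling the pose away from the camera leaves its 2D projection unchanged,
# so only the link-length term can tell the two candidates apart.
scaled_pose = [(1.2 * x, 1.2 * y, 1.2 * z) for x, y, z in true_pose]

cost_true = pose_cost(true_pose, detections, lengths, FOCAL, CX, CY)
cost_scaled = pose_cost(scaled_pose, detections, lengths, FOCAL, CX, CY)
```

In this toy setup the correctly scaled pose has a near-zero cost, while the depth-scaled pose is penalized only through the link-length term. This shows why rigid kinematic constraints such as arm length can resolve the depth ambiguity of a single camera.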