Abstract

Advances in sensors, robotics, and artificial intelligence have enabled a range of methods, such as simultaneous localization and mapping (SLAM), semantic segmentation, and point cloud registration, to assist the reality capture process. Fully investigating an unknown indoor space, which involves obtaining both a general spatial understanding and a detailed scene reconstruction for a digital twin model, requires deeper insight into the characteristics of different ranging sensors, as well as techniques for combining data from distinct systems. This paper discusses the necessity and workflow of using two distinct types of ranging sensors, a depth camera and a light detection and ranging (LiDAR) sensor, paired with a quadrupedal ground robot to acquire spatial data of a large, complex indoor space. A digital twin model was built in real time with two SLAM methods and then consolidated using fast point feature histograms (FPFH) for geometric feature extraction and fast global registration for alignment. Finally, the reconstructed scene was streamed to a HoloLens 2 headset to create the illusion of seeing through walls. Results showed that both the depth camera and the LiDAR could capture a large space with the required coverage and fidelity, including textural information. The proposed workflow and analytical pipeline thus provide a hierarchical data fusion strategy that integrates the advantages of distinct sensing methods to carry out a complete indoor investigation, and they validate the feasibility of robot-assisted reality capture in larger spaces.