Abstract

With the emergence of the smart city, there is a growing need for scalable methods that sense how humans interact with and use infrastructure, so that social behaviors relevant to designing sustainable and resilient built environments can be modeled. Cyber-physical system (CPS) frameworks used to monitor and automate infrastructure systems in smart cities can be extended to sense people and thereby better understand how they use infrastructure systems, including social infrastructure (e.g., parks, markets). This paper adopts convolutional neural network (CNN) architectures to automate the detection and spatiotemporal mapping of people from camera data, forming a cyber-physical-social system (CPSS) for smart cities. The Mask region-based convolutional neural network (Mask R-CNN) detector was adopted and tailored to identify and segment human subjects in camera images in real time, at an average speed of 7 frames per second. The Mask R-CNN framework was trained end to end on the Objects in Public Open Spaces (OPOS) image data set, which includes classified segmentations of people in public spaces. A two-dimensional to three-dimensional (2D-3D) lifting algorithm based on a monocular camera calibration model was also employed to accurately position detected people in space. Finally, a Hungarian assignment algorithm based on association metrics extracted from detected people was used to assign people to spatiotemporal trajectories. To demonstrate the proposed framework, this study used the Detroit riverfront parks to examine how people utilize community parks, which are a form of social infrastructure. The Mask R-CNN detector is shown to be precise in detecting and classifying the behavior of people in parks, with mean average precision well above 85% for all class types defined in the OPOS library. The framework is also shown to be effective in spatially mapping the various uses of park furnishings, which can support better management of parks.
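The 2D-3D lifting step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal pinhole camera without lens distortion, a flat ground plane at Z = 0, and that a person touches the ground at the bottom-center pixel of their bounding box. The function names (`tilt_rotation`, `project`, `lift_to_ground`, `foot_pixel`) and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def tilt_rotation(tilt_rad):
    """World-to-camera rotation for a camera facing world +Y, pitched down by tilt_rad.

    Assumed frames: world Z is up; camera x is right, y is down in the image,
    z is the optical axis.
    """
    s, c = math.sin(tilt_rad), math.cos(tilt_rad)
    return [[1.0, 0.0, 0.0],
            [0.0, -s, -c],
            [0.0, c, -s]]

def project(point_w, cam_pos, R, f, cx, cy):
    """Forward pinhole projection of a 3D world point to a pixel (u, v)."""
    p = [point_w[i] - cam_pos[i] for i in range(3)]
    xc, yc, zc = (sum(R[r][i] * p[i] for i in range(3)) for r in range(3))
    return f * xc / zc + cx, f * yc / zc + cy

def lift_to_ground(u, v, cam_pos, R, f, cx, cy):
    """Back-project pixel (u, v) onto the ground plane Z = 0.

    Returns (X, Y) in world units, or None if the viewing ray never
    reaches the ground (pixel at or above the horizon).
    """
    ray_c = [(u - cx) / f, (v - cy) / f, 1.0]
    # Rotate the ray into the world frame; the transpose inverts a rotation.
    ray_w = [sum(R[r][i] * ray_c[r] for r in range(3)) for i in range(3)]
    if ray_w[2] >= 0.0:
        return None
    scale = -cam_pos[2] / ray_w[2]  # distance along the ray to reach Z = 0
    return cam_pos[0] + scale * ray_w[0], cam_pos[1] + scale * ray_w[1]

def foot_pixel(box):
    """Bottom-center of a box (x1, y1, x2, y2): the assumed ground contact point."""
    x1, _, x2, y2 = box
    return (x1 + x2) / 2.0, y2

# Hypothetical setup: camera 6 m above the ground, pitched down 30 degrees.
CAM_POS = (0.0, 0.0, 6.0)
R = tilt_rotation(math.radians(30.0))
F, CX, CY = 1200.0, 960.0, 540.0

# Round trip: project a known ground point, then lift the pixel back.
u, v = project((2.0, 10.0, 0.0), CAM_POS, R, F, CX, CY)
print(lift_to_ground(u, v, CAM_POS, R, F, CX, CY))
```

In a full pipeline, the pixel passed to `lift_to_ground` would come from each detected person (here via `foot_pixel`), and the resulting ground coordinates would feed the trajectory assignment step. A calibrated lens distortion model would need to be applied before the back-projection.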
