Abstract

With the increase in public concern about the aging of urban infrastructure and the associated risk of safety accidents, it is important to maintain the safety and serviceability of urban infrastructure in accordance with user satisfaction. Although many studies have attempted to consider user experience and satisfaction based on user surveys and civil complaint data analysis, they have had difficulty in identifying user dissatisfaction factors where users feel unsafe or uncomfortable while using the infrastructure. The main purpose of the research presented here is to understand user experience and satisfaction with urban infrastructure by text mining self-written civil complaint data. To achieve this objective, the researchers adopted the following procedures: (1) development of a civil complaint thesaurus for the text mining of civil complaint data; (2) text preprocessing of civil complaint data by using the thesaurus; and (3) keyword extraction and recognition of the relationships between the keywords to explore user-experience factors related to urban infrastructure. The research team used 2,945 bridge complaint data records and 404 tunnel complaint data records in text format from the Korean Safety e-Report database. From the collected data, the researchers developed a civil complaint thesaurus with 47 semantic relationships between words, such as Korean compound words, synonyms, and hypernym–hyponyms. As a result of keyword extraction, “breakage,” “accident,” and “road” for bridge complaints, and “entrance,” “accident,” and “breakage” for tunnel complaints were the selected words representing user experiences, and were visualized in a tag cloud. Also, critical user-experience factors such as unsafe or uncomfortable situations on bridge roads (e.g., “breakage,” “construction,” and “pothole”), and dissatisfaction factors at tunnel entrances (e.g., “streetlight,” “view,” and “sign”) were explored using semantic network analysis.
The outcome of this research will contribute to identifying user-experience factors from civil complaint data and improving the safety and serviceability of urban infrastructure by considering user experience and satisfaction in infrastructure maintenance practices.

Introduction

The safety and serviceability of urban infrastructures are essential conditions for the quality of public life and a safe urban system. As urban infrastructures age and fatal safety accidents related to aging infrastructures occur all over the world, public concerns about urban infrastructures are increasing (ASCE 2021; Ellingwood 2010). For example, in Korea, more than 75% of the public in 2018 perceived the safety and service level of the country’s infrastructure as below average (Statistics Korea 2019). To improve the public’s comfort and satisfaction with urban infrastructures, it is important to maintain the condition and service quality of these infrastructures in accordance with user experience and satisfaction. Here, the infrastructure user means the consumer and the beneficiary of urban infrastructure services.

To maintain the safety and serviceability of urban infrastructures, as part of their maintenance practices, managers periodically inspect structural and visual damage to infrastructure according to their guidelines and manuals for inspection and diagnosis, and conduct repairs, reinforcement, and rehabilitation based on the inspection results (Chang and Chi 2019; FHWA 2020; Kobayashi and Kaito 2016; Lim and Chi 2019; MOLIT 2021). The manager’s perspective, however, may not match user experience and satisfaction. In determining the rehabilitation demand for road pavements, for instance, results of physical pavement roughness measures that are acceptable to the manager may not satisfy a user’s perception of the pavement’s roughness; the user may have higher or different expectations (Shafizadeh and Mannering 2006).
Similarly, users may, for example, have concerns about or experience discomfort with the lighting or shape of bridge railings, which may cause safety accidents on the bridge (MOIS 2018), but inspection manuals in Korea do not stipulate that managers must check these factors.

Many studies have attempted to incorporate user experiences derived from the results of user surveys into infrastructure maintenance practices (Abdul-Rahman et al. 2015; Gopikrishnan and Paul 2017; Kang and Lee 2012; Shafizadeh and Mannering 2006); however, these studies have considered user satisfaction only for the issues stipulated in a questionnaire that was designed from a manager’s perspective. Other researchers have tried to analyze civil complaint data written by users to explore user experience and dissatisfaction factors for improving service quality and user satisfaction for urban infrastructure, such as a water distribution system, a metro system, and a building indoor environment (Drake and Zechman 2012; Haider et al. 2016; Park et al. 2015; Teng et al. 2018; Villeneuve and O’Brien 2020). Drake and Zechman (2012), for example, attempted to analyze user complaints regarding a water distribution system by exploring the location and time of complaints. Despite these efforts, previous approaches have been limited in mining the contents of the civil complaint data to determine which factors cause users to feel unsafe or uncomfortable.

Thus, the primary objective of the research presented in this work is to understand user experience and satisfaction with urban infrastructure by text mining self-written civil complaint data.
To accomplish this objective, this study comprised the following steps: (1) developing a civil complaint thesaurus to facilitate the text mining of civil complaint data; (2) preprocessing civil complaint data in text format by using the thesaurus; and (3) extracting keywords and identifying the relationships between the keywords to explore user-experience factors in urban infrastructure. The user-experience factors refer to instances where users feel unsafe or uncomfortable while using the infrastructure. The outcome of this research will contribute to identifying user-experience factors from civil complaint data and improving the safety and serviceability of urban infrastructure by considering user experience and satisfaction in infrastructure maintenance practices. The research team used civil complaint data from the Korean Safety e-Report database, which is managed by the Ministry of the Interior and Safety (MOIS) in Korea. Because the public directly reports risk situations in the field, including urban infrastructure risks, to the Korean Safety e-Report through mobile devices, the complaint data collected from the database can provide invaluable data for this study.

This article is organized into five sections. After the introduction section, the Literature Review section discusses related work that considers user satisfaction with urban infrastructure and reviews text mining approaches in the construction domain. The Research Methodology section introduces the data collection, development of the civil complaint thesaurus, text preprocessing, and exploration of user-experience factors. Results of the text mining are discussed in the Results and Discussion section.
Finally, we present conclusions, research contributions, and recommendations for future research.

Literature Review

Related Works Considering User Satisfaction with Urban Infrastructure

To consider user satisfaction with infrastructure maintenance, many studies have been conducted to analyze user opinions on each type of urban infrastructure and apply the results of analyses to improve the safety and serviceability of urban infrastructure. Abdul-Rahman et al. (2015) identified building performance requirements that could improve user satisfaction with building facilities maintenance through user surveys. Shafizadeh and Mannering (2006) collected pavement roughness data on a scale of 1 to 5 via a user survey and developed a user perception model for the roughness from the data in order to take into account users’ perspectives in determining the demand for road pavement rehabilitation in a highway system. A user survey was also used to capture user satisfaction for establishing a bicycle level-of-service model (Kang and Lee 2012). These studies, however, considered user satisfaction only with the main issues designated by the manager.

On the other hand, there have been efforts to analyze civil complaint data to identify user experience and dissatisfaction information to improve the service quality of urban infrastructure. Haider et al. (2016) used user complaint data for a water supply system to identify the distribution problems and seriousness of user complaints, and the risks leading to user dissatisfaction, to enhance the reliability of the system. Teng et al. (2018) also analyzed metro complaint data to summarize civil complaints according to their location, time, and category. In spite of these attempts to consider civil complaint data, it is still difficult to directly identify user dissatisfaction factors that indicate feeling unsafe or uncomfortable while using urban infrastructure.
Therefore, this study will improve our understanding of user experience and satisfaction by discovering user-experience factors from the civil complaints data in a self-written text format.

Review of Text Mining Approaches in the Construction Domain

Text mining is defined as the process of extracting meaningful information and contexts from unstructured data in text format (Baker et al. 2020; Manning et al. 2008; Zhang et al. 2019a). It aims to solve problems such as information extraction, text categorization, text summarization, and information retrieval (Miner et al. 2012). For these purposes, many researchers have analyzed large amounts of text data by utilizing the automated techniques of text mining, including keyword extraction, word network analysis, topic modeling, opinion mining, and sentiment analysis, in various domains such as business, health science, and education (He 2013; He et al. 2013; Jung and Lee 2020).

In the construction domain, text mining has also been conducted to extract meaningful information, to classify text, and to discover interesting patterns or trends from construction management documents, accident reports, contractual documents, public opinions or complaints, and others. At the document level, construction project documents were classified based on the key project components of the documents (Caldas and Soibelman 2003) and the clustering results from the documents’ textual similarities (Al Qady and Kandil 2014). Moon et al. (2018) also tried to ascertain the issues of the global construction market using keyword extraction and visualization from text data related to the global construction market.

In particular, some studies have recently attempted to analyze public opinions or complaints related to infrastructure management and governance. Zhong et al.
(2019) labeled building quality complaint (BQC) text data according to 12 complaint subjects (e.g., leakage, hollowing or cracking, and construction impact) and developed a convolutional neural network (CNN)–based approach to automatically classify the BQC documents according to these subjects. Villeneuve and O’Brien (2020) conducted text mining of Airbnb reviews to explore indoor environmental quality (IEQ). Seasonal trends and causes of the IEQ-related complaints were discovered using keyword extraction along with term frequencies (TFs), and sentiment analysis was then performed to understand the sentimental characteristics of the complaints. Zhou et al. (2021) analyzed public opinions of infrastructure megaprojects extracted from social media platforms by utilizing topic modeling and sentiment analysis to understand the major topics and perceptions that potential users consider for megaprojects. Similarly, public opinions concerning the metro system were used for topic modeling and sentiment analysis (Zhang et al. 2019b). These studies have succeeded in identifying major topics of construction-domain documents, particularly public issues or sentiments about urban infrastructure that are evident from public opinions or complaints. However, these results could not concretely explain the areas in which users feel unsafe or uncomfortable while using the infrastructure, which are factors that are crucial to urban infrastructure maintenance. Therefore, it is necessary to extract detailed information at the level of sentences and words to explore user-experience factors.

On the other hand, many studies have attempted to analyze construction-domain text for information extraction and text classification at the level of the sentence or word. Zhang and El-Gohary (2013) tried to extract specific requirements from construction regulatory documents by applying information extraction rules made up of syntactic and semantic text features. Tixier et al.
(2016) analyzed construction injury reports to extract energy, injury, and body types based on rule-based approaches. In addition, some studies have verified that machine learning algorithms show powerful performance in extracting several predefined kinds of information from construction-domain text (Goh and Ubeynarayana 2017; Hassan and Le 2020; Kim and Chi 2019; Moon et al. 2020, 2021; Salama et al. 2013; Wu et al. 2021). For example, Kim and Chi (2019) identified hazard objects, hazard positions, work processes, and accident results from construction accident cases by using a conditional random field as well as semantic rules. Hassan and Le (2020) tried to automatically identify requirements from construction contract documents using naïve Bayes, support vector machines, logistic regression, and a feedforward neural network. Moon et al. (2020) extracted bridge damage factors (i.e., element, damage, and cause) from bridge inspection reports based on a recurrent neural network. These approaches, however, were restricted to a few information categories determined by the researchers; thus, they do not align well with the objective of the present study, which is to explore unexpected user-experience factors (e.g., object, state, time, location, and cause) to understand user experience and satisfaction with urban infrastructure. In addition, the aforementioned studies using machine learning algorithms required a large amount of data for model training, but the target data in this study (i.e., the civil complaints data in a self-written text format) were insufficient to fulfill this requirement.

To accomplish the objective of this study, we collected civil complaint data that described the areas where users felt unsafe or uncomfortable while using infrastructure, and we preprocessed the data to a level appropriate for analysis.
Then, we used two simple and intuitive text-mining techniques: keyword extraction based on TF calculation to extract major dissatisfaction factors, and semantic network analysis (SNA), also known as word network analysis, to identify meaningful relationships among the keywords. A detailed explanation of these techniques is presented in the following section.

Research Methodology

Fig. 1 illustrates the proposed research methodology. As shown in the figure, the methodology is organized into four main stages. The first stage of the research was to collect civil complaint data in text format from the Korean Safety e-Report. Second, a civil complaint thesaurus was developed to facilitate text mining of the collected data. Third, using the thesaurus, the collected text data were preprocessed in four steps. The detailed text preprocessing implemented in this research is discussed in the following sections. Last, to explore the user-experience factors, keywords were extracted from the preprocessed complaint data by conducting a TF calculation, and the keywords were visualized in the form of a tag cloud. The research team then identified the relationships among the keywords by using SNA, and plotted the relationships into a network graph. The research methodology was developed and implemented using Python version 3.6.8.

Data Collection

To conduct this study, 2,945 bridge complaint data records and 404 tunnel complaint data records were collected from the Korean Safety e-Report database from 2017 to 2018.
Bridges and tunnels are representative urban infrastructures used by many people in Korea, as in other countries, and users reported diverse types of civil complaints, including risk situations (e.g., risk of traffic accidents due to potholes in a bridge ramp and the risk of pedestrian accidents from tiles falling off the outer wall of a tunnel), structural factors (e.g., exposed steel of a bridge expansion-joint and a crack in a bridge deck), and uncomfortable factors (e.g., dim lighting inside the tunnel and the need to clean around drainage areas).

As shown in Table 1, the collected civil complaint data are listed in text format for both infrastructure types. A total of 38,860 and 5,544 space-separated words are identified in the 2,945 bridge complaint records and the 404 tunnel complaint records, respectively; on average, one complaint record consists of 13.3 space-separated words. More specifically, the civil complaint data include terms representing objects or states of dissatisfaction with urban infrastructures (e.g., “a pothole in the entrance road,” “the exposed steel of the bridge substructure,” “the tiles dropping off,” and “the cracks in the road surface”), among which there are some domain-specific terms related to construction or infrastructure maintenance (e.g., “pothole,” “exposed-steel,” “pier,” and “expansion-joint”). The data also include general terms expressing user feelings such as discomfort and dissatisfaction or requesting maintenance actions (e.g., “dangerous,” “risk,” “dark,” “necessary,” and “action”), although the terms have little relation to the dissatisfaction factors.

Table 1.
Examples of the collected civil complaint data

| Category | Raw data (English translation) |
|---|---|
| Bridge complaints | There is a pothole in the entrance road of the bridge, which is dangerous to traffic. |
| | It is necessary to repair the exposed steel of the bridge substructure. |
| | The bridge drainage is blocked by soil and dirt. |
| Tunnel complaints | There is a risk of a pedestrian safety accident as the tiles on the outer wall of the tunnel are dropping off. |
| | The cracks in the road surface in the tunnel are progressing seriously. |
| | It is dark and dangerous since the entrance lighting is off at night. |

Development of a Civil Complaint Thesaurus

A thesaurus is a dictionary that defines the semantic relationships among terms, including synonyms, hypernyms, and hyponyms. It has been used as an information retrieval method to extend queries and resolve query inconsistencies, and can also be applied in text preprocessing for other text mining techniques, such as keyword analysis and text classification (Bang et al. 2006; Kim and Chi 2019; Xu and Yu 2010). In this research, when users reported their civil complaints via the Korean Safety e-Report, they could use different expressions to represent the same or almost the same meaning, because there is no standard or set format for the reports. For instance, terms could be represented using synonyms such as “expansion-joint or joint” and “pothole or sinkhole,” and some hypernyms–hyponyms such as “heavy-vehicle–dump-truck.” Thus, the thesaurus helps to replace semantically similar terms with a single representative word, reducing the number of words used for analysis.

The civil complaint thesaurus proposed in this study was constructed to define the semantic relationships between the terms in civil complaints by scrutinizing the collected data in detail based on well-structured documents and utilizing expert interviews.
In general, synonym, hypernym–hyponym, and abbreviation relationships between common terms were classified by referring to the Standard Korean Language Dictionary, which is distributed by the National Institute of the Korean Language (NIKL) in Korea (NIKL 2020b), and the most-used terms in the collected data were selected as representative words. Some terms that have almost the same meaning in the context of the civil complaint data records (e.g., “pothole,” “hole,” and “dent” in the road) were included as synonyms even if they are not synonymous as defined by this dictionary. In the case of domain-specific terms in the construction sector or infrastructure maintenance field (e.g., “expansion-joint,” “exposed-steel,” and “bearing”), synonym and hypernym–hyponym relationships were distinguished based on the definitions of domain-specific terms in the Korea Construction Standard Glossary (MOLIT 2020) and the Guidelines for Maintenance and Performance Assessments (MOLIT 2019). Also, the original terms for foreign languages written in Korean (e.g., deck 데크/덱, slab 슬래브, and grating 그레이팅) were identified through the Korean loanword orthography (NIKL 2020a) and replaced with representative words with the same meaning. Finally, the constructed thesaurus was verified by experts who have a profound knowledge of infrastructure inspection and maintenance terminology.

Text Preprocessing

The aim of text preprocessing in this study was to ensure that the collected data in text format comprised only meaningful terms for analysis. The main steps included cleaning and normalization, tokenization, part-of-speech (POS) tagging and noun extraction, and stopwords removal based on general text mining procedures (Manning et al. 2008; Moon et al. 2021; Weiss et al. 2005). First, in the data-cleaning step, the researchers eliminated noisy data that did not affect the analysis results, such as punctuation marks (e.g., “!,” “?,” and “-”) and index numbers.
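
The cleaning step, together with the thesaurus-based replacement of variant terms by representative words, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `THESAURUS` entries are a hypothetical English excerpt modeled on the examples in the text (the actual thesaurus is Korean and holds 47 relationships), and the function names `clean` and `normalize` are assumptions.

```python
import re

# Hypothetical thesaurus excerpt: variant term -> representative word.
# English stand-ins for illustration only; the real thesaurus is Korean.
THESAURUS = {
    "expansion joint": "expansion-joint",
    "joint": "expansion-joint",
    "exposed steel": "exposed-steel",
    "sinkhole": "pothole",
    "dump trucks": "heavy-vehicle",
    "dump truck": "heavy-vehicle",
}

# One alternation, longest variants first, so that "expansion joint" is
# matched as a whole before the shorter variant "joint" is tried.
_VARIANTS = re.compile(
    r"\b(?:"
    + "|".join(re.escape(v) for v in sorted(THESAURUS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def clean(text):
    """Remove a leading index number and punctuation marks."""
    text = re.sub(r"^\s*\(?\d+[.)]\s*", "", text)  # e.g. "1. " or "(2) "
    text = re.sub(r"[!?\-.,]", " ", text)          # punctuation -> space
    return re.sub(r"\s+", " ", text).strip()       # collapse whitespace


def normalize(text):
    """Replace each variant term with its representative word in one pass."""
    return _VARIANTS.sub(lambda m: THESAURUS[m.group(0).lower()], text)


raw = "1. The exposed steel of the bridge joint was caused by dump trucks!"
normalized = normalize(clean(raw))
```

Replacing all variants in a single regular-expression pass avoids re-scanning text that has already been substituted, so a representative word such as "expansion-joint" is not matched again by the shorter variant "joint".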
After that, terms with the same or similar meaning were normalized to a single representative word in accordance with the previously defined civil complaint thesaurus. For instance, as shown in Fig. 2, in the first step, the sentence “The exposed steel of the bridge joint was caused by dump trucks!” would be replaced with the sentence “The exposed-steel of the bridge expansion-joint was caused by heavy-vehicle.”

Second, the normalized data were tokenized into semantic words to explore user-experience factors. Tokenization involves parsing every sentence into individual terms, called tokens (Moon et al. 2020). Then, in the next step, the research team extracted nouns from the tokenized data, as nouns in Korean commonly represent the critical information concerning users’ experiences and satisfaction (e.g., deck, expansion-joint, breakage, and pothole). For this purpose, every token was tagged with its POS, and only the tokens with a noun tag were extracted. To conduct these two sequential steps, this study utilized the Komoran POS tagger (Shin 2014) provided in the KoNLPy Python package (Park 2014), which is widely used for Korean tokenization. The Komoran POS tagger implements the tokenization and POS tagging process based on a dictionary (i.e., Korean token and POS for each term) predefined by researchers. The research team updated the predefined dictionary in accordance with the developed civil complaint thesaurus. Therefore, each word defined in the thesaurus could be recognized as a single token. For example, the earlier sentence, “The exposed-steel of the bridge expansion-joint was caused by heavy-vehicle” would be tokenized into 10 words: “the,” “exposed-steel,” “of,” “the,” “bridge,” “expansion-joint,” “was,” “caused,” “by,” and “heavy-vehicle.” Among them, only four nouns (i.e., “exposed-steel,” “bridge,” “expansion-joint,” and “heavy-vehicle”) would be extracted (Fig.
2).

Finally, stopwords removal involves dropping extremely common terms with little analysis value (Zou et al. 2017). A stopwords list is generally organized by sorting the most frequent terms and manually excluding informative terms in the specific text. The stopwords list of the collected civil complaint data included the names of infrastructure types (e.g., “bridge” and “tunnel”) and words less related to dissatisfaction objects or states (e.g., “risk,” “safety,” “action,” and “need”). Taking the previous example, the term “bridge” included in the stopwords list was deleted from the four nouns extracted; that is, after these text preprocessing steps, the original complaint sentence, “The exposed steel of the bridge joint was caused by dump trucks!” was eventually preprocessed into three words: “exposed-steel,” “expansion-joint,” and “heavy-vehicle,” as shown in Fig. 2.

Exploration of User-Experience Factors

Keyword Extraction

Keyword extraction, also known as keyword analysis, is a text mining technique used to identify the most important words and features in text data. It is a basic process to understand text data and plays an essential role in text mining problems, such as information extraction, text categorization, text summarization, and information retrieval (Berry and Kogan 2010). In this research, keywords that represent user-experience factors were extracted from the preprocessed complaint data for each type of infrastructure based on TF calculation. TF, which is one of the traditional methods of keyword extraction, is an indicator of how many times the word appears in the text data. It has been popular for extracting key features because of its simple and intuitive calculation process (Baek et al. 2021; Moon et al. 2018; Villeneuve and O’Brien 2020; Weiss et al. 2005).

To effectively show the results of keyword extraction, the research team then visualized the keywords in a tag cloud.
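
As a rough illustration of the stopwords removal and TF calculation described above, the following sketch counts how often each remaining noun appears across complaint records. It assumes that noun extraction has already been done upstream (the study used the Komoran tagger for this), so each record is simply a list of nouns. The stopwords shown are the examples named in the text, while the second and third sample records, the variable names, and the helper functions are hypothetical.

```python
from collections import Counter

# Stopwords taken from the examples given in the text; the full list used
# in the study is not reproduced here.
STOPWORDS = {"bridge", "tunnel", "risk", "safety", "action", "need"}


def remove_stopwords(nouns):
    """Keep only the nouns that are not in the stopwords list."""
    return [w for w in nouns if w not in STOPWORDS]


def term_frequencies(records):
    """TF: total number of times each word appears over all records."""
    tf = Counter()
    for nouns in records:
        tf.update(remove_stopwords(nouns))
    return tf


# Each record is a list of nouns already extracted by a POS tagger.
# The first record follows the Fig. 2 example; the others are made up.
records = [
    ["exposed-steel", "bridge", "expansion-joint", "heavy-vehicle"],
    ["bridge", "road", "pothole", "risk"],
    ["road", "breakage", "accident"],
]

tf = term_frequencies(records)
keywords = tf.most_common(3)  # highest-TF words are keyword candidates
```

The resulting word-to-count mapping is also the kind of input a tag cloud needs, since each word's font size is scaled by its TF.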
A tag cloud is a common display format that provides an intuitive overview of text data, depicting words arranged in space and varied in size, color, and position based on word frequency, categorization, and significance (Sun et al. 2020). In this research, the keywords with a higher TF were displayed in larger fonts.

Relationship Recognition between Keywords

In the next process, SNA was conducted to recognize the relationships between the extracted keywords. SNA is a technique used to automatically discover and visualize semantic networks based on unstructured data. The semantic network refers to domain-specific knowledge that represents semantic relations between concepts in a network. The concepts are described as the network’s “nodes,” and the relations between concepts are described as the network’s “edges” (Drieger 2013; Lehmann 1992; Richards and Barnett 1993). That is, SNA can identify the information and knowledge in a specific field by exploring the nodes and edges that make up a semantic network. In particular, in the case of text data, which is a type of unstructured data, the text is composed of words, and the knowledge obtained from the text is a network structure formed by the relationships between the words. Under this conception, SNA for text (referred to as word network analysis) is utilized to discover the semantic relations between words (i.e., concepts in SNA) and to build a structure of the text network (Popping 2003; Yoo et al. 2019). In a text network, a node corresponds to a word, and an edge indicates the relationship between words (Jung and Lee 2020).

The semantic network in text is commonly based on a co-occurrence relationship between words. Co-occurrence is defined as the simultaneous appearance of words in a sentence, paragraph, or text (Fariña García et al. 2021).
In this research, co-occurrence between the $i$th word ($i = 1, \ldots, n$) and the $j$th word ($j = 1, \ldots, n$), $\mathrm{Co}(w_i, w_j)$ with $i \neq j$, is calculated as

$$\mathrm{Co}(w_i, w_j) = \sum_{k=1}^{c} \left[ \mathrm{num}(w_i \mid d_k) \times \mathrm{num}(w_j \mid d_k) \right] \tag{1}$$

where $\mathrm{num}(w \mid d_k)$ = how many times word $w$ appears in the $k$th civil complaint data record, $d_k$; $n$ = number of distinct words appearing in the entire civil complaint data (i.e., the number of nodes in a network); and $c$ = total number of civil complaint data records. We conducted SNA based on co-occurrences between keywords and visualized the structure of the word co-occurrence network to recognize the relationship between the extracted keywords.

To understand the characteristics of the semantic network, density and centrality can be calculated. The density of network $N$, $\mathrm{density}(N)$, is defined as the ratio of the number of relations between the nodes (i.e., the number of edges) to the total number of possible relations, as follows:

$$\mathrm{density}(N) = \frac{e(N)}{n(n-1)/2} \tag{2}$$

where $e(N)$ = number of edges in the network $N$; and $n$ = number of nodes in the network $N$. Networks with high density allow for active sharing between the nodes and a rapid spread of information through the entire network. Centrality is an indicator that represents the extent to which a node is located at the center of the entire network. Among the types of centrality, degree centrality is based on the number of nodes to which one node is directly connected, so a node that is related to a large number of other nodes has a high degree centrality (Jeon and Kim 2020). In this research, the degree centrality of the $i$th node (i.e., the $i$th word), $\mathrm{DC}(\mathrm{node}_i)$, and the weighted-degree centrality of the $i$th node, $\mathrm{WDC}(\mathrm{node}_i)$, are calculated as

$$\mathrm{DC}(\mathrm{node}_i) = \frac{e(\mathrm{node}_i)}{n-1} \tag{3}$$

$$\mathrm{WDC}(\mathrm{node}_i) = \frac{\sum_{j=1}^{n} \mathrm{Co}(w_i, w_j)}{\sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{Co}(w_i, w_j)} \tag{4}$$

where $e(\mathrm{node}_i)$ means the number of edges directly connected to the $i$th node. We identified the coherence of networks by calculating the network density for each infrastructure type (i.e., bridge and tunnel).
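
The co-occurrence, density, and centrality measures above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' code, and it rests on two stated assumptions: the network is treated as undirected, so the number of possible relations is taken to be n(n − 1)/2, and degree centrality is normalized by n − 1. The sample records and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Eq. (1): Co(wi, wj) = sum over records of num(wi|dk) * num(wj|dk)."""
    co = Counter()
    for tokens in records:
        counts = Counter(tokens)
        # Unordered pairs of distinct words, stored under a sorted key.
        for wi, wj in combinations(sorted(counts), 2):
            co[(wi, wj)] += counts[wi] * counts[wj]
    return co


def density(co, n):
    """Edges present divided by possible edges (assumed undirected)."""
    return len(co) / (n * (n - 1) / 2)


def degree_centrality(co, vocab):
    """Edges touching each node, divided by n - 1 (assumed normalization)."""
    degree = Counter()
    for wi, wj in co:
        degree[wi] += 1
        degree[wj] += 1
    return {w: degree[w] / (len(vocab) - 1) for w in vocab}


def weighted_degree_centrality(co, vocab):
    """Eq. (4): a node's co-occurrence sum over the network-wide sum."""
    strength = Counter()
    for (wi, wj), value in co.items():
        strength[wi] += value
        strength[wj] += value
    total = sum(strength.values())  # double sum over i and j
    return {w: strength[w] / total for w in vocab}


# Hypothetical preprocessed records (lists of nouns after stopword removal).
records = [
    ["road", "pothole", "road"],
    ["road", "breakage"],
    ["entrance", "streetlight"],
]
vocab = sorted({w for r in records for w in r})
co = cooccurrence(records)
net_density = density(co, len(vocab))
dc = degree_centrality(co, vocab)
wdc = weighted_degree_centrality(co, vocab)
```

In this toy network, "road" co-occurs with two of the four other words, so it is the most central node under both measures, which mirrors how central words are used to locate user-experience factors in the larger networks.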
Also, the influential keywords with a high co-occurrence with other words were identified by extracting central words with a high degree centrality. The user-experience factors could be inferred from the word relationships connected to the central words.

Conclusions

This study aimed to understand user experience and satisfaction with urban infrastructure by applying text mining techniques to civil complaint data. The researchers first collected bridge and tunnel complaint data in text format. By scrutinizing the data in detail, they developed a civil complaint thesaurus with 47 semantic relationships between words, such as Korean compound words, synonyms, and hypernym–hyponyms. After the text preprocessing stage using the thesaurus, to explore user-experience factors, keywords for bridge and tunnel complaints were extracted based on the TF calculation (e.g., “breakage,” “accident,” and “road” for bridge complaints and “entrance,” “accident,” and “breakage” for tunnel complaints). The relationships between the keywords were then identified by utilizing SNA. From the resulting word networks, the objects of “breakage” on a bridge (e.g., “deck,” “railing,” and “pier”), unsafe or uncomfortable situations on the bridge road (e.g., “breakage,” “construction,” and “pothole”), and dissatisfaction factors at the tunnel “entrance” (e.g., “streetlight,” “view,” and “sign”) were explored. The practical applicability of these factors was verified through comparison with managers’ inspection guidelines for urban infrastructures.

This research offers several contributions to urban infrastructure maintenance. First, the civil complaint thesaurus can be used to improve model performance in further text mining studies using civil complaint data, and the 47 semantic relationships between words developed in the research can form a basis for the construction of a thesaurus specialized for infrastructure maintenance.
Second, the results of this research contribute to identifying user-experience factors from civil complaint data and improving the safety and serviceability of urban infrastructure by considering user experience and satisfaction in infrastructure maintenance practices. Specifically, as demonstrated in the discussion section, some of the derived user-experience factors were consistent with those in the existing inspection manuals used in practice, and others could be incorporated into the manuals so that managers can inspect users’ satisfaction periodically and thoroughly. Furthermore, if civil complaint data are analyzed in real time, causes of safety accidents or maintenance requirements that may be overlooked by managers or that appear between inspection cycles can be discovered and remedied before accidents occur. Finally, the research findings propose a new paradigm for the infrastructure maintenance domain. That is, existing practices have focused on infrastructure condition and deterioration of performance from the manager’s point of view; the new paradigm proposed by this research focuses on the experience and satisfaction of users, who are the most crucial stakeholders in the life cycle of urban infrastructure.

Further opportunities exist for improvements to enhance the analysis. These improvements could include verifying the proposed research methodology by applying it to civil complaint data for types of infrastructure other than bridges and tunnels and to data extracted from various online platforms. In addition, it may be possible to gather more information to help understand user experience and satisfaction by utilizing various text mining techniques. For example, topic modeling could be used to automatically derive major topics of civil complaints from a large volume of complaint data, and user-experience factors could be identified for each derived topic.
Users’ perceptions of each type of infrastructure could also be compared by applying sentiment analysis to civil complaint data for various types of infrastructures.

References

Abdul-Rahman, H., C. Wang, S. N. Kamaruzzaman, F. A. Mohd-Rahim, M. S. Mohd-Danuri, and K. Lee. 2015. “Case study of facility performance and user requirements in the University of Malaya research and development building.” J. Perform. Constr. Facil. 29 (5): 04014131. https://doi.org/10.1061/(ASCE)CF.1943-5509.0000629.
Baek, S., W. Jung, and S. H. Han. 2021. “A critical review of text-based research in construction: Data source, analysis method, and implications.” Autom. Constr. 132 (Dec): 103915. https://doi.org/10.1016/j.autcon.2021.103915.
Bang, S. L., J. D. Yang, and H. J. Yang. 2006. “Hierarchical document categorization with k-NN and concept-based thesauri.” Inf. Process. Manage. 42 (2): 387–406. https://doi.org/10.1016/j.ipm.2005.04.003.
Berry, M. W., and J. Kogan. 2010. Text mining: Applications and theory. Chichester, UK: Wiley.
Drake, K., and E. M. Zechman. 2012. “Using consumer complaints to characterize contamination events in a water distribution system.” In Proc., World Environmental and Water Resources Congress 2012, 2247–2252. Reston, VA: ASCE.
Ellingwood, B. R. 2010. “The role of structural aging in achieving life-cycle performance goals of civil infrastructure.” In Proc., 2010 Structures Congress, 2803–2808. Reston, VA: ASCE.
Fariña García, M. C., V. L. De Nicolás De Nicolás, J. L. Yagüe Blanco, and J. L. Fernández. 2021. “Semantic network analysis of sustainable development goals to quantitatively measure their interactions.” Environ. Dev. 37 (Mar): 100589. https://doi.org/10.1016/j.envdev.2020.100589.
Goh, Y. M., and C. U. Ubeynarayana. 2017. “Construction accident narrative classification: An evaluation of text mining techniques.” Accid. Anal. Prev. 108 (Nov): 122–130. https://doi.org/10.1016/j.aap.2017.08.026.
Gopikrishnan, C. S., and V. K. Paul. 2017. “Intervention strategy for enhanced user satisfaction based on user requirement related BPAs for government residential buildings.” In Proc., Int. Conf. on Sustainable Infrastructure 2017, 389–404. Reston, VA: ASCE.
Haider, H., R. Sadiq, and S. Tesfamariam. 2016. “Risk-based framework for improving customer satisfaction through system reliability in small-sized to medium-sized water utilities.” J. Manage. Eng. 32 (5): 04016008. https://doi.org/10.1061/(ASCE)ME.1943-5479.0000435.
Hassan, F. U., and T. Le. 2020. “Automated requirements identification from construction contract documents using natural language processing.” J. Leg. Aff. Dispute Resolut. Eng. Constr. 12 (2): 04520009. https://doi.org/10.1061/(ASCE)LA.1943-4170.0000379.
Jeon, S.-W., and J.-Y. Kim. 2020. “An exploration of the knowledge structure in studies on old people physical activities in Journal of Exercise Rehabilitation: By semantic network analysis.” J. Exercise Rehabil. 16 (1): 69–77. https://doi.org/10.12965/jer.2040010.005.
Jung, H., and B. G. Lee. 2020. “Research trends in text mining: Semantic network and main path analysis of selected journals.” Expert Syst. Appl. 162 (Dec): 113851. https://doi.org/10.1016/j.eswa.2020.113851.
Kobayashi, K., and K. Kaito. 2016. “Big data-based deterioration prediction models and infrastructure management: Towards assetmetrics.” Struct. Infrastruct. Eng. 13 (1): 84–93. https://doi.org/10.1080/15732479.2016.1198407.
Manning, C. D., P. Raghavan, and H. Schütze. 2008. Introduction to information retrieval. New York: Cambridge University Press.
Miner, G., J. Elder, A. Fast, T. Hill, R. Nisbet, and D. Delen. 2012. Practical text mining and statistical analysis for non-structured text data applications. Waltham, MA: Academic Press.
MOIS (Ministry of the Interior and Safety). 2018. Korean safety e-Report. Sejong, Korea: MOIS.
MOLIT (Ministry of Land Infrastructure and Transport). 2019. Guidelines for maintenance and performance assessments. Sejong, Korea: MOLIT.
MOLIT (Ministry of Land Infrastructure and Transport). 2020. Korea construction standard glossary. Sejong, Korea: MOLIT.
MOLIT (Ministry of Land Infrastructure and Transport). 2021. Special act on the safety control and maintenance of establishments. Sejong, Korea: MOLIT.
Moon, S., S. Chung, and S. Chi. 2020. “Bridge damage recognition from inspection reports using NER based on recurrent neural network with active learning.” J. Perform. Constr. Facil. 34 (6): 04020119. https://doi.org/10.1061/(ASCE)CF.1943-5509.0001530.
Moon, S., G. Lee, S. Chi, and H. Oh. 2021. “Automated construction specification review with named entity recognition using natural language processing.” J. Constr. Eng. Manage. 147 (1): 04020147. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001953.
Moon, S., Y. Shin, B. G. Hwang, and S. Chi. 2018. “Document management system using text mining for information acquisition of international construction.” KSCE J. Civ. Eng. 22 (12): 4791–4798. https://doi.org/10.1007/s12205-018-1528-y.
Park, L. 2014. “KoNLPy: Korean NLP in Python.” Accessed November 10, 2020. https://konlpy.org/.
Park, M., J. Kim, H. Kim, S. Do, S. Kim, and J. Shim. 2015. “Determinants of process setting target level of service through the analysis of complaints data caused by infra facilities.” J. Korean Soc. Hazard Mitigation 15 (3): 91–98. https://doi.org/10.9798/KOSHAM.2015.15.3.91.
Richards, W. D., and G. A. Barnett. 1993. Progress in communication sciences. Norwood, NJ: Ablex Publishing Corp.
Statistics Korea. 2019. A survey on public awareness of social safety. Daejeon, South Korea: Statistics Korea.
Sun, J., K. Lei, L. Cao, B. Zhong, Y. Wei, J. Li, and Z. Yang. 2020. “Text visualization for construction document information management.” Autom. Constr. 111 (Mar): 103048. https://doi.org/10.1016/j.autcon.2019.103048.
Teng, J., C.-P. Guo, J.-L. Li, Y.-Q. Chen, and X.-Z. Yang. 2018. “The method of analyzing metro complaint data and its application.” In Proc., 17th COTA Int. Conf. of Transportation Professionals, 2005–2016. Reston, VA: ASCE.
Tixier, A. J. P., M. R. Hallowell, B. Rajagopalan, and D. Bowman. 2016. “Automated content analysis for construction safety: A natural language processing system to extract precursors and outcomes from unstructured injury reports.” Autom. Constr. 62 (Feb): 45–56. https://doi.org/10.1016/j.autcon.2015.11.001.
Weiss, S. M., N. Indurkhya, T. Zhang, and F. Damerau. 2005. Text mining predictive methods for analyzing unstructured information. New York: Springer.
Wu, C., X. Wang, P. Wu, J. Wang, R. Jiang, M. Chen, and M. Swapan. 2021. “Hybrid deep learning model for automating constraint modelling in advanced working packaging.” Autom. Constr. 127 (Jul): 103733. https://doi.org/10.1016/j.autcon.2021.103733.
Yoo, M., S. Lee, and T. Ha. 2019. “Semantic network analysis for understanding user experiences of bipolar and depressive disorders on Reddit.” Inf. Process. Manage. 56 (4): 1565–1575. https://doi.org/10.1016/j.ipm.2018.10.001.
Zhang, F., H. Fleyeh, X. Wang, and M. Lu. 2019a. “Construction site accident analysis using text mining and natural language processing techniques.” Autom. Constr. 99 (Mar): 238–248. https://doi.org/10.1016/j.autcon.2018.12.016.
Zhang, J., and N. M. El-Gohary. 2013. “Semantic NLP-based information extraction from construction regulatory documents for automated compliance checking.” J. Comput. Civ. Eng. 30 (2): 04015014. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000346.
Zhang, Y., D. Li, and C. Li. 2019b. “Public transportation analysis based on social media data.” In Proc., 19th COTA Int. Conf. of Transportation Professionals, 1517–1529. Reston, VA: ASCE.
Zhong, B., X. Xing, P. Love, X. Wang, and H. Luo. 2019. “Convolutional neural network: Deep learning-based classification of building quality problems.” Adv. Eng. Inf. 40 (Apr): 46–57. https://doi.org/10.1016/j.aei.2019.02.009.
Zou, Y., A. Kiviniemi, and S. W. Jones. 2017. “Retrieving similar cases for construction project risk management using natural language processing techniques.” Autom. Constr. 80 (Aug): 66–76. https://doi.org/10.1016/j.autcon.2017.04.003.
