Abstract

Infiltration of groundwater through reinforced concrete pipe (RCP) joints under hydrostatic pressure is a major and costly challenge in municipal sewer networks. Analysis of a unique designwise infiltration test data set for RCP joints showed that conventional regression analysis failed to produce reliable predictions. Accordingly, tree-based machine-learning techniques, including random forest, extra trees, and gradient boosting classifiers, were deployed in this study to create reliable models. A large designwise data set identifying failure of RCP joints and the effect of key design parameters was collected using a novel experimental program. Because the resulting experimental data set was unbalanced, oversampling techniques, including the synthetic minority over-sampling technique (SMOTE) and the density-based synthetic minority over-sampling technique (DBSMOTE), were employed to enhance predictive performance. Gradient boosting coupled with DBSMOTE offered a robust machine-learning model for predicting RCP joint hydrostatic infiltration. The hybrid gradient boosting classification (GBC)-DBSMOTE model achieved superior predictive accuracy on several classification indicators and showed promising capability to create RCP joint hydrostatic infiltration performance charts that capture the effects of key design parameters, such as pressure level and duration, pipe size, and gasket sealing. Such a predictive model could produce design charts that help municipalities avert sewage system infiltration problems proactively and at low cost, instead of relying on the prevailing reactive approach.
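The oversampling idea mentioned above can be illustrated with a minimal sketch of plain SMOTE, which creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. This is not the study's implementation: it shows basic SMOTE rather than DBSMOTE, it omits the classifier training step, and the function name `smote`, its parameters, and the (pressure, duration) values are hypothetical, chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical (pressure, duration) rows for joints that leaked: the
# minority class. These numbers are made up, not taken from the study.
leaked = [(90.0, 10.0), (100.0, 12.0), (95.0, 15.0), (105.0, 9.0)]
n_sealed = 12  # hypothetical majority-class count

extra = smote(leaked, n_new=n_sealed - len(leaked))
balanced_minority = leaked + extra  # now as many rows as the majority class
```

Because every synthetic point lies on a line segment between two real minority samples, the new rows stay inside the region already occupied by the minority class. A density-based variant such as DBSMOTE additionally uses cluster structure to decide where to generate samples, which this sketch does not attempt.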