Abstract

The potential of stacking ensemble modeling to improve the performance and generalizability of machine learning (ML) models for estimating total suspended solids (TSS) concentration was assessed by comparing its results with those of ensemble boosting, bagging, and single ML models. Seven stacking ensemble models (M1 to M7) were built from combinations of base learners, including single, bagging, and boosting models. Adaptive Boosting (AdB) was used as the aggregation method in M1 to M6. These six models showed coefficient of determination (R²) values ranging from 0.87 to 0.95, root mean square error (RMSE) values ranging from 50 to 90 mg/L, and mean absolute error (MAE) values ranging from 11 to 86 mg/L, where the best R², RMSE, and MAE values were 0.95, 50 mg/L, and 12 mg/L, respectively. To further improve the predictions, we tested several aggregation methods, namely AdB, Random Forest (RF), Variable Weighting kNN (VW-kNN), Regression Tree (RT), and Support Vector Regression (SVR), using the structure of the highest-performing model, M6. This led to a new best-fit model (M7), which uses RF as the aggregation method and has R², RMSE, and MAE values of 0.98, 32 mg/L, and 11 mg/L, respectively.
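The stacking structure described above, in which base learners drawn from single, bagging, and boosting models feed an aggregation model, can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scikit-learn, the synthetic data, the particular base learners, and every hyperparameter are assumptions made for illustration only. RF is used as the aggregator to mirror the role it plays in M7, and the three reported metrics (R², RMSE, MAE) are computed on a held-out split.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    BaggingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): NOT the study's TSS measurements.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 3))
y = 100.0 * X[:, 0] + 50.0 * np.sin(6.0 * X[:, 1]) + 20.0 * X[:, 2] ** 2
y += rng.normal(0.0, 5.0, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One base learner per family named in the abstract (choices are illustrative).
base_learners = [
    ("single_tree", DecisionTreeRegressor(max_depth=5, random_state=0)),
    ("bagging", BaggingRegressor(n_estimators=30, random_state=0)),
    ("boosting", AdaBoostRegressor(n_estimators=30, random_state=0)),
]

# The aggregation model is trained on out-of-fold predictions of the base learners.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# The three metrics reported in the abstract, evaluated on the held-out split.
scores = {
    "R2": r2_score(y_test, y_pred),
    "RMSE": float(np.sqrt(mean_squared_error(y_test, y_pred))),
    "MAE": mean_absolute_error(y_test, y_pred),
}
print(scores)
```

Replacing `final_estimator` with another regressor, for example `AdaBoostRegressor` or `SVR`, gives a simple way to compare aggregation methods on a fixed set of base learners, which is the kind of comparison the abstract describes for the M6 structure.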
