# QSPR models for predicting the adsorption capacity for microplastics of polyethylene, polypropylene and polystyrene

Sep 3, 2020

### QSPR models for the adsorption of PE

Three QSPR models of log Kd were developed for the adsorption of PE in seawater, freshwater and pure water, respectively:

$$\begin{aligned} \text{Seawater:}\quad \log K_{\text{d}} &= \left( 0.725 \pm 0.058 \right) \times \log D + \left( -36.236 \pm 9.034 \right) \times \varepsilon_{\alpha} \\ &\quad + \left( -23.169 \pm 4.501 \right) \times \varepsilon_{\beta} + \left( 17.856 \pm 2.572 \right) \end{aligned}$$

(1)

$$\text{Freshwater:}\quad \log K_{\text{d}} = \left( 0.667 \pm 0.047 \right) \times \log D + \left( 1.714 \pm 0.302 \right)$$

(2)

$$\text{Pure water:}\quad \log K_{\text{d}} = \left( 0.449 \pm 0.041 \right) \times \log D + \left( 0.265 \pm 0.115 \right) \times M_{\text{w}}^{\prime} + \left( 1.855 \pm 0.302 \right)$$

(3)

where log D is the n-octanol/water distribution coefficient at a specific pH, εα is the covalent acidity, εβ is the covalent basicity and M′w is the relative molecular mass. In the Williams plot for model (3) (Fig. S1 of the Supplementary Information), 17α-ethinyl estradiol has a standardized residual (SR) of −3.392, whose absolute value exceeds 3, so it was diagnosed as an outlier. Structural analysis showed that 17α-ethinyl estradiol differs markedly from the other compounds because of its acetylene group and steroidal ring (an unsaturated benzene ring connected to a saturated six-membered ring). This structural difference may be the main cause of the inaccurate prediction. After removing it, the following model was obtained:

$$\text{Pure water:}\quad \log K_{\text{d}} = \left( 0.486 \pm 0.035 \right) \times \log D + \left( 2.420 \pm 0.199 \right)$$

(4)
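To make the use of models (1), (2) and (4) concrete, the following is a minimal Python sketch that evaluates their point predictions. The coefficients are transcribed from the equations above with the ± uncertainties dropped. The dictionary layout, the function name `predict_log_kd` and the descriptor keys are illustrative assumptions, not part of the original work.

```python
# Coefficients transcribed from models (1), (2) and (4); uncertainties omitted.
PE_MODELS = {
    "seawater": {
        "intercept": 17.856,
        "log_d": 0.725,
        "eps_alpha": -36.236,
        "eps_beta": -23.169,
    },
    "freshwater": {"intercept": 1.714, "log_d": 0.667},
    "pure_water": {"intercept": 2.420, "log_d": 0.486},
}


def predict_log_kd(medium, **descriptors):
    """Evaluate the linear PE model for `medium` at the given descriptor values.

    Descriptors the model does not use are ignored; missing ones raise an error.
    """
    coeffs = PE_MODELS[medium]
    needed = sorted(set(coeffs) - {"intercept"})
    missing = [name for name in needed if name not in descriptors]
    if missing:
        raise ValueError(f"missing descriptors for {medium}: {missing}")
    return coeffs["intercept"] + sum(coeffs[n] * descriptors[n] for n in needed)


# log D = 5 is an arbitrary illustrative value, not a compound from the dataset.
print(round(predict_log_kd("freshwater", log_d=5.0), 3))
print(round(predict_log_kd("pure_water", log_d=5.0), 3))
```

A prediction is only meaningful for compounds inside the applicability domain of the corresponding model, so a sketch like this should be paired with a leverage check before its output is trusted.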

The statistical parameters of the developed QSPR models are presented in Table 1. For models (1), (2) and (4), R² = 0.868, 0.903 and 0.811, Q² = 0.868, 0.903 and 0.811, and RMSE = 0.826, 0.686 and 0.612, respectively. These results indicate that the models fit the data well. As shown in Table S1, all the VIF values (1.000–1.204) are well below 10, indicating that there is no multicollinearity among the descriptors of the three models. The fitting plots (Fig. 1) show good agreement between the experimental and predicted log Kd values. As shown in Fig. 2, the prediction errors show no dependence on the experimental log Kd values. Thus, the developed models have no systematic error, which is also supported by BIAS = 0.000–0.001 (Table 1).
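The fit statistics quoted above can be computed from paired experimental and predicted log Kd values. The sketch below uses common textbook definitions, which are an assumption here: the section does not state its exact formulas (for example, whether RMSE divides by n or by the residual degrees of freedom, or the sign convention of BIAS). The function name `fit_statistics` is hypothetical.

```python
import math


def fit_statistics(observed, predicted):
    """Return R^2, RMSE, MAE and BIAS for paired observed/predicted values.

    Assumed definitions: residual = predicted - observed, RMSE and MAE are
    averaged over n, and BIAS is the mean residual.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    residuals = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "bias": sum(residuals) / n,
    }


# Toy values for illustration only; they are not data from the paper.
stats = fit_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print({name: round(value, 3) for name, value in stats.items()})
```

A BIAS close to zero, as reported in Table 1, means that positive and negative prediction errors cancel on average. It does not by itself rule out a trend in the errors, which is why the error plots are inspected as well.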

For the simulated external validation, the QSPR models (S1–S3), redeveloped from 70% of the experimental data with the same descriptors as models (1), (2) and (4), show fitting performance (R², Q², RMSE and MAE) and regression coefficients similar to those of the models developed from the whole dataset (Table 1). Thus, the models are statistically stable. Because the training subsets were assigned randomly, there is no chance correlation. The predictive performance of each rebuilt model on the test set (the 30% subset, marked with superscript b in Table 2) is listed in Table 1. The values of Q², RMSE and MAE indicate excellent predictive quality of the developed QSPR models. The results of leave-one-out cross-validation (Q²CV = 0.882–0.940) also show good robustness and internal predictivity.
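Leave-one-out cross-validation refits the model n times, each time predicting the one compound left out. The sketch below does this for a single-descriptor model of the form used in models (2) and (4). It assumes the common definition Q²CV = 1 − PRESS/SS_tot; the helper names `fit_line` and `q2_leave_one_out` are hypothetical, and the data are toy values.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_one_out(x, y):
    """Q^2_CV = 1 - PRESS / SS_tot, refitting once per held-out point."""
    x, y = list(x), list(y)
    n = len(x)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        # Fit on every point except i, then predict the held-out point.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Toy log D / log Kd pairs for illustration only.
log_d = [1.0, 2.0, 3.0, 4.0, 5.0]
log_kd = [2.4, 3.2, 3.4, 4.1, 4.4]
print(round(q2_leave_one_out(log_d, log_kd), 3))
```

The 70%/30% split described above complements this check: leave-one-out probes the sensitivity to single compounds, while the held-out 30% probes prediction for compounds never seen during fitting.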

Williams plots were used to test the applicability domain of QSPR models (1), (2) and (4). The calculated warning leverage (alert value) h* is 0.324, 0.250 and 0.128, respectively. As shown in Fig. 3, three compounds (oxytetracycline, sulfadiazine and δ-hexachlorocyclohexane) for model (1) and one compound (2,2′,3,3′,4,4′,5-heptachlorobiphenyl) for model (4) lie to the right of h*. Because their absolute SR values are < 3, these chemicals are not diagnosed as outliers. In summary, these results indicate that the developed QSPR models generalize well within their descriptor domains. Given the molecular structures used to develop the models, QSPR model (1) can be used to predict the log Kd values between PE and seawater of organics including polychlorinated biphenyls, antibiotics, polycyclic aromatic hydrocarbons, chlorobenzenes, perfluorinated compounds and hexachlorocyclohexanes; model (2) can be used to predict the log Kd values of polychlorinated biphenyls and antibiotics between PE and freshwater; and model (4) can be used to predict the adsorption by PE in pure water of organic pollutants such as polychlorinated biphenyls, antibiotics, polycyclic aromatic hydrocarbons, chlorobenzenes, aromatic hydrocarbons and aliphatic hydrocarbons.
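A Williams plot places each compound by its leverage h (horizontal axis) and standardized residual SR (vertical axis). The sketch below shows the classification for a single-descriptor model. It assumes the commonly used warning leverage h* = 3(p + 1)/n, where p is the number of descriptors and n the number of compounds. The section does not state its formula, but this definition reproduces the h* = 0.321 reported for the PS model, which has two descriptors and 28 compounds (9/28 ≈ 0.321). All function names and the toy data are hypothetical.

```python
def warning_leverage(n_descriptors, n_compounds):
    """Assumed definition of the warning leverage: h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_compounds


def leverages_one_descriptor(x):
    """Hat-matrix diagonal for a model with an intercept and one descriptor."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def williams_flags(x, standardized_residuals, sr_limit=3.0):
    """Label each compound as in the text: |SR| > 3 marks an outlier, while
    h > h* with |SR| <= 3 marks a high-leverage compound that is kept."""
    h_star = warning_leverage(1, len(x))
    flags = []
    for h, sr in zip(leverages_one_descriptor(x), standardized_residuals):
        if abs(sr) > sr_limit:
            flags.append("outlier")
        elif h > h_star:
            flags.append("high leverage")
        else:
            flags.append("inside domain")
    return flags


# Toy descriptor values and residuals for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 40.0]
sr = [0.5, -3.4, 0.1, 0.2, -0.3, 0.4, -0.6, 0.7, -0.2, 1.0]
print(round(warning_leverage(2, 28), 3))
print(williams_flags(x, sr))
```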

The n-octanol/water distribution coefficient at a specific pH (log D) was selected in all three log Kd predictive models for PE in seawater, freshwater and pure water. The experimental log Kd values correlate significantly with log D, which has positive regression coefficients (0.725, 0.667 and 0.486) in models (1), (2) and (4). Thus, organic pollutants with high hydrophobicity are preferentially adsorbed onto PE. For example, hydrophobic polychlorinated biphenyls (PCBs) with large log D values exhibit higher log Kd values than ionizable organic pollutants (e.g., antibiotics). This is because PE is itself hydrophobic, which makes hydrophobic interaction the main mechanism in its adsorption of organic pollutants. The same adsorption mechanism was also confirmed by Hüffer et al., who established a prediction model based on the log Kow values of seven organic compounds (ref. 30).
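Because the models are linear in log D, each log D coefficient b can be read as a fold change in Kd per unit of log D, with all other descriptors held fixed. As a worked illustration using only the point estimates (uncertainties ignored):

$$\frac{K_{\text{d}}\left( \log D + 1 \right)}{K_{\text{d}}\left( \log D \right)} = 10^{b}, \qquad 10^{0.725} \approx 5.3, \quad 10^{0.667} \approx 4.6, \quad 10^{0.486} \approx 3.1$$

That is, a tenfold increase in D corresponds to roughly a 5.3-fold, 4.6-fold and 3.1-fold increase in Kd for PE in seawater, freshwater and pure water, respectively.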

For the adsorption of PE in seawater, εα and εβ, which represent covalent acidity and covalent basicity, respectively, were also selected. The quantum chemical descriptor εα makes a negative contribution to log Kd, suggesting that an organic pollutant with a large εα value prefers to remain dissolved in water, which lowers log Kd. This means that, at the adsorption interface, the PE surface has a weaker H-accepting ability towards organic pollutants than water (ref. 31). Similarly, log Kd increases with decreasing εβ, indicating that the H-donating ability of the PE surface is also weaker than that of water. It follows that hydrogen bonding is also an important mechanism in the interactions between PE and organic pollutants in seawater.

Compared with freshwater and pure water, the high salinity of seawater can enhance the dipole–dipole and dipole–induced dipole interactions in the system, which makes hydrogen bonds form more easily. As a result, εα and εβ play a more important role in the log Kd values of PE in seawater. In brief, the distribution of the studied organics between PE and water is mainly governed by hydrophobic interaction. For adsorption in seawater, hydrogen bonding is another important driving force.

### QSPR model for the adsorption of PP

A QSPR model of log Kd was obtained for the adsorption of PP in seawater:

$$\text{Seawater:}\quad \log K_{\text{d}} = \left( 0.751 \pm 0.035 \right) \times \log D + \left( -19.323 \pm 2.072 \right) \times \varepsilon_{\beta} + \left( 6.735 \pm 0.663 \right)$$

(5)

The values of R², Q² and RMSE are 0.939, 0.939 and 0.381, respectively. Thus, model (5) fits the data well and explains 94% of the variability of the whole dataset. The absence of multicollinearity in model (5) is shown by the VIF values (1.034 for both descriptors, Table S1). As shown in Fig. S2, the predicted log Kd values agree well with the experimental values. Fig. S3 and the BIAS value (−0.003) show that the prediction errors do not depend on the experimental log Kd values.
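The reported VIF can be translated into a descriptor correlation. Assuming the standard definition of the variance inflation factor, where R_j² is obtained by regressing descriptor j on the remaining descriptors, a two-descriptor model has R_j² equal to the squared Pearson correlation r² between log D and εβ, which is why both descriptors share the same VIF:

$$\mathrm{VIF}_{j} = \frac{1}{1 - R_{j}^{2}}, \qquad r^{2} = 1 - \frac{1}{1.034} \approx 0.033, \qquad \left| r \right| \approx 0.18$$

A correlation of this size between the two descriptors is weak, consistent with the conclusion that multicollinearity is not a concern for model (5).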

For the simulated external validation, the regression coefficients and statistical parameters (R² = 0.945, RMSE = 0.396 and MAE = 0.307) of the model built on the training subset are similar to those of the whole dataset (Table 1 and model S4). Thus, model (5) is statistically stable and there is no chance correlation. As shown in Table 1, the high prediction quality of the developed QSPR model is supported by the performance of the new model on the test subset (Q² = 0.874, RMSE = 0.369 and MAE = 0.228). Furthermore, model (5) has good robustness and internal predictive ability (Q²CV = 0.957). The Williams plot for the applicability domain of model (5) (Fig. S4) shows that two compounds (sulfadiazine and γ-hexachlorocyclohexane) lie to the right of h* (0.257). However, these two compounds have absolute SR values < 3, indicating that they are not outliers. Thus, model (5) can be used to predict the log Kd values of PP in seawater for organics including polychlorinated biphenyls, chlorobenzenes, hexachlorocyclohexanes, polycyclic aromatic hydrocarbons and antibiotics.

For the adsorption of PP in seawater, log D and εβ were selected in model (5), as they were for PE in seawater. Thus, hydrophobic interaction and hydrogen bonding also play determining roles in this adsorption. However, unlike the log Kd predictive model for PE in seawater, model (5) does not include εα, which represents the covalent acidity. This difference may arise from the methyl groups in the PP structure, which reduce the difference in H-accepting ability between the microplastic and water and thus make the contribution of εα to the adsorption of PP negligible.

### QSPR model for the adsorption of PS

For the adsorption of PS in seawater, the experimental log Kd values of 28 organic pollutants (of which 14 are ionizable compounds) were used to establish a predictive model:

$$\text{Seawater:}\quad \log K_{\text{d}} = \left( 0.357 \pm 0.062 \right) \times \log D + \left( 3.766 \pm 0.384 \right) \times \pi + \left( -2.080 \pm 0.540 \right)$$

(6)

As shown in Tables 1 and S1, the statistical parameters (R² = Q² = 0.837) indicate good regression performance, and the calculated VIF values (1.000 for both descriptors) indicate no multicollinearity in model (6). The experimental and predicted log Kd values also agree well (Fig. S5). The pattern of prediction errors shown in Fig. S6 reveals no systematic error for model (6), which is also supported by BIAS = 0.000 (Table 1).

Based on the training subset (70%), a new model (S5) with similar regression coefficients and statistical parameters was obtained (Table 1). Comparable statistics were also obtained for the test set. Moreover, the Q²CV value of the leave-one-out cross-validation (0.898) exceeds the acceptance criterion. Thus, model (6) has satisfactory robustness and internal predictive ability. As shown in the Williams plot (Fig. S7), three compounds (fluoranthene, chrysene and pentacosafluorotridecanoic acid) lie to the right of h* (0.321) but have |SR| < 3, indicating that they are not outliers. In conclusion, model (6) can be used to predict the adsorption carrying capacity (log Kd) of PS for organic pollutants (especially ionizable organic pollutants) in seawater within the applicability domain.

In a previous study (ref. 20), the influence of dissociation on log Kd for ionizable organic pollutants was not considered in the construction of predictive models. In fact, the physicochemical properties (e.g., hydrophobicity) of the various dissociation species are quite different, which may significantly affect the partitioning of ionizable organic pollutants between PS and seawater. Therefore, predictive models established without considering the effect of pH on the distribution of dissociation species are only applicable to predicting log Kd values at the experimental water pH. In contrast, the QSPR model (6) constructed in this study extends the predictive application to various pH values. Limited by the number of ionizable compounds and the pH range used for model construction, the developed models are most suitable for the pH range of natural waters (6–9).
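The pH dependence enters model (6) through log D. The sketch below illustrates this for a hypothetical monoprotic acid. It uses the textbook approximation that only the neutral species partitions into octanol, log D = log P − log10(1 + 10^(pH − pKa)). The section does not say how its log D values were obtained, so this formula, the function names, and the values of `log_p`, `pka` and `pi` are all illustrative assumptions.

```python
import math


def log_d_monoprotic_acid(log_p, pka, ph):
    """Assumed approximation: only the neutral form of the acid partitions."""
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))


def log_kd_ps_seawater(log_d, pi):
    """Point prediction from model (6); coefficient uncertainties omitted."""
    return 0.357 * log_d + 3.766 * pi - 2.080


# Hypothetical acid: log_p, pka and pi are made-up values, not paper data.
for ph in (6.0, 7.0, 8.0, 9.0):
    log_d = log_d_monoprotic_acid(log_p=3.0, pka=5.0, ph=ph)
    log_kd = log_kd_ps_seawater(log_d, pi=1.0)
    print(f"pH {ph:.0f}: log D = {log_d:.3f}, predicted log Kd = {log_kd:.3f}")
```

Under these assumptions, raising the pH above the pKa lowers log D by about one unit per pH unit, and the predicted log Kd falls by about 0.357 per pH unit, since π is held fixed.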

The presence of log D in model (6) shows that hydrophobic interaction also enhances the adsorption of organics on PS in seawater. In addition to log D, π was also selected. The experimental log Kd values correlate positively with π (coefficient 3.766) in the QSPR model, indicating that chemicals with larger π values are preferentially adsorbed onto PS in seawater. As shown in Tables 2 and S2, organic compounds with strong π-electron conjugation in their structures generally have large π values. Thus, it can be inferred that π–π interaction also contributes to adsorption on PS. The phenyl groups in the PS structure produce stronger π–π interactions with organic chemicals than PE and PP do, thus yielding higher log Kd values (Table 2). For example, the log Kd value of phenanthrene on PS (5.50) is much higher than that on PE (4.440) and PP (4.000) in seawater. In brief, hydrophobic interaction and π–π interaction play important roles in the adsorption of PS in seawater.