Earth System Model large ensembles
This study uses simulations from the US Climate Variability and Predictability programme’s newly developed data archive of large initial-condition ensembles with Earth System Models (hereafter ESM-LE17). All ESM-LE simulations utilised similar CMIP5 forcings, which include anthropogenic and volcanic aerosols, solar radiative forcing, and greenhouse-gas (GHG) concentrations that represent historical radiative forcing up to 2005 and the Representative Concentration Pathway 8.5 thereafter18. For this study, we use only models that provide at least 30 members, namely CANESM2, CESM1, CSIRO, MPI, and GFDL-ESM2M (hereafter simply GFDL; see Supplementary Table 1 for details about the ensembles). The large number of members of each single-model ensemble allows us to effectively isolate the forced component from the internally generated component of variability in each model, while the comparison between historical (1950–2010) and future (2010–2070) periods allows us to investigate fundamental differences in the forcing used in the two periods (e.g., the absence of volcanic forcing in the latter period).
Forced GMSST decadal variability
To place our findings in the context of the IPO, we analyse SST rather than surface air temperature. While results using either field are almost identical, this choice facilitates the comparison with Pacific climate modes that are typically calculated from SST. To isolate the forced component of the decadal-scale variability of global mean SST (GMSST) in each ESM-LE model, we compute ensemble means of 8-year low-passed time series of area-weighted GMSST by averaging across each single-model ensemble (Fig. 1a). Due to the lack of long-term reliable observations in polar regions and before 1950, we limit our analysis spatially to the near-global domain, here defined as regions between 40°S and 60°N19, and temporally from 1950 onward. While the amount of warming reached toward the end of the twenty-first century varies among models (Fig. 1a), the GHG-induced long-term trend in each model is well approximated by a quadratic function, especially after the year 2000 (Supplementary Fig. 1). Since we are interested in decadal-scale fluctuations of GMSST, we subtract the quadratic fit to the ensemble mean from each GMSST ensemble member to obtain residual time series (GMSSTr). These residual time series highlight variability largely independent of centennial length anthropogenic change (Fig. 1b). Moreover, the removal of a centennial-scale quadratic trend does not significantly affect the shorter decadal-scale fluctuations targeted in this study, apart from possible effects at the edges of the study period (i.e., 1950 and 2070).
Two clearly distinct periods are identified in Fig. 1b: from 1950 to 2010, when models show decadal-scale fluctuations that are largely coherent with each other, and from 2010 to 2070, when models show incoherent variability with model spread more closely centred around zero (coloured shading in Fig. 1a–c). The observed GMSSTr trajectory lies largely within the ESM-LE model spread envelopes and presents some similarities with GMSSTr ensemble means from about 1975 to 1995, with the lack of full agreement largely due to the internal variability, but perhaps also due to model deficiencies in representing the external forcing20 (discussed later). Given the influence of GHG forcing has largely been removed via the quadratic fit, these results suggest that non-GHG forcing is responsible for synchronising decadal-scale variability in ESM-LE ensemble means, and potentially the observations, over the historical period (1950–2010). Despite the consistency, there are important differences between the models and observations. Neglecting any residual internal variability in the ensemble mean, the correlation between the ensemble mean and each individual member gives a direct measure of the GMSSTr forced component for each model (Supplementary Fig. 2). Depending on the model, the forced component explains between 30 and 58% of the time-evolving variance (explained variance obtained as correlation squared) in the historical period and 8 to 18% in the future period, with the remaining variance associated with unforced internal variability. However, three out of five models show correlations between observed and ensemble-mean time series outside the range of modelled internal variability (i.e., outside the range of correlations between the ensemble mean and individual ensemble members). This may be a result of models underestimating the range of natural variability, but, as we show below, it is likely to also indicate an overestimation of the forced component in the historical period (e.g., Supplementary Fig. 3).
In addition, we estimate the fraction of forced decadal-scale variance (FDV) in GMSSTr by comparing the power spectrum of the ensemble-mean GMSSTr time series, which represents the externally forced variance at each frequency (thick coloured curves in Fig. 2), with the average power spectrum of each ensemble member’s GMSSTr time series, which represents the sum of internally and externally forced variance (i.e., total variance) at each frequency (thick grey curves in Fig. 2). Specifically, integrating both power spectra for periods between 8 and 30 years (i.e., decadal-scale variance) reveals that the FDV in each ESM-LE accounts for 29–53% of the total GMSSTr variance of the historical period (Fig. 2a, c, e, g, i). In contrast to the historical period, the FDV in the future period is greatly reduced in all models as shown by the nearly flat power spectrum of the ensemble mean GMSSTr (Fig. 2b, d, f, h, l) and the smaller fraction (i.e., 3–6%) of decadal variance in GMSSTr accounted for by FDV for the years 2010–2070.
Furthermore, larger oscillations in ensemble mean and model envelopes (solid lines and colour shading in Fig. 1b, respectively) in the historical period compared to the future period suggest that the total decadal-scale variance is also larger in the former period. In fact, the power in the GMSSTr ensemble-mean time series between 8- and 30-year periodicity over the historical period (1950–2010) is between 1.8–4.5 times larger than in the future period (2010–2070). This indicates that external forcing provides an additional source of decadal-scale GMSST variance in the historical period. However, it must be noted that the observed power spectrum (black lines in Fig. 2) for periods between 8 and 30 years is lower than almost all ensemble members (grey lines in Fig. 2), independently of the model. While it is possible that the observed GMSSTr trajectory is a rare event even with these multiple large ensembles, the discrepancy is more likely a symptom of the model tendency to overestimate internal and/or non-GHG forced decadal variability (e.g., Supplementary Figs. 2 and 3).
Role of aerosols, GHG, and biomass burning in decadal variability
A possible cause of the distinct character of past and future decadal-scale variability in ESM-LE appears evident when the width of the GMSSTr low-pass filter is reduced to 1 year (Fig. 1c). Year-to-year variations in GMSSTr reveals three abrupt drops in temperature that coincide with major volcanic eruptions: Agung in March 1963, El Chichón in April 1982, and Pinatubo in June 1991 (vertical lines in Fig. 1b, c). While it is well known that stratospheric aerosols associated with explosive volcanic eruptions can have short-term cooling effects on global mean temperatures in observations21,22 and models23, and likely contributed to the decadal-timescale slowdown in the rate of global mean temperature warming in the early 2000s24, results here demonstrate their important role as drivers of historical decadal-scale GMSST variability in current state-of-the-art climate models (i.e., ESM-LE). The temporal spacing between these strong drops in temperature, 19 and 9 years, combined with a recovery time (i.e., time to dissipate the cold anomaly) of 5–8 years25 (Fig. 1c), creates a climate signal with a strong projection on decadal-scale variability.
The key role of volcanic eruptions for FDV is indirectly confirmed from CESM1 single-fixed-forcing experiments26 in which concentrations of either GHGs, tropospheric inorganic aerosols, or organic aerosols associated with biomass burning are fixed to their 1920 values. All other forcing components, including volcanic forcing, use time-dependent values as in the full-forcing CESM1 simulations of ESM-LE. Comparing full-forcing with single-fixed-forcing ensembles reveals only minor changes in the ensemble mean of 8-year low-passed GMSSTr (Fig. 3), with correlation coefficients as high as 0.86, 0.88, and 0.62, for fixed GHGs, fixed tropospheric inorganic aerosols, and fixed biomass-burning aerosols, respectively.
All simulations include time-dependent volcanic forcing, and all simulations exhibit coherent decadal-scale variability of GMSST.
A direct estimate of volcanic-driven GMSST variability requires large ensembles of simulations in which the volcanic forcing is either the only one present (i.e., volcanic-forcing-only experiment) or the only one excluded (i.e., all-but-volcanic-forcing experiment). While such experiments are unavailable in CESM-LE, results from smaller ensembles (four and five members) with a similar version of CESM are consistent with the hypothesised role for volcanic forcing (Supplementary Fig. 4).
While fixing biomass-burning aerosols seems to have the largest impact on GMSSTr trajectories (least correlation with the full-forcing experiment), it must be noted that the fixed biomass-burning ensemble has only 15 members, and thus the ensemble mean retains more random internal variability than in the other single-fixed-forcing ensembles that have 20 members. It is noteworthy that fixing tropospheric anthropogenic aerosol concentrations to their 1920 value results in larger amplitude fluctuations in GMSSTr, suggesting that higher tropospheric aerosol concentrations typical of the second half of the twentieth century have a damping effect on the volcanic forcing. While this needs further confirmation in other models, it suggests that anomalies in stratospheric aerosol concentrations associated with volcanic eruptions have a larger impact in a cleaner atmosphere; determining the reasons for this is left for future work.
Comparing GMSST anomalies after each major volcanic eruption suggests a tendency for a larger-than-observed response in ESM-LE models during the 1982 and 1991 events (Supplementary Fig. 3). A similar discrepancy in CMIP5 simulations has been reconciled by taking into consideration the concomitance of these eruptions with El Niño events; the positive anomaly in GMSST associated with the El Niño partially offsets the cooling associated with volcanic eruptions in observations but not in simulations27. Here with 30 members for each model we see that in most ESM-LE simulations, and in all members from CANESM2 and GFDL ensembles, the magnitude of the global temperature response to the 1982 and 1991 volcanic events is overestimated regardless of the El Niño-Southern Oscillation phase (Supplementary Fig. 3). Furthermore, models that likely overestimate the volcanic response, CANESM2 and GFDL, present also the highest percentage of decadal variability accounted for by FDV (53%; Fig. 2a, l), suggesting that these models might overestimate the externally forced fraction of decadal variability, which is therefore closer to the lower boundary of the range estimated in ESM-LE (i.e., 29–53%).
Forced variability of the IPO
While we have focused on the temporal variability in GMSSTr, the spatial expression of GMSSTr during 1950–2010 uncovers an important characteristic of the FDV in both ESM-LE and observations. Specifically, the ensemble-averaged regression map of GMSSTr onto 8-year low-passed SST reveals a strong resemblance to the IPO pattern (Fig. 4a), which results in a spatial correlation with the IPO pattern (see Fig. 4 caption for IPO definition) of 0.42 in the observations and of 0.76, 0.80, 0.75, 0.87, and 0.52, in the ensemble mean of CANESM2, CESM1, CSIRO, MPI, and GFDL, respectively. Consistent with this spatial correlation, GMSSTr time series are also significantly correlated with the IPO index in several ESM-LE simulations (Supplementary Fig. 5). While the spatial and temporal correlation between GMSSTr and the IPO over the historical period may simply confirm the known contribution of IPO variability to twentieth century global surface temperature fluctuations7, it may also indicate the presence of an externally forced component in the IPO variability resulting from a partial synchronisation of the internal variability.
The possibility of a forced component in the IPO variability12,28 finds partial support in the variance associated with the IPO computed from ensemble mean SSTs as a function of the ensemble size (Fig. 4c). Assuming the IPO to be purely a result of internal variability, one would expect this IPO variance to decrease in proportion to the number of ensemble members, n (see “Methods”). However, in CANESM2 and CSIRO, and partially in GFDL, the fraction of variance linked to the IPO during 1950–2010 decreases at a lower rate and retains a stable residual variance (indicated in Fig. 4c as RV) of 15%, 13% and 7%, respectively, of the initial value (i.e., variance for n = 1) for n > ~20. While this sizeable residual in CANESM2 and CSIRO is a clear indication of a forced component of the IPO during the historical period, repeating the exercise for the years 2010–2070 shows a reduction of variance that approximates well the expected 1/n decrease in all five ensembles, suggesting that virtually all the IPO variability in the future period is internally generated.