Assembling and monitoring the synthetic bacterial communities
We selected three bacterial strains, a Chryseobacterium sp., a Staphylococcus sp., and a Bacillus sp., from a large collection of soil isolates, and we screened them with fluorescence-independent flow cytometry as we did previously . We measured at an acquisition speed of 14 μl min−1 for 2 min per sample and setting a threshold of 10,000 regarding the height signal of the front scatter (FSC-H). Differently than in our previous work, here we acquired all scattering profiles based on growth assays at 30 °C that was the temperature at which we performed all the experiments. In addition, we screened the growing cultures with a temporal resolution of 20, rather than 30, min. We recorded significant interactions among the strains by comparing their single and mixed growth profiles at 30 °C (Dataset 1). These interactions were mainly positive, similarly to what we found previously at other temperatures . We performed all the related growth assays in biological triplicates. All flow cytometry data are available in .fcs format online (http://flowrepository.org) under the “FR-FCM-Z25Q” identifier. Henceforth, when referring to the experiments we use the term “population” to describe the cells of a given strain within a flask at a given assay and the term “community” to describe the total bacterial cells within a flask at a given assay.
Quantification of “background noise”
Our flow cytometry method for screening the mixed bacterial cultures has an accuracy of 97% for sample densities above 105 cells ml−1 . However, at lower densities sampling errors and instrument inconsistencies become increasingly important because the signal-to-noise ratio drops. This can result in substantially different counts among identical samples and can artificially inflate the observed variability. Thus, it was essential to quantify this “noise” before performing the main experiments and subtract it from the observed variability when quantifying drift.
To that end, we made a series of separate experiments to quantify “noise”. In these experiments, we mixed overnight cultures of the three strains in all the seven possible combinations and in final cell densities ranging from 1.6 × 104 to 6.3 × 107 strain−1 ml−1 (corresponding to the expected range of cell densities in the main experiment, see below), and we measured repeated aliquots from the same flask to determine the coefficient of variation (CV—Fig. 1a). We treated the samples in exactly the same way as in the main experiments to include the effect of sampling errors in our calculations. We acquired in total of 99 triplicate measurements of 1–3 populations for a total of 148 observations (Fig. 1a, Dataset 2). We hypothesized that the level of “noise” should be inversely related to the cell density of the sample, because the signal-to-noise ratio decreases at low cell densities in the flow cytometer. Accordingly, we fit different functions for the dependency of CV to cell density (Supplementary Table S1). Finally, we calculated the 99.5% confidence intervals of the best-fitting function (i.e., Michaelis-Menten) using the confint function of the MASS package  in R , and we defined the false discovery rate based on the number of observations that were above the upper 99.5% confidence interval (Supplementary Fig. S1). Finally, we verified that the levels of noise determined in this study are similar to the variability recorded from technical replicate samples taken during our previous experiment where we used the same bacterial system and instrument with identical settings  (Supplementary Fig. S2).
To quantify drift, we monitored the changes in population densities across identical starting communities incubated under the same environmental conditions (Fig. 1b). To that end, we mixed the three strains in all seven possible combinations, i.e., three monocultures, three mixed cultures of two strains and a mixed culture of all three strains, and in three different starting total cell densities (5 × 104 cells ml−1, 105 cells ml−1, and 106 cells ml−1). To perform each growth assay, we first inoculated the respective strains from overnight pure cultures in a single flask containing 300 ml of Luria–Bertani medium (Sigma). To reach the desired starting total cell density, we estimated the cell density of the overnight pure cultures with flow cytometry  and we inoculated the respective volume. We then mixed the culture thoroughly and we immediately split the volume equally into three flasks. We next sampled 500 μl from each flask and we compared the variability in the bacterial populations across the three flasks to the expected “background noise” for the same cell density. In specific, we examined the CVs of the bacterial populations and their z-scores compared to the “background noise”, i.e., how many standard deviations an observed CV differs from the expected “background noise” CV at a given cell density. If the observed z-scores were larger than 2 (95% CI), we aborted the given experiment because it indicated that we introduced variability when we mixed and split the cultures and thus the starting cultures could not be considered identical; this happened in ~50% of the cases. If the observed z-scores were lower than 2, indicating that the recorded variability was not statistically different or was less than the expected variability based on the “background noise”, we proceeded with the experiment, incubating the three flasks in the same chamber (New Brunswick Innova 42R, Eppendorf) at 30 °C and with shaking at 80 rpm.
We recorded the starting densities (Dataset 3, z-scores between −6.68 and 1.17) and the densities every 20 min until the end of the fourth hour of incubation starting from the 60th minute. To detect and quantify drift, our main assumption was that any larger-than-expected deviations in the population densities of identical starting communities incubated under the same environmental conditions could only be because of drift. Thus, we compared the observed CVs to the expected CVs based on the “background noise” by deploying the z-score. We quantified drift using two different thresholds:
The “upper threshold” that focused on excluding false-positive observations. In this quantification, we used a cutoff significance level of z > 3 (99.5% CI) and we ignored the lowest 17.57% of positive observations (i.e., 15 observations, corresponding to the FDR level of the “background noise”) to minimize the detection of false positives.
The “mean threshold” that focused on excluding false negatives and increasing detectability. In this quantification, we used a cutoff significance level of z > 0, meaning that we scored any observation greater than the mean noise function as positive.
The “upper threshold” quantification probably overestimates drift by taking into account only the highest among the recorded CV values while the “mean threshold” quantification underestimates drift by taking into account some low CV values that are very close to the noise levels. Thus, the “upper threshold” and “mean threshold” quantifications do not represent the true levels of drift (which are hard to define whatsoever in the presence of noise) but they rather represent the upper and lower boundaries within which the true levels of drift lie.
Estimating potential growth variability due to temperature differences within the incubation chamber
To ensure that the recorded variability in the population counts was not due to slight differences in temperature within the incubation chamber, we estimated the potential variability that could have resulted if each strain grew within the extremes of the recorded temperatures in the chamber. For that, we first measured the temperature within each flask at each experimental time point, five times per flask, using a digital immersion thermometer with an accuracy of 0.1 °C. The temperature varied by 0.15 °C ± 0.08 °C on average and by 0.28 °C at maximum. We then calculated the growth rates of each of the strains under the recorded temperature extremes at each time point by interpolating from previously recorded growth rates . We interpolated both with respect to time and with respect to temperature because the previous data were recorded at intervals of 30 min and 0.5 °C (the latter at a range of 25–42 °C). Finally, for each strain, we calculated how the CV in the hypothetical population densities would increase if the strains were constantly growing within the recorded temperature extremes for the duration of the experiment and if the CV was calculated from three observations (like in the real experiments) of the resulting population density distributions. We note that with this analysis, we probably overestimated the hypothetical increase in population densities because we used growth rates from mixed cultures that were generally higher than those in monocultures because of the positive interactions among the strains (Dataset 1).
To simulate drift in complex bacterial communities, we used in silico communities with diversity and abundance distributions similar to nature  where drift acts with magnitude according to our experimental data. A conceptual flowchart of the simulations can be found in Supplementary Information (Supplementary Fig. S3). Each simulation involved a metacommunity of 100 communities that were connected with dispersal and that initially contained 2000 species each.
We simulated dispersal occurring in a unidirectional way within a closed system; individuals from community n disperse to the community (n + 1) and individuals from community 100 disperse back to community 1. The strength of dispersal equaled to the percentage of individuals that disperse to the respective community and it varied between 2 and 20%. Our aim in modeling dispersal in this way was to create a setting where habitat fragmentation was high and therefore drift’s importance is expected to be more pronounced [21, 22], and where there was no gain or loss of individuals from outside the metacommunity.
We simulated selection as differences in the growth rates among species within a community. The growth rates were distributed normally with a mean of 1 (resembling systems at their carrying capacity) and with a standard deviation between 0.071 and 0.167. Growth rates changed at every generation by being re-drawn from the same distribution in an effort to represent fluctuating habitats where a given species is not always favored or disfavored. Therefore, in our simulations, the standard deviation in the growth rates represents the strength of selection, because the higher it is the bigger are the differences in the growth rates in a community and the changes in the growth of a species from generation to generation. The distribution of the abundances in a community at time zero was log-normal (mean = 4, sd = 1.1) and the distribution of the abundances of a given species across all communities was normal with a standard deviation equal to the strength of selection.
The metacommunity grew for 1000 generations under given dispersal and selection conditions with drift, where drift changed the assigned growth rates at every generation according to a distribution based on the defined threshold from the experimental data (“upper” or “mean” threshold). In parallel, an initially identical metacommunity grew under the same dispersal and selection conditions but without drift, meaning that the assigned growth rates at every generation did not change further. More details and an example on how we modeled changes in growth rates due to drift are presented in the Supplementary Text in Supplementary Information.
For a given generation, we calculated the effect of drift by comparing a given community in the metacommunity that grew under drift to the same community in the metacommunity that grew without drift. In specific, we examined the β-diversity by means of the Bray–Curtis (BC) community similarity and the differences in species richness and in Pielou’s evenness among drift-impacted and drift-free communities, calculating the metacommunity-wise mean and standard deviation on all these properties. Moreover, we kept track of the extinct species at the end of each simulation and we mapped their initial relative abundances, but here we report the metacommunity-wise median because the distribution of the relative abundances is skewed (Supplementary Fig. S4). We ran simulations under 50 different scenarios resulting from five levels of selection strength over ten levels of dispersal rate. To estimate the effect of drift on Bray–Curtis similarity in metacommunities with increasing number of species, we ran the same simulation at the highest selection and lowest dispersal levels, at intermediate selection and dispersal and at the lowest selection and highest dispersal, but we changed the number of species; we ran the simulation three times in metacommunities of 500, 1000, 2000, 4000, 6000, 8000, and 10,000 species. We performed all simulations in R. All code is available on GitHub (https://github.com/sfodel/Drift).
Reported β-diversity in stochastically assembled communities in nature
To compare our simulation results with the results from environmental surveys regarding the β-diversity in stochastically assembled communities, we searched for related studies using the following two criteria: (1) the study cites the works of Stegen and colleagues [12, 13], where the term “undominated” community assembly is presented formally for microbial ecology, (2) the study reports data on the range of the observed β-diversity in terms of Bray–Curtis dissimilarity (or similarity) in stochastically assembled communities, or this range can be inferred from the data presented in that study.