### Construction of the five-compartment model

Here we explain the construction of the five-compartment model (Fig. 1a). This is step 1 among the four steps described in “Results” section (subsection “Ecosystem dynamics at microsite scale”). We use upper bars in our initial notations to indicate parameters prior to rescaling.

The five-compartment model captures the stochastic processes acting at the level of *C*, *D*, *M*, *Z*, *X* entities (molecules, cells) (Fig. 1a) within a microsite. Dynamics of *C*, *D*, *M*, *Z*, *X* occur in continuous time. *M*_{t} is the number of cells at time *t*. *C*_{t}, *D*_{t}, *Z*_{t} are the numbers of SOC molecules, DOC molecules, and enzyme molecules respectively. *X*_{t} is the number of complexes formed by an enzyme molecule binding a SOC molecule. There are constant external sources of SOC and DOC. When a cell dies, a fraction *p* of the molecules released are recycled into SOC, while the rest is recycled into DOC. A fraction *l* of dead microbes and deactivated enzymes may be lost due to leaching.

We denote by *α* the structural cost of a cell, which is the equivalent in number of DOC molecules of one cell (without storage). We denote by (alpha ^{prime}) the energetic cost of a cell, which is the number of DOC molecules consumed to produce the energy needed for the synthesis reactions involved in the production of one cell. We denote by *β* the equivalent in number of DOC molecules of one SOC molecule, and the structural and energetic cost of producing one molecule of enzyme by *ρ* and (rho ^{prime}), respectively. We assume that the energetic costs are carbon released by cells as CO_{2} (cell respiration) that diffuses out of the system instantly. We define the biomass production fraction and enzyme allocation fraction as

$${bar{gamma }}_{M}:=frac{1}{alpha +alpha ^{prime} },quad {bar{gamma }}_{Z}:=frac{1}{rho +rho ^{prime} }.$$

(1)

The event times are given by independent exponential random variables whose parameters are defined by event rates (Supplementary Tables 2–4). These event rates give an approximation of the average frequency of each event. The rates of cell growth and enzyme production depend on the trait *φ*. Once a cell has doubled its initial size, reproduction occurs by releasing the mother cell at its initial size, and the daughter cell at its same size. The cell must therefore take up and store its structural and energetic cost, ((alpha +alpha ^{prime} )), in DOC molecules in order to reproduce. We denote *N* the number of uptake events before reproduction. The number of DOC molecules taken up at each uptake event is then ((alpha +alpha ^{prime} )/N), hence the notation ({{bf{1}}}_{{Dge (alpha +alpha ^{prime} )/N}}) which equals 1 if (Dge (alpha +alpha ^{prime} )/N), and 0 otherwise. The same notation, ({{bf{1}}}_{{Dge rho +rho ^{prime} }}), is used for the production event of an enzyme molecule. Uptake is stochastic, but reproduction is deterministic, which means that when a cell has performed *N* uptake events, it reproduces with probability 1. A larger *N* means a larger number of uptake events between 2 reproduction events, which also means less DOC molecules taken up at each uptake event. The model tracks the dynamics of the number of cells, SOC, DOC, enzyme molecules, and also of the DOC stored in each cell.

Enzyme–substrate complexes form at rate ({bar{lambda }}^{k}) as one enzyme molecule (e.g. cellulase) bind one SOC molecule (e.g. cellulose). A complex may either dissociate (with no decomposition) at rate ({bar{lambda }}_{-1}^{varepsilon }), or react at rate ({bar{mu }}^{varepsilon }) and convert the molecule of SOC into *β* molecule of DOC while the enzyme is released and free again to react with new molecules of SOC (Supplementary Table 3).

System size *k* does not appear in this system of equations, yet it enters the volume-dependent parameters of the model, *I*_{C}, *I*_{D}, *θ*, and *K*_{m}. We denote *V* the unit volume of soil that contains on average one microbial cell and the corresponding equilibrium of carbon mass of SOC, DOC and enzymes. The system size *k* is the number of well-mixed unit volumes in one microsite, which determines the number of cells sharing SOC, DOC, and enzymes. The volume of one microsite is therefore *k* × *V*. Increasing *k* amounts to increasing microsite volume, the number of cells sharing resources in one microsite, the amount of resources per microsite, and the volume-dependent parameters, such as the amount of SOC entering a microsite per unit of time, *I*_{C}. In our analysis, the unit volume *V* is fixed, and we vary *k* to investigate the effect of microsite volume on the system’s eco-evolutionary dynamics. With very large *k*, the hybrid model can be approximated by a fully deterministic model which takes the form of a system of four ordinary differential equations (see Supplementary Information 3.3 and Supplementary Fig. 1), similar to the microbial decomposition model first introduced by Allison et al.^{53}. However, empirical data suggest that *k* is of the order of 10–100^{54}. When *k* = 1, there is only one cell in the microsite, which volume is *V* defined as the unit soil volume expected to contain a single cell. A value of *k* greater than 1 means that each microsite contains *k* cells and *k* times the amount of SOC, DOC and enzyme molecules of 1 cell; thus, the microsite volume is *k* × *V*, and volume-dependent parameters are rescaled by *k*. Specifically, there are four volume-dependent parameters: the external input of *C*, ({bar{I}}_{C}^{k}), the external input of *D*, ({bar{I}}_{D}^{k}), the half-saturation constant of DOC uptake, ({bar{K}}_{m}^{k}), and the encounter intensity of two given SOC and enzyme molecules, ({bar{lambda }}^{k}). The external inputs increase proportionally with the volume of the microsite, while the encounter intensity of two given molecules in a microsite decreases as its volume increases. The half-saturation is inversely proportional to the affinity between a given cell *M* and a given DOC molecule *d*, which decreases with increasing microsite volume. We thus obtain the following scaling relationships:

$${bar{I}}_{C}^{k}=k{bar{I}}_{C}, {bar{I}}_{D}^{k}=k{bar{I}}_{D}, {bar{lambda }}^{k}=frac{lambda }{k} {rm{and}} {bar{K}}_{m}^{k}=k{bar{K}}_{m}.$$

(2)

In our simulations, we generally assume that *k* is equal to 10, to match the empirical observation that (cells) in soil habitat tend to interact with 10 to 100 other cells at all time^{54}.

### Derivation of the hybrid model

In Supplementary Information 3, we present the next two steps (2 and 3 in “Ecosystem dynamics at microsite scale” of “Results” section) to derive the hybrid model on which all our results are based. In Supplementary Information 3.1, we explain how the dynamics generated by the five-compartment model can be captured with a reduced model with four-state variables (step 2). In Supplementary Information 3.2, we explain how the stochastic-deterministic PDMP model can be derived from the stochastic four-state variable model (step 3). In the hybrid model (step 4), only cell death remains stochastic, and cell dynamics is measured in unit of number of individuals (*M*), while other entities are now in carbon mass unit. The rescaled SOC, DOC and enzyme abundances are denoted with lower case letters *c*, *d*, and *z*.

### Simulation algorithm for the hybrid spatial model

One technical benefit of the hybrid model is its much greater computational tractability. Here we describe the algorithm used to perform simulations of the hybrid model. We ran the model on single microsite to produce the simulations reported in Fig. 2. We ran the model on a 10 × 10 lattice of microsites for the subsequent figures. The algorithm is based on the Gillepsie algorithm^{55} as used in Champagnat et al.^{37}, Fournier and Méléard^{56}, which straightforwardly extends to the simulation of PDMPs.

To couple PDMP models across microsites, we account for the DOC and dispersal of cells between adjacent microsites. The DOC diffusion between microsites is modeled by approximating a continuous diffusion with a Euler scheme in which time is discretized with a fixed time step interval, *τ*_{diff}. *τ*_{diff} is chosen sufficiently small to provide a fine enough discretization of the DOC diffusion.

A simulation starts with a given amount of *M*, *z*, *c*, and *d* in each microsite at time *t* = 0, while the initial amount of DOC stored within each cell is determined uniformly at random. Two stochastic events (death of a cell) may not occur at the same time. Assume that the process has been computed until time *t*_{i}; to continue the computation to time *t*_{i+1}, we proceed as follows.

First, we simulate *T*, an exponential random variable with parameter *r*(*t*_{i}) = *d*_{M}*M*(*t*_{i}), which corresponds to the death rate of the total cell population at time *t*_{i} (*M*(*t*_{i}) being the total number of cells on the entire lattice). We then compute

$${t}_{i+1}:={t}_{i}+min left(T,{tau }_{{rm{diff}}}right).$$

To obtain *c*(*t*_{i+1}), *d*(*t*_{i+1}), and *z*(*t*_{i+1}) in each microsite at time *t*_{i+1}, and the variation in amount of DOC stored within a cell in the corresponding microsite, we use a Euler scheme that solves the dynamical system

$$left{begin{array}{l}hskip -136pt dot{c}(t)={I}_{C}-{l}_{C}c-theta zc,\ dot{d}(t)={I}_{D}-{l}_{D}d+theta zc+(1-l){d}_{Z}z-{V}_{max }frac{d}{{K}_{m}{,}+{,}d}{omega }_{M}M,\ hskip -84pt dot{z}(t)=varphi {gamma }_{Z}{V}_{max }frac{d}{{K}_{m}{,}+{,}d}{omega }_{M}M-{d}_{Z}z,\ hskip -93pt dot{Delta }(t)=(1-varphi ){gamma }_{M}{V}_{max }frac{d}{{K}_{m}{,}+{,}d}{omega }_{M},end{array}right.$$

in each microsite between *t*_{i} and *t*_{i+1}, where *M* is the number of cells in the microsite at time *t*_{i}, Δ gives the amount variation of DOC stored within a cell, Δ(*t*_{i}) = 0 and the other initial conditions are the biomass of *c*, *d*, *z* in the corresponding microsite at time *t*_{i}.

Note that, within a microsite, the variation of stored DOC is the same for all cells and corresponds to Δ(*t*_{i+1}). Hence, this amount is added to the amount of DOC stored within each cell living in the corresponding microsite. If, for a cell *j*, the resulting amount *S*_{j}(*T*_{i}) + Δ(*t*_{i+1}) is over *ω*_{M}, a new cell appears. The amount of stored DOC within the new cell and the mother cell is then updated: *S*_{j}(*t*_{i+1}) = (*S*_{j}(*T*_{i}) + Δ(*t*_{i+1}) − *ω*_{M})/2. Otherwise, *S*_{j}(*t*_{i+1}) = *S*_{j}(*T*_{i}) + Δ(*t*_{i+1}). To determine the position of the new cell, the following steps are taken:

A uniform random variable

*ϑ*_{1}in [0, 1] is simulated.If

*ϑ*_{1}< 1 −*p*_{disp}, the new cell is added to the mother cell microsite.Otherwise, the new cell disperses:

If empty microsites are available in the four nearest microsites, the new cell is added to one of them, drawn randomly.

Otherwise, a uniform random variable

*ϑ*_{2}in [0, 1] is simulated.If

*ϑ*_{2}< 1 −*p*_{open}, the new cell is added to the mother cell microsite.If

*ϑ*_{2}≥ 1 −*p*_{open}, a micro-disturbance happens. That is, one of the four nearest microsites is chosen at random and all cells in the microsite die. These cells are removed from the population, an amount of (1 −*l*)(1 −*p*)*ω*_{M}*M*is added to variable*d*and an amount of (1 −*l*)*p**ω*_{M}*M*is added to variable*c*in this microsite (where*M*is the number of cells that died in the event). Finally, the new cell is placed in this microsite.

If several birth events happen during the same time interval [*t*_{i}, *t*_{i+1}], the new cells are relocated one after another in a randomly and uniformly drawn order.

At this point, no diffusion of DOC has yet taken place. The next step in the algorithm therefore consists in calculating the diffusion of DOC as driven by a step of the Euler scheme associated with the diffusion equation

$$frac{{rm{d}}}{{rm{d}}t}d(x,t)={sigma }_{{rm{diff}}}Delta d(x,t).$$

This is done by updating the computed DOC biomass *d*_{j,l}(*t*_{i+1}) in the microsite of the *j*th column and the *l*th line and of the lattice by replacing it with

$${d}_{j,l}({t}_{i+1})+frac{{sigma }_{{rm{diff}}}cdot tau }{{(Vk)}^{2/3}} Big({d}_{j+1,l}({t}_{i+1})+{d}_{j-1,l}({t}_{i+1})+{d}_{j,l+1}({t}_{i+1})\ ,,, +,{d}_{j,l-1}({t}_{i+1})-4* {d}_{j,l}({t}_{i+1})Big).$$

Finally we test for a cell dying at time *t*_{i+1}:

If

*t*_{i+1}−*t*_{i}=*T*, then a cell dies at time*t*_{i+1}. It is chosen uniformly at random among all alive cells and it is removed from the population. At the same time, an amount of (1 −*l*)*p**ω*_{M}is added to variable*d*and an amount of (1 −*l*)(1 −*p*)*ω*_{M}is added to variable*c*in the corresponding microsite.If

*t*_{i+1}−*t*_{i}=*τ*_{diff}(i.e.*T*>*τ*_{diff}), no cell dies.

All steps are then repeated until a set time is reached or all cells died.

### Parameter values

We used default parameter values from previous modeling literature. Most existing models of microbial decomposition are deterministic^{28,29}; therefore, to compare them and their parameters tothe hybrid model, we established a fully deterministic version of our model, and then reversed the successive renormalizations to obtain the hybrid model’s parameters as functions of the parameters of the fully deterministic model. The derivation of the fully deterministic model, called cdmz, from the stochastic CDMZ model is presented in Supplementary Information 3.2 and takes the form of a system of differential equations, see equations (S.5). Mapping these equations to similar models in Allison^{53}, German et al.^{57}, Hagerty et al.^{58}, Schimel and Weintraub^{15} provided us with default parameter values listed in Table 1. In addition, the structural and energetic costs (*α*s and *ρ*s) are calculated from the masses (*ω*s) and production fractions (*γ*s) of the variables (see Supplementary Equation (S.3)).

The decomposition rate *θ* is calculated as (frac{{V}_{mathop{max }_{{rm{decomposition}}}}}{{K}_{{m}_{{rm{decomposition}}}}}) from Allison et al.’s^{53} model. We ignore the input of DOC to focus on the internal mechanism of DOC production driven by microbial enzyme decomposition. Additionally, a sensitivity analysis (see below, and Supplementary Table 1) of the deterministic approximation cdmz model (Supplementary Information 3.3) shows that *I*_{D} has very little influence on the state variables, including the SOC stock, *c*. SOC and DOC are assumed to leach out of the system at the same rate. Dead microbes and deactivated enzymes are recycled half into SOC and the other half into DOC (*p* = 0.5). The values of *p*_{mut} and *σ*_{mut} reflect the assumption that mutations are rare and small^{59}.

For the change of unit from biomass to individual entities (*ω*s), we used *Bacillus subtilis* or *B. clausii*, cellulase, cellulose and glucose as our baseline. We estimated the mass of 1 DOC molecule with the mass of 1 molecule of glucose. We estimated the mass of 1 SOC molecule from the approximation that 1 molecule of cellulose contains about 10^{3} molecules of glucose. We estimated the mass of 1 enzyme molecule by assuming that 1 molecule of cellulase contains about as much carbon as 1 molecule of cellulose. Finally, we estimated the mass of 1 cell based on the results from biomass estimations of soil samples (with various methods, such as CFI, CFE, SIR) that there are about 4 × 10^{8} active cells in 1 cm^{3} of bulk soil, which weigh 0.1 mg in carbon^{60}.

### Eco-evolutionary analysis

We perform extensive simulations of the resulting eco-evolutionary dynamics (joint dynamics of the ecological variables and trait distribution) at the microscale of single sites.

At the lattice scale, we analyze the evolution of exoenzyme production by using the spatial version of the hybrid model. To circumvent the issue of prohibitive computation time, we predict the long-term stationary state of eco-evolutionary dynamics by using an adaptive dynamics approach^{26,38}. To this end, rather than simulating the cell population with a continuous flow of new mutants, we parallelize the simulations of an ensemble of pairwise contests between only two slightly different strains at a time, one strain taken as resident (initially at stationary state) and the other as the initially rare, mutant strain. The direction and strength of selection on the evolving trait is then predicted by invasion fitness (intrinsic population growth rate from low density) of the mutant strain competing against the resident strain^{26,27}. Based on the mathematical theory of adaptive dynamics in finite populations^{42}, a proxy for invasion fitness is given by the product of the mutant probability of survival with the long-term population growth rate of surviving mutant populations. The rationale is that deleterious mutants may experience positive growth due to genetic drift, but their probability of survival is low; in contrast, advantageous mutants that differ only slightly from the resident strain tend to grow slowly, but their survival probability is high. An evolutionarily stable strategy (ESS) can then be estimated from pairwise contest simulations, as a trait value that no nearby mutant can invade, i.e. for which the estimated invasion fitness proxy of nearby mutants is close to zero. When invasion fitness indicates that for any given resident strain, mutants closer to the ESS outcompete the resident, then the ESS is attractive and the system’s eco-evolutionary dynamics converge to the ESS. In this case, the (attractive) ESS predicts the outcome of the full eco-evolutionary dynamics under the assumption of mutations of small effects^{37}.

### Simulation of resident-mutant interaction in the spatial model

We chose a standard lattice size of 10 × 10 microsites (in their related but non-evolutionary models, Allison^{17} and Kaiser et al.^{19,20} used lattices of 100 × 100 microsites) to avoid the prohibitive computation time of slow population spread across larger lattices. The observed patterns of spread conform to a relatively smooth invasion front, which makes us confident that the invasion capacity of a mutant phenotype is adequately captured by the dynamics following its introduction in a 10 × 10 lattice. The probability of invasion of a mutant might decrease with increasing lattice size if a larger lattice tended to increase the risk of stochastic extinction by drift; however, (1) extinction by drift is actually more likely to occur in the initial phase of invasion, when the spread is still limited to a small part of the lattice, and (2) even if this were the case, the consequence may be a flatter fitness landscape around the ESS, but the direction of selection towards the ESS should not be affected.

We start each simulation with a monomorphic population (all cells have the same *φ* value). All four variables *c*, *d*, *M*, *z* are initialized at the steady state values predicted by the deterministic model (Supplementary Eq. (S.5)) with the corresponding values of *k* and *φ*. Mutants are initially located at the center of the lattice (changing the initial location does not modify the final fraction of mutants in the lattice). On the edges of the lattice we impose Neumann boundary conditions whereby there is no disappearance of dispersing cells or DOC diffusing out of the lattice. Diffusion is then a transfer of *c*, *d*, *z* between the three (instead of four) neighboring microsites (or two for corner microsites) and the center microsite. Likewise, dispersing cells may move into one of three microsites (or two for corner microsites) instead of four. Looking for edge effects, we carefully monitor the potential accumulation of substrates or individuals at the boundary or in corners. No substantial such effect manifested over the time span over which mutant invasion develops.

To optimize simulation time, we assume that mutants occur initially at 5% frequency in the introduction microsite. We run simulations for (resident, mutant) pairs with ±0.05 difference in trait value *φ*. From the final frequency of mutants we compute the mutant exponential growth rate, and average over 20 simulation replicates. All simulations are different due to demographic stochasticity. The total simulation time of all parallelized runs was 10^{7} h (about 1000 years).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.