Appendix A.1. Graphical model
Suppose we grow a bacterial population inside a flask, where diet and space requirements are fulfilled for up to N cells and no cell dies. Taking N as the carrying capacity and considering a constant growth rate r associated with each individual in the bacterial population, we have a pure birth process where the population size reaches the carrying capacity in units of time. Here, as in many biological contexts, time is measured in generations; i.e., is such that . Additionally, let’s suppose is the mutation rate to mutator phenotypes and is the mutation rate of mutator cells to the marker phenotype.
With the above assumptions, we have a pure birth process called Yule process [49], and the expected number of mutations to the mutator phenotype occurring from the beginning of the experiment until time
is
Moreover, if
M is a random variable modeling the number of mutations occurring in the time interval
,
M follows a Poisson distribution [50], so
Our framework leads to the following simplification, the bacterial population grows exponentially; this is, given then , with , denotes the population size at time t.
Let
be a function such that
Hence, denotes the population size at time t compared to the final population, as a fraction.
Let be a random vector, where and then C represents a mutant cell that appeared at time T and belongs to the fraction F of the final population. By considering a random sample of size M from C, , and scattering them within , we obtain feasible and not feasible mutants since, regarding f as a probability density function, all feasible mutations lie under the graph of f and disregard them otherwise.
Figure A1 shows an example of the method above described.
Figure A1.
Graphical model for selecting feasible mutations. mutations were generated as random points on . The parameter for final population size was , for wildtype mutation rate was , and for the final time was . The graph of (black curve) played a discriminant role since all points below the graph of f (red points) represent actual mutations while the rest of the points (blue points) were disregarded.
Figure A1.
Graphical model for selecting feasible mutations. mutations were generated as random points on . The parameter for final population size was , for wildtype mutation rate was , and for the final time was . The graph of (black curve) played a discriminant role since all points below the graph of f (red points) represent actual mutations while the rest of the points (blue points) were disregarded.
Let be the number of actual mutations from wildtype to mutator cells. Without loss of generality, let’s suppose is the mutant cell associated with the i-th mutation, . Let’s note that each mutant cell , appearing at time , has units of time to give rise to its clonal linage; i.e. a subpopulation of size . Now, in a Yule process the clone population size that starts with individuals, each one with growth rate , has a negative binomial () distribution with parameters j and ([49],[pp. 377–378; [51], [Eq. 3.15; [52]).
Therefore,
is a random variable such that
since
denotes the number of cells in the linage of cell
, excluding the initial cell
. This is due to the support of the negative binomial distribution we are computationally working with [53, func.
rnbinom],
. When the support of this distribution is
, as in ([49], pp. 377–378; [51], Eq.
3.15; [52]), Equation (
A4) should be rewritten as
Furthermore, in both cases, the total number of mutators in the population at time
is
To incorporate mutant cells with mutator phenotypes, let’s suppose they appear as the result of mutations within the mutator cells subpopulation. Then, mutant cells with mutator phenotypes can appear only along the growth cycle of each mutant cell , .
Let be a fixed index. We know there are exactly mutator cells that have been raised by mutator cell , and if any mutant cell with a mutator phenotype is appearing will be in the time interval . Setting as the carrying capacity, mutant cells with mutator phenotypes have units of time to appear, which happens with frequency .
Consequently, under a similar reasoning for Equation (
A2), we have that the number of mutations leading to mutant cells with mutator phenotypes is a random variable
such that
Let , with and , a random sample of size . Hence, each , is a random point on , representing a potential mutant cell with a mutator phenotype.
Analogous to Equation (
A3), function
defined by
with
, plays the discriminant role for the
mutants cells with mutator phenotypes. Where, as before, all points except those below the graph of
were disregarded since they represent actual mutant cells with mutator phenotypes.
Let
be the number of actual mutant cells with mutator phenotypes that appeared along the growth cycle of mutator cell
. Without loss of generality, let’s say
represent mutant cells with mutator phenotypes then, for each
, the mutant cell with a mutator phenotypes
has
units of time to give rise to its offspring of size
, which, according to Equation (
A4),
is a random variable such that
for every
.
By repeating this process for each
we have that the total number of mutators in the population at time
is
Furthermore, we can study the distribution of the numbers of mutator cells and mutant cells with mutator phenotypes, as shown in
Figure A2.
For a pseudocode description of the graphical model, see Algorithm 4.
Figure A2.
One thousand simulations of graphical method. (a) Histogram of the number of mutants on log scale and (b) histogram of the number of mutators on log scale. Given the parameters in
Figure A1 and a mutation rate to mutator phenotype
we simulated 1000 cultures in 354 seconds (printing plots) and in 5 seconds (printing no plots) described method. The distributions of the numbers of mutants and mutators were heavy-tailed, as expected.
Figure A2.
One thousand simulations of graphical method. (a) Histogram of the number of mutants on log scale and (b) histogram of the number of mutators on log scale. Given the parameters in
Figure A1 and a mutation rate to mutator phenotype
we simulated 1000 cultures in 354 seconds (printing plots) and in 5 seconds (printing no plots) described method. The distributions of the numbers of mutants and mutators were heavy-tailed, as expected.
Appendix A.2. Quantile function model
Let’s recall that given a function
we say it is invertible if there exists a function
(called the inverse function of
g) such that
Definition A1 (Quantile function).
Let be a cumulative distribution function. The quantile function of F is defined by
.
Quantile functions generalize the concept of the inverse of a function g when g is a cumulative distribution function. Now, quantile functions have a remarkable application, often called the inversion method, which is used to simulate random variables [Proposition 2 [54], a tool used in the following method.
Under assumptions and using the notation of the graphical model, given
,
denotes the population size at time
t. Then
denotes the expected number of mutant cells at time
t, which, as a function, can be regarded as a probability density function. Furthermore, let
then
given by
is a probability density function, where its cumulative distribution function is
defined by
Hence its quantile function is
given by
So if
, then
is such that
, by the inversion method.
Applying the inversion method as above, only actual mutations leading to mutator phenotypes and their respective appearing time are provided (see
Figure A3). Thus, instead of asking for the expected number of mutations, as in Eq. (
A1), it is sufficient to ask for the expected number of mutators in the population, which is
Therefore, Equation (
A2) is rewritten as
Then we can get the appearing time of
M mutator cells by generating a random sample from
V of size
M,
. Note that with this framework there is no need to disregard any point. Therefore, the number of actual mutations
coincides with
M, which implies Equation (
A4) is rewritten as
and then the total number of mutators in the population at time
is
To incorporate mutant cells with mutator phenotypes we proceed as in Eq. (
A15). For each
, the expected number of mutants within the
i-th mutator’s offspring of size
is
In consequence, the random variable,
, counting the number of mutations leading to mutant cells with the
i-th mutator phenotype is such that
Now, given
a fixed index, let
then
defined by
is the probability density function whose quantile function
given by
Thus if
, then
is such that
, where
.
Let
a random sample from
, then
denotes the appearing time of the
j-th mutant cell with the
i-th mutator phenotype. Therefore, analogous to Eq. (
A17), the size of their offspring is a random variable
such that
Implying that the number of mutant cells with mutator phenotypes is
For a pseudocode description of the quantile function method, refer to Algorithm 5.
Figure A3.
Comparison between simulated data and actual probability density function. Parameters were the same as in
Figure A1. Data for the histogram of the appearing time of mutant cells were simulated by the inversion method and Equation (
A14). The graph of
(black curve) was included for reference since simulated data has distribution
h.
Figure A3.
Comparison between simulated data and actual probability density function. Parameters were the same as in
Figure A1. Data for the histogram of the appearing time of mutant cells were simulated by the inversion method and Equation (
A14). The graph of
(black curve) was included for reference since simulated data has distribution
h.
Figure A4.
(a) Histogram of the number of mutants on log scale and (b) histogram of the number of mutators on log scale. The same parameters of
Figure A2 were used to simulate 1000 cultures in two seconds using the quantile function method.
Figure A4.
(a) Histogram of the number of mutants on log scale and (b) histogram of the number of mutators on log scale. The same parameters of
Figure A2 were used to simulate 1000 cultures in two seconds using the quantile function method.
Appendix A.3. Description of the Moran model
We started the simulation with a single individual referred to as a normal cell (neither mutator nor antibiotic resistance). In order to simulate a pure birth process, a single individual is chosen at random from the population to reproduce. This individual divides into two equal daughter cells, and the mother cell is removed from the population.
At each birth event, each normal daughter cell may mutate to the antibiotic resistant phenotype at mutation rate , or to the mutator phenotype at rate . Individuals with the mutator phenotype have a mutation rate that is k times higher than the normal mutation rate. When either mutation happens, all subsequent offspring inherit the new phenotype.
We note that the mutator phenotype does not have antibiotic resistance directly, but its offspring can develop antibiotic resistance in the future. In particular, daughters of a mutator cell develop antibiotic resistance with mutation rate , while daughters of an antibiotic resistant cell become mutators at rate . Consequently, the population consists of four types of individuals: normal cells, antibiotic resistant cells, mutator cells, and individuals exhibiting both antibiotic resistance and the mutator phenotype.
In summary, the simulation proceeds by choosing an individual at random to reproduce, removing that individual, and determining which types of daughters are added in its place, as described above. This process is repeated until the population size reaches the desired final size (e.g.
). We can also simulate repeated growth phases and population bottlenecks by randomly sampling the final population to achieve a desired initial population size (e.g.
, and repeating the procedure above. See Algorithm 1, 2, and 3 for a detailed pseudocode description.
Algorithm 1: Moran model definitions |
Input:
: Mutation rate from wildtype to antibiotic resistance.
: Mutation rate from wildtype to mutator.
: Mutation rate from mutator to antibiotic resistance.
: Final population size.
Ntotal: Current population size, total of all types.
NAMB[0], NAMB[1], NAMB[2], NAMB[3] : Numbers of each of four types of individuals including normal cells, antibiotic-resistant cells, mutator cells, and individuals including both antibiotic-resistant and mutator phenotype respectively.
nreps=1000: Number of replicate lines.
cumsum: Cumulative frequency of each type in the population, set cumsum[3]=1.
pinit = [0,20,20,20]: (see below)
Ninit: initial population size for each growth phase, .
bottle: Bottleneck sampling fraction.
r, , : Random numbers.
ngrowths = 4: number of growth phases.
|
Algorithm 2: Moran model pseudocode |
|
Algorithm 3: Moran model pseudocode, continued. |
|
Algorithm 4: Graphical model pseudocode. |
|
Algorithm 5: Quantile function model pseudocode. |
|