### Outline

In this section, we develop and analyse a mathematical model of infectious disease transmission that allows us to investigate the conditions under which escape mutants of a pathogen can invade a host population that is under genetic selection for resistance to infection. The starting point of our analysis is a local, closed, host population (e.g., a herd of cattle without import of animals) that is endemically infected with a pathogen (the ‘wild-type’ pathogen). Hosts are then genetically selected for a single locus that confers some level of resistance to infection. Here, resistance merely implies that hosts are less likely to get infected with the wild-type pathogen, not necessarily that they cannot get infected at all. The aim of the selection is to reduce the prevalence of the infectious disease and, ultimately, to eradicate it from the local population.

Next, we assume that mutants of the pathogen that can to some degree escape host resistance can arise continuously as long as the wild-type pathogen is present in the host population. In other words, we assume that the escape pathogen is a mutant of the wild-type strain, so that it can arise only when the wild-type pathogen is (still) present. Our main interest is then to determine whether these escape mutants can invade the endemically infected host population and how the possibility to invade depends on characteristics of the host and of the pathogen, and on the degree of resistance against escape mutants provided by infection with the wild-type pathogen (cross resistance).

We start this section by describing the general epidemiological model that we use as the basis for our study. Next, we introduce genetic variation in the host population to allow for genetic selection for resistance in the host population. Here, we assume that host resistance is determined by a single bi-allelic locus, where resistance is either fully dominant or fully recessive. This model results in two host types: resistant hosts (\(R\)) and non-resistant hosts (\(N\)). Then we derive expressions for the prevalence of the pathogen in resistant and non-resistant hosts, and for the frequency of resistant hosts needed to eradicate the wild-type pathogen. Next, we further expand the model by allowing pathogen mutants to escape host resistance. Using this model, we assess how the possibility of the mutant to invade the host population depends on the frequency of resistant hosts, the level of resistance provided by the resistance gene, the fitness benefit of the escape mutant in resistant hosts, and the costs of the escape mutation for infection of non-resistant hosts. Finally, we incorporate infection of hosts with both types of pathogen at the same time into the model, to assess the effect of the degree of cross resistance on the possibility of the mutant to invade. To enhance comprehensibility of our complex model, we combined the methods and results sections in a fairly unusual way. The description of each step in the model, as outlined above, is followed by a short section with the results relevant to that step. Table 1 shows a notation key.

### SIS-model

Because the purpose of this study was to investigate the basic conditions under which pathogens can escape genetic resistance of hosts, without getting lost in mathematical details of specific cases, we decided to use the relatively simple, but very well-established Susceptible-Infectious-Susceptible (SIS) epidemiological model. This model provides a realistic representation of the transmission of several endemic infectious diseases [23], also in livestock populations.

In the SIS-model, host individuals can be in the susceptible (\(\mathrm{S}\)) or the infected (\(\mathrm{I}\)) state (Fig. 1). Infected individuals are also infectious, meaning that they can infect susceptible individuals. In the context of the SIS-model, the term “susceptible” merely means that the individual is not infected, so that it can become infected. It does not indicate any degree of susceptibility or resistance of the individual. Individuals ‘move’ between the two states: susceptible individuals can get infected following contact with an infected individual, while infected individuals can recover from infection and become susceptible again. The transition of individuals between these two states occurs at certain rates. The number of individuals that get infected per unit of time, and thus change from S to I, is known as the transmission rate. The number of individuals per unit of time that recover from infection, and thus change from I to S, is the recovery rate.

The transmission rate depends on the transmission rate parameter (\(\beta\)), on the number of susceptible individuals (\(S\); we will use italics (\(S\) and \(I\)) to indicate the number of individuals in the corresponding state, and regular font (\(\mathrm{S}\) and \(\mathrm{I}\)) to indicate the state of an individual), and on the fraction of the contact individuals that are infected (\(\frac{I}{N}\), with \(N\) representing the total size of the local population) [24]. Thus,

$$Rate\left[\mathrm{S}\to \mathrm{I}\right]=\beta S\frac{I}{N}.$$

(1)

The transmission rate parameter \(\beta\) is the average number of susceptible individuals that become infected per unit of time by one infected individual in an otherwise fully susceptible population (i.e. \(S=N\)), and reflects the transmissibility of the infection.

The recovery rate depends on the recovery rate parameter (\(\alpha\)) and on the number of infected individuals (\(I\)), i.e.:

$$Rate\left[\mathrm{I}\to \mathrm{S}\right]=\alpha I.$$

(2)

The recovery rate parameter also determines the average duration of the infectious period, which is equal to \(\frac{1}{\alpha }\).

A key parameter in epidemiology is the basic reproduction ratio (\({R}_{0}\)), which is defined as the average number of secondary infections caused by a single typical infected individual in an otherwise fully susceptible population [25]. \({R}_{0}\) has a threshold function, i.e. when it is greater than 1, an infection may persist in the population, and a considerable fraction of individuals may become infected. When \({R}_{0}\) is less than 1, an infection is guaranteed to die out. Thus, to eradicate an infectious disease, it is essential that interventions reduce \({R}_{0}\) to less than 1.

For the SIS-model, a simple expression for \({R}_{0}\) can be derived using the expressions for the transmission and recovery rate (e.g. [23]). Given that \(\beta\) represents the transmission rate for one infected individual in a fully susceptible population, and that the single infected individual has an average infectious period of \(\frac{1}{\alpha }\), the number of secondary infections caused by this individual is equal to \(\beta /\alpha\). Thus, \({R}_{0}\) is equal to the product of the transmission rate parameter and the duration of the infectious period, i.e.:

$${R}_{0}= \frac{\beta }{\alpha }.$$

(3)

To simplify the mathematics, we assume a constant value of \(\alpha =1\), such that \({R}_{0}=\beta\). Note that we can use \(\alpha =1\) without loss of generality because one can always choose a time unit such that \(\alpha\) is equal to 1 and, therefore, \(\beta\) equals \({R}_{0}\). If \({R}_{0}\) is higher than 1, the SIS-model tends to a situation in which the number of newly infected individuals is equal to the number of recovering individuals, the so-called endemic equilibrium. In that situation, the average numbers of infected and susceptible individuals are stable. The fraction of infected individuals at this equilibrium, i.e. \({I}^{*}/N\), is the endemic prevalence of the infectious disease.

At equilibrium, the transmission rate is equal to the recovery rate:

$$\beta {S}^{*}\frac{{I}^{*}}{N}=\alpha {I}^{*}\underset{{R}_{0}= \frac{\beta }{\alpha }}{\iff } {R}_{0}{S}^{*}\frac{{I}^{*}}{N}={I}^{*}.$$

(4)

Solving for \({I}^{*}/N\), using \({S}^{*}=N-{I}^{*}\) shows that the endemic prevalence (\({P}^{*}={I}^{*}/N\)) is determined by \({R}_{0}\):

$${P}^{*}=1-\frac{1}{{R}_{0}}.$$

(5)
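The step from Eq. (4) to Eq. (5) can be checked numerically. Dividing Eqs. (1) and (2) by \(N\) and setting \(\alpha =1\) gives \(dP/dt={R}_{0}(1-P)P-P\) for the prevalence \(P=I/N\). The following is a minimal sketch, not the implementation used for this study: the function names, the forward-Euler scheme, the step size, and the initial prevalence are illustrative assumptions.

```python
def sis_endemic_prevalence(r0):
    """Endemic prevalence from Eq. (5); zero when the infection cannot persist."""
    return max(0.0, 1.0 - 1.0 / r0)


def simulate_sis(r0, p0=0.01, dt=0.01, t_end=200.0):
    """Forward-Euler integration of dP/dt = R0 * (1 - P) * P - P (alpha = 1).

    p0, dt and t_end are arbitrary illustrative choices.
    """
    p = p0
    for _ in range(int(t_end / dt)):
        p += dt * (r0 * (1.0 - p) * p - p)
    return p


if __name__ == "__main__":
    # Compare the simulated long-run prevalence with Eq. (5).
    for r0 in (0.8, 1.5, 3.0):
        print(r0, sis_endemic_prevalence(r0), simulate_sis(r0))
```

For \({R}_{0}>1\) the simulated prevalence should settle at the value given by Eq. (5), and for \({R}_{0}<1\) it should decay towards zero.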

Having defined the fundamental epidemiological model and obtained expressions for \({R}_{0}\) and the endemic prevalence, we will expand the model in the next section by including genetic differences in host resistance.

### Heterogeneous SIS-model with host resistance to wild-type infection

In this section, we present a model for genetic variation in host resistance for a population that is exposed to the wild-type pathogen and present the results of this model. While the term “resistance” might suggest that resistant individuals cannot get infected at all, in practice resistance is rarely all-or-none. Hence, in the following, resistant merely means being less likely to get infected.

When properties of an individual or a pathogen affect the transmission of an infectious disease, they essentially alter the values of the underlying parameters, i.e., \(\beta\) and/or \(\alpha\), and thus \({R}_{0}\), since these parameters fully encompass the characteristics of the infection in the SIS-model (for examples, see also [26,27,28]). To include an effect of host resistance on transmission, we decrease \({R}_{0}\) by decreasing \(\beta\) (note that it does not matter for the results whether a decrease in \({R}_{0}\) originates from a reduction in \(\beta\) or an increase in \(\alpha\); see “Discussion”). Furthermore, we assume that, once infected, both non-resistant and resistant hosts are equally likely to infect susceptible individuals. This means that there is no difference in the “infectivity per unit of time” between resistant and non-resistant hosts.

Now that we have defined how host resistance acts on transmission of the infection, the next step is to define the genetic model for host resistance. Given that derivations for transmission models with many levels of host resistance rapidly become very complex, we will use one of the simplest genetic models for diploid individuals. We assume that resistance is defined by a single locus, which is either fully dominant or fully recessive, such that the host population consists of two types: resistant (with frequency \({f}_{R}\)) and non-resistant (\(1-{f}_{R}\)). Thus, in this model, we have two types of hosts, resistant (\(R\)) and non-resistant (\(N\)), and one type of pathogen, the wild-type pathogen (\(W\)). Using this genetic model, we can set up an SIS-model with two types of hosts, resistant hosts (subscript \(R\)) and non-resistant hosts (subscript \(N\); this is the blue subsystem in Fig. 2 and the full model equations are in Appendix 1). Note that \(N\) as a subscript denotes the non-resistant host type, whereas \(N\) on its own remains the total size of the local population. The numbers of susceptible hosts of each type in the local population are denoted by \({S}_{R}\) and \({S}_{N}\). Analogously, the numbers of hosts of each type that are infected with the wild-type pathogen (subscript \(W\)) are denoted by \({I}_{RW}\) and \({I}_{NW}\).

Transmission involves a pair of individuals: an infected donor individual, and a susceptible recipient individual. Here, we assume that the pair-wise transmission rate parameter depends on the resistance genotype of the recipient but not on the genotype of the donor. This is equivalent to the absence of genetic variation in infectivity. Since we use \(\alpha =1\), the transmission rate parameters are directly equal to the basic reproduction ratios (Eq. 3). In a contact between an infected and a susceptible (i.e., non-infected) host, these basic reproduction ratios (\({R}_{NW}\) and \({R}_{RW}\)) are defined for the recipient individual, because we model variation in resistance. The subscript (\(N\) or \(R\)) thus reflects the genotype of the susceptible host (\(W\) refers to the wild-type pathogen).

Just like the homogeneous SIS-model, the heterogeneous SIS-model also tends to an equilibrium. However, because of the difference in resistance between individuals, the endemic prevalence in this equilibrium is no longer equal to Eq. (5) because non-resistant individuals are more likely to become infected than resistant individuals. Consequently, at equilibrium, the susceptible individuals are predominantly of the resistant type, which are less likely to become infected. As a result, the overall endemic prevalence at equilibrium is a bit lower than in the model without heterogeneity [26, 27] and is equal to:

$${P}_{W}^{*}=\frac{{I}_{NW}^{*}+{I}_{RW}^{*}}{N},$$

(6)

and is reached when both host types have reached their equilibrium [4], i.e.:

$${R}_{NW}{S}_{N}^{*}\frac{{I}_{W}^{*}}{N}={I}_{NW}^{*}\, \mathrm{and} \,{R}_{RW}{S}_{R}^{*}\frac{{I}_{W}^{*}}{N}={I}_{RW}^{*}.$$

(7)

Solving Eqs. (6) and (7), with \({I}_{W}^{*}={I}_{NW}^{*}+{I}_{RW}^{*}\) denoting the total number of hosts infected with the wild-type pathogen, and using \({S}_{N}^{*}=\left(1-{f}_{R}\right)N-{I}_{NW}^{*}\) and \({S}_{R}^{*}={f}_{R}N-{I}_{RW}^{*}\), results in equilibrium solutions for the endemic prevalence in both host types and for the overall endemic prevalence in the population. The resulting equations are complex and are given in Appendix 2, together with detailed derivations (Eqs. 20 to 26). We will use figures to illustrate the results.
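The equilibrium can also be found numerically without the closed-form expressions. The sketch below is one possible approach and not necessarily the derivation of Appendix 2. Dividing Eq. (7) by \(N\) gives \({P}_{NW}^{*}={R}_{NW}(1-{f}_{R})x/(1+{R}_{NW}x)\) and \({P}_{RW}^{*}={R}_{RW}{f}_{R}x/(1+{R}_{RW}x)\), with \(x={P}_{W}^{*}\), and the condition \(x={P}_{NW}^{*}+{P}_{RW}^{*}\) is solved by bisection. The function name and the tolerance are illustrative assumptions.

```python
def wild_type_equilibrium(f_r, r_nw, r_rw, tol=1e-12):
    """Return (P_NW*, P_RW*, P_W*), each as a fraction of the total population.

    Assumes alpha = 1, so transmission parameters equal reproduction ratios.
    """
    def excess(x):
        # Equals zero at the endemic equilibrium and decreases with x.
        return (r_nw * (1.0 - f_r) / (1.0 + r_nw * x)
                + r_rw * f_r / (1.0 + r_rw * x) - 1.0)

    # excess(0) is the overall R0 of the mixed population minus 1.
    if excess(0.0) <= 0.0:
        return 0.0, 0.0, 0.0  # the infection cannot persist

    lo, hi = 0.0, 1.0  # excess(lo) > 0 and excess(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)

    p_nw = r_nw * (1.0 - f_r) * x / (1.0 + r_nw * x)
    p_rw = r_rw * f_r * x / (1.0 + r_rw * x)
    return p_nw, p_rw, x
```

With \({f}_{R}=0\) the function reduces to the homogeneous result of Eq. (5), which provides a simple consistency check.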

### Results for the heterogeneous SIS-model with host resistance to wild-type infection

We consider the situation where the wild-type pathogen is endemic in a non-resistant host population, while the infection is absent in a population where all individuals have the resistant genotype due to herd immunity (\({R}_{0}<1\)). This represents the most beneficial situation for genetic selection for host resistance, because it will lead to eradication of the infection (in this section, escape mutants are ignored). This situation corresponds to \({R}_{NW}>1\), so that \({R}_{0}>1\) when \({f}_{R}=0\), i.e. the infection is endemic in a non-resistant host population, and \({R}_{RW}<1\), so that \({R}_{0}<1\) when \({f}_{R}=1\), i.e. the infection is absent in a resistant host population. Note that \({R}_{RW}<1\) results in the infection being absent because of herd immunity, but this does not imply that resistant individuals cannot get infected at all (see also results below). We will also show a situation where \({R}_{RW}>1\), to illustrate what will happen when the resistance is not sufficient to fully eradicate the infection.

In a population with both resistant and non-resistant individuals, the prevalence and whether the infection is present or not, depend on the frequency of resistant hosts (\({f}_{R}\)). Figure 3 shows the endemic prevalence of the infection (\({P}_{W}^{*}\)) in a population that consists of a mix of resistant and non-resistant hosts, as a function of \({f}_{R}\), and for three values of the basic reproduction ratio for resistant hosts (\({R}_{RW}\)). The basic reproduction ratio for non-resistant hosts (\({R}_{NW}\)) was set to 1.5. Figure 3 clearly shows that the overall prevalence decreases with increasing frequency of resistant hosts, at a rate that depends on the value of \({R}_{RW}\), i.e. prevalence decreases faster when \({R}_{RW}\) is lower. At a low frequency of resistant hosts, virtually all infected hosts are from the non-resistant type (\({P}_{NW}^{*}\); dashed line in Fig. 3), which makes sense given the low frequency of resistant hosts and their lower susceptibility. Note that prevalence is expressed relative to the total population size and not to the number of individuals of a given type, i.e., as \({I}_{RW}^{*}/N\) rather than \({I}_{RW}^{*}/({f}_{R}N)\). At a higher frequency of resistant hosts (e.g., \({f}_{R}=0.25\)), a larger fraction of the infections occurs in resistant hosts (\({P}_{RW}^{*}\); dashed-dotted line in Fig. 3). This happens even when the reproduction ratio of resistant hosts is lower than 1, because the overall \({R}_{0}\) is higher than 1, leading to maintenance of the infection in the population. For \({R}_{RW}\) of 0.1 and 0.8, the overall prevalence decreases to 0 at a certain \({f}_{R}\). This is the frequency of resistant hosts above which the infection is expected to die out, because the greater resistance of the population reduces the overall reproduction ratio of the infection to less than 1 (herd immunity).
For \({R}_{RW}\) of 1.1, this point does not exist, because the basic reproduction ratio in a fully resistant population is still above the threshold of 1, implying that the infection will persist in the population.

For \({R}_{RW}<1\), the \({f}_{R}\) at which the infection dies out can be found by realising that the overall \({R}_{0}\) should be higher than 1 for the infection to persist in the population. For our model, the overall basic reproduction ratio is the average of the type-specific reproduction ratios, weighted by the frequencies of both types [28], i.e.:

$${R}_{0}=\left(1-{f}_{R}\right){R}_{NW}+{f}_{R}{R}_{RW}.$$

(8)

Solving \({R}_{0}=1\) for \({f}_{R}\) results in the upper limit of \({f}_{R}\), below which the infectious disease can persist in the population, i.e.:

$${{f}_{R}}_{max}=\frac{{R}_{NW}-1}{{R}_{NW}-{R}_{RW}}.$$

(9)

Using the values for \({R}_{NW}\) and \({R}_{RW}\) from Fig. 3, we find solutions for \({{f}_{R}}_{max}\) of \(0.5/1.4=0.36\) for \({R}_{RW}=0.1\) and \(0.5/0.7=0.71\) for \({R}_{RW}=0.8\). These results represent the minimum frequency of hosts with the resistant genotype in a population that is required to eradicate an infection, which is relevant for selection.
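The arithmetic of Eqs. (8) and (9) can be reproduced with a few lines of code. This is a minimal sketch with illustrative function names; returning `None` when no eradication threshold exists is a convention chosen for this example.

```python
def overall_r0(f_r, r_nw, r_rw):
    """Eq. (8): frequency-weighted average of the type-specific ratios."""
    return (1.0 - f_r) * r_nw + f_r * r_rw


def max_resistant_frequency(r_nw, r_rw):
    """Eq. (9): frequency of resistant hosts at which the overall R0 equals 1.

    Assumes R_NW > 1. Returns None when R_RW >= 1, because the wild-type
    pathogen then persists even in a fully resistant population.
    """
    if r_nw <= 1.0:
        raise ValueError("R_NW must exceed 1 for the infection to be endemic")
    if r_rw >= 1.0:
        return None
    return (r_nw - 1.0) / (r_nw - r_rw)


if __name__ == "__main__":
    # The three R_RW values used in Fig. 3, with R_NW = 1.5.
    for r_rw in (0.1, 0.8, 1.1):
        print(r_rw, max_resistant_frequency(1.5, r_rw))
```

Substituting the returned frequency back into `overall_r0` should give 1, which confirms that the threshold is the point where the overall reproduction ratio crosses 1.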

In a closed population, \({{f}_{R}}_{max}\) also sets an upper bound to the invasion possibility of escape mutants, because the infection with the wild-type pathogen dies out when \({f}_{R}>{{f}_{R}}_{max}\). When the wild-type infection has died out, mutants cannot develop anymore in a closed local population, simply because there are no wild-type pathogens to mutate from. The next section provides further details on the important role of \({{f}_{R}}_{max}\) when the model is expanded by allowing for the development of pathogen mutants that can escape host resistance.

### Heterogeneous SIS-model with introduction of an escape mutant

Because resistance in our model is not absolute, pathogen escape means that the mutant pathogen is better able to infect resistant hosts than the wild-type pathogen. We will incorporate one escape mutant into the model, such that there are two pathogen types, a wild-type pathogen (\(W\)) and an escape mutant (\(E\)), each of which can infect both host types, although the escape mutant infects a resistant host more easily than the wild-type pathogen does. This does not mean that we consider only one type of mutant; it merely means that we look at the competition between the wild-type pathogen and one mutant at a time. Furthermore, we do not consider the probability that an escape mutation occurs but focus on the case where escape mutants are present, because a mutation will occur sooner or later, as long as the wild-type pathogen is present in the population.

In this section, we assume that infection with one of the two pathogen types offers full cross-resistance against infection with the other type, such that hosts can be infected with only one pathogen type at a time (this assumption will be relaxed in the next section). It means that we consider two additional types of infected individuals, i.e., \(N\) and \(R\) host types that are infected with the escape (\(E\)) mutant, such that we have four types of infected individuals in total (the orange subsystem in Fig. 2; note that we still have two types of susceptible individuals). We will model transmissibility of escape mutants by including two additional reproduction ratios in our model, such that transmission between a pair of individuals no longer depends only on the type of susceptible host (\(N\) or \(R\)), but also on the type of pathogen that is carried by the infected host (\(W\) or \(E\)). These two reproduction ratios are for the escape mutant infecting non-resistant or resistant susceptible hosts, denoted by \({R}_{NE}\) and \({R}_{RE}\), respectively. Selection pressure on the pathogen population is then determined by the four reproduction ratios and the frequency of resistant hosts in the population.

Mutants typically arise in a host population that is endemically infected with the wild-type pathogen. Thus, to determine whether an escape mutant can spread in the host population, we need to derive an expression for the reproduction ratio of the escape mutant in a host population in which the wild-type pathogen is at endemic equilibrium. We will call this the invasion reproduction ratio of the escape mutant (\({R}_{INV}\)). When \({R}_{INV}>1\), the escape mutant can invade a host population that is endemically infected with the wild-type pathogen. When \({R}_{INV}<1\), escape mutants might occur, but they cannot spread.

We can derive \({R}_{INV}\) by applying the definition of \({R}_{0}\), but this time for a population where the wild-type pathogen is endemic, instead of for a fully susceptible population. To find \({R}_{INV}\), given full cross-resistance (i.e. hosts that are infected with one of the pathogen types cannot simultaneously get infected with the other type), we need to multiply the basic reproduction ratios of the escape mutant with the fraction of susceptible individuals (i.e., all individuals of a given genotype that are not yet infected with the wild-type pathogen). These fractions are given by \((1-{f}_{R})-{P}_{NW}^{*}\) for non-resistant hosts, and by \({f}_{R}-{P}_{RW}^{*}\) for resistant hosts. Then, the reproduction ratio for invasion of the escape mutant into a host population that is endemically infected with the wild-type pathogen becomes:

$${R}_{INV}={R}_{NE}\left[\left(1-{f}_{R}\right)-{P}_{NW}^{*}\right]+{R}_{RE}\left({f}_{R}-{P}_{RW}^{*}\right).$$

(10)

\({R}_{INV}\) thus depends on \({f}_{R}\) and the four reproduction ratios, two of which are shown explicitly in Eq. (10) (\({R}_{NE}, {R}_{RE}\)), while the other two (\({R}_{NW}\) and \({R}_{RW}\)) are implicit in the expressions for the endemic equilibrium prevalences \({P}_{NW}^{*}\) and \({P}_{RW}^{*}\) (see Eqs. 24 and 25 in Appendix 2).
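Equation (10) can be evaluated as sketched below. The function names are illustrative. To obtain the endemic prevalences, the sketch eliminates \({P}_{NW}^{*}\) and \({P}_{RW}^{*}\) from Eq. (7), which gives a quadratic in the overall prevalence \(x\): \({R}_{NW}{R}_{RW}{x}^{2}+({R}_{NW}+{R}_{RW}-{R}_{NW}{R}_{RW})x+(1-{R}_{0})=0\). This rearrangement was made for the example and may differ in form from the expressions in Appendix 2.

```python
import math


def wild_type_prevalences(f_r, r_nw, r_rw):
    """Return (P_NW*, P_RW*) at the wild-type endemic equilibrium (alpha = 1)."""
    r0 = (1.0 - f_r) * r_nw + f_r * r_rw  # Eq. (8)
    if r0 <= 1.0:
        return 0.0, 0.0  # wild-type pathogen has died out
    a = r_nw * r_rw
    b = r_nw + r_rw - a
    c = 1.0 - r0  # negative here, so exactly one root is positive
    if a == 0.0:
        x = -c / b  # linear case, e.g. R_RW = 0
    else:
        x = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    p_nw = r_nw * (1.0 - f_r) * x / (1.0 + r_nw * x)
    p_rw = r_rw * f_r * x / (1.0 + r_rw * x)
    return p_nw, p_rw


def invasion_ratio_full_cross_resistance(f_r, r_nw, r_rw, r_ne, r_re):
    """Eq. (10): invasion reproduction ratio with full cross-resistance."""
    p_nw, p_rw = wild_type_prevalences(f_r, r_nw, r_rw)
    return r_ne * ((1.0 - f_r) - p_nw) + r_re * (f_r - p_rw)
```

For example, in a fully non-resistant population (\({f}_{R}=0\)) with \({R}_{NW}=1.5\) and \({R}_{NE}=1.1\), two thirds of the hosts are susceptible at equilibrium, so \({R}_{INV}=1.1\times 2/3\approx 0.73\) and the escape mutant cannot invade.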

### Results for the heterogeneous SIS-model with introduction of an escape mutant

Figure 4 shows \({R}_{INV}\) (dashed line, right y-axis) as a function of \({f}_{R}\) for reproduction ratios \({R}_{NW}=1.5\), \({R}_{NE}=1.1,{R}_{RE}=1.5\), and \({R}_{RW}=0.8\) (Fig. 4a) or \(0.1\) (Fig. 4b). The endemic prevalence of the wild-type pathogen (\({P}_{W}^{*}\); solid line; left y-axis) is shown in both panels as a function of \({f}_{R}\), similar to Fig. 3. We chose to simulate \({R}_{RW}<1\), such that the wild-type pathogen will die out in a fully resistant population (see previous section). This is visible in Fig. 4 as the solid line that decreases to \({P}_{W}^{*}=0\). By assumption, the escape mutation comes with a cost in fitness for the pathogen, such that the escape mutant will be outcompeted by the wild-type pathogen in a non-resistant population. However, due to its adaptation to the resistant host, it can persist in a fully resistant population, in contrast to the wild-type pathogen. This is visible in Fig. 4 as the increasing dashed line.

At a certain \({f}_{R}\), the reproduction ratio for invasion of the escape mutant (\({R}_{INV}\); dashed line in Fig. 4) crosses the threshold \({R}_{INV}=1\) (indicated by the dotted line in Fig. 4), above which the escape mutant can invade the population. This is a critical point that represents the lower bound of the invasion window: there is a risk of escape mutants invading the (partly) resistant host population when the frequency of resistant hosts is greater than this lower bound. Together with the upper bound of the wild-type pathogen dying out, as defined in the previous section, the point \({R}_{INV}=1\) determines the critical range of the frequency of resistant hosts in which an escape mutant pathogen can both arise (\({P}_{W}^{*}>0\)) and invade (\({R}_{INV}>1\)). This range of \({f}_{R}\) is indicated as the ‘invasion window’ in Fig. 4.

Comparing panels (a) and (b) of Fig. 4, we see that a stronger effect of the resistance gene of the host (lower \({R}_{RW}\)) not only results in a lower \({f}_{R}\) that is required to eradicate the wild-type pathogen (\(0.36<0.71\)), but also in a narrower invasion window, because of a lower upper bound. A stronger effect of the resistance gene (i.e., lower \({R}_{RW}\)) thus reduces the risk of invasion of escape mutants because it results in a smaller invasion window. A smaller invasion window can be passed more quickly by artificial selection of the host population.

In summary, in this section we have derived an expression for the reproduction ratio for the invasion of an escape mutant in hosts that are endemically infected with the wild-type pathogen, assuming that hosts can be infected with only one pathogen at a time (full cross-resistance). In the next section, we will relax this assumption and expand our model to allow for infection of hosts with both types of pathogen at the same time.

### Heterogeneous SIS-model, allowing for double infections

Although the two pathogen types are closely related, infection with one type may not provide full resistance to infection with the other type. Coexistence of different strains (incomplete cross-resistance) is common for many bacteria (e.g. [29,30,31]). However, for viruses such as influenza A, infection with the wild-type pathogen may give substantial resistance to infection with the escape mutant and vice versa [32]. If cross-resistance is not complete, hosts that are already infected with one pathogen type can get infected with the other type as well. Thus, the possibility of double infection leads to an additional transmission route, from singly-infected to doubly-infected hosts. If some cross-resistance is present, the rate of a second infection is lower than the rate of a first infection. We will model this effect by including a cross-resistance factor (\(1-r\)) in the transmission rate from singly- to doubly-infected hosts (the green subsystem in Fig. 2). Parameter \(r\) takes values between 0 and 1; 0 for no cross-resistance and 1 for full cross-resistance (no double infections occur). For example, non-resistant hosts that are infected with the wild-type pathogen get infected with the escape mutant at rate \((1-r){R}_{NE}{I}_{NW}\frac{{I}_{E}}{N}\), where \({I}_{E}\) is the number of hosts infected with the escape mutant. We assume that doubly-infected hosts are as likely as singly-infected hosts to infect a susceptible host with either of the two pathogen types.

To determine whether an escape mutant can invade under these conditions, we need to adapt Eq. (10) because susceptible individuals as well as individuals that are already infected with the wild-type pathogen can become infected by the escape mutant. The equation for the invasion reproduction ratio of an escape mutant in a population that is endemically infected with the wild-type pathogen then becomes:

$${R}_{INV}\,=\,{R}_{NE}\left[(1-{f}_{R})-{P}_{NW}^{*}+(1-r){P}_{NW}^{*}\right]+{R}_{RE}\left[{f}_{R}-{P}_{RW}^{*}+(1-r){P}_{RW}^{*}\right].$$

(11)

This expression allows us to investigate how invasion of an escape mutant is affected by incomplete cross-resistance.
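As a short check on Eq. (11), the terms in \({P}_{NW}^{*}\) and \({P}_{RW}^{*}\) can be collected. This rearrangement uses only the definitions above and introduces no new assumptions:

$${R}_{INV}={R}_{NE}\left[\left(1-{f}_{R}\right)-r{P}_{NW}^{*}\right]+{R}_{RE}\left({f}_{R}-r{P}_{RW}^{*}\right).$$

With \(r=1\), this reduces to Eq. (10). With \(r=0\), it becomes

$${R}_{INV}=\left(1-{f}_{R}\right){R}_{NE}+{f}_{R}{R}_{RE},$$

which has the same form as Eq. (8), with the reproduction ratios of the escape mutant in place of those of the wild-type pathogen, and no longer depends on the prevalence of the wild-type pathogen.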

### Results for the heterogeneous SIS-model that allows for double infections

Figure 5 shows the effect of incomplete cross-resistance on the invasion window of the escape mutant. Figure 5 is similar to Fig. 4 (which implicitly has \(r=1\)), but now \({R}_{INV}\) is shown for three levels of cross-resistance (0, 0.5 and 1). Reproduction ratios are the same as those used in Fig. 4. Incomplete cross-resistance (\(r<1\)) increases the width of the invasion window. When \(r=0\), the window covers the whole range in \({f}_{R}\) where the wild-type pathogen is present. This happens because \({R}_{NE}\) is greater than 1 and, when \(r=0\), all hosts infected with the wild-type pathogen can still get infected with the escape mutant as well, so that there is no competition between the pathogen types. As in the model without doubly-infected hosts, a lower \({R}_{RW}\) (Fig. 5b) decreases the width of the invasion window. In the next section, we will further investigate how the reproduction ratios affect the width of the invasion window at different levels of cross-resistance.

### Factors affecting the width of the invasion window

As shown in the previous sections, the invasion opportunity of escape mutants in a host population that is genetically selected for resistance is restricted by two bounds: (1) a lower bound that represents the frequency of resistant hosts at which the invasion reproduction ratio (\({R}_{INV}\)) becomes greater than 1, and (2) an upper bound that represents the point above which the wild-type pathogen dies out (\({{f}_{R}}_{max}\), Eq. 9). Since we obtained an expression for \({{f}_{R}}_{max}\), we tried to solve \({R}_{INV}=1\) for \({f}_{R}\) algebraically, to determine the effect of the four reproduction ratios and the level of cross-resistance on the width of the invasion window. Unfortunately, this resulted in a closed-form solution only when there is either no cross-resistance or full cross-resistance (\(r=0\) or \(r=1\)). Thus, we will investigate the width of the invasion window numerically by taking one set of values for the four reproduction ratios as the default scenario (\({R}_{NW}=1.5, {R}_{RW}=0.8, {R}_{NE}=1.1, {R}_{RE}=1.5\)), and then vary one of them at a time (except \({R}_{NW}\)). For cross-resistance, \(r\) will take values of 0, 0.5, and 1.
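A numerical search for the two bounds could look like the sketch below. It is an illustration under stated assumptions, not the procedure used to produce Fig. 6: the function names, the grid scan followed by bisection, and the number of grid steps are all choices made for this example. The sketch also assumes \({R}_{NW}>1\) and that \({R}_{INV}\) crosses 1 only once below the upper bound. The endemic prevalences are obtained from a quadratic in the overall prevalence that follows from Eq. (7) with \(\alpha =1\).

```python
import math


def wild_type_prevalences(f_r, r_nw, r_rw):
    """Return (P_NW*, P_RW*) at the wild-type endemic equilibrium (alpha = 1)."""
    r0 = (1.0 - f_r) * r_nw + f_r * r_rw  # Eq. (8)
    if r0 <= 1.0:
        return 0.0, 0.0
    a = r_nw * r_rw
    b = r_nw + r_rw - a
    c = 1.0 - r0
    x = -c / b if a == 0.0 else (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return (r_nw * (1.0 - f_r) * x / (1.0 + r_nw * x),
            r_rw * f_r * x / (1.0 + r_rw * x))


def invasion_ratio(f_r, r_nw, r_rw, r_ne, r_re, r):
    """Eq. (11) with the prevalence terms collected (r = 1 gives Eq. 10)."""
    p_nw, p_rw = wild_type_prevalences(f_r, r_nw, r_rw)
    return r_ne * ((1.0 - f_r) - r * p_nw) + r_re * (f_r - r * p_rw)


def invasion_window(r_nw, r_rw, r_ne, r_re, r, steps=1000):
    """Return (lower, upper) bounds on f_R, or None if the mutant never invades."""
    # Upper bound from Eq. (9); the whole range applies when R_RW >= 1.
    upper = (r_nw - 1.0) / (r_nw - r_rw) if r_rw < 1.0 else 1.0
    args = (r_nw, r_rw, r_ne, r_re, r)
    if invasion_ratio(0.0, *args) > 1.0:
        return 0.0, upper
    prev = 0.0
    for i in range(1, steps + 1):
        f_r = upper * i / steps
        if invasion_ratio(f_r, *args) > 1.0:
            lo, hi = prev, f_r
            for _ in range(60):  # refine the crossing by bisection
                mid = 0.5 * (lo + hi)
                if invasion_ratio(mid, *args) > 1.0:
                    hi = mid
                else:
                    lo = mid
            return hi, upper
        prev = f_r
    return None


if __name__ == "__main__":
    # Default scenario of this section at the three levels of cross-resistance.
    for r in (0.0, 0.5, 1.0):
        print(r, invasion_window(1.5, 0.8, 1.1, 1.5, r))
```

Varying one reproduction ratio at a time in the call to `invasion_window` mimics the one-at-a-time exploration described above.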

Figure 6a shows the size of the invasion window as a function of the frequency of resistant hosts and the basic reproduction ratio of the wild-type pathogen in resistant hosts (\({R}_{RW}\)). We can see that the window becomes smaller with decreasing \({R}_{RW}\), as was already visible in Figs. 4 and 5. As stated before, this implies that a stronger effect of the resistance gene (i.e., lower \({R}_{RW}\)) decreases the risk of invasion of an escape mutant, because the wild-type pathogen goes extinct sooner. The decreasing width of the window with a lower \({R}_{RW}\) is mainly caused by the effect on \({{f}_{R}}_{max}\) (solid dark green line in Fig. 6a), which becomes closer to 1 with increasing \({R}_{RW}\). Hence, a change in \({R}_{RW}\) mainly impacts the upper bound of the invasion window.

The lower bound is less affected by a change in \({R}_{RW}\), and especially for low \(r\), the line \({R}_{INV}=1\) is almost vertical in Fig. 6a. This happens because at lower levels of cross-resistance, there is less competition between the two pathogen types, since (part of) the hosts that are infected with one pathogen type can also become infected with the other type. In the most extreme case, when there is no cross-resistance (\(r=0\)), the lower bound is not affected by \({R}_{RW}\) at all (Fig. 6a). This means that \({R}_{RW}\), and thus the endemic equilibrium prevalence of the wild-type pathogen, has no effect on the risk of invasion of an escape mutant when \(r=0\).

We restricted the y-axis of Fig. 6a to a maximum of 1, because the pathogen can persist in resistant hosts when \({R}_{RW}>1\), and it is thus impossible to eradicate the wild-type pathogen. Genetic selection for infectious disease resistance will in that case not be sustainably beneficial because escape mutants will eventually occur, even at a high frequency of resistant hosts. Thus, to prevent invasion of escape mutants, \({R}_{RW}\) needs to be less than 1, either through the effect size of the resistance genes or by taking additional measures in combination with genetic selection to achieve \({R}_{RW}\) less than 1.

Figure 6b shows the size of the invasion window as a function of the frequency of resistant hosts and the basic reproduction ratio of the escape mutant in non-resistant hosts (\({R}_{NE}\)). The value of \({R}_{NE}\) relative to \({R}_{NW}\) reflects the fitness costs for the escape mutant in a non-resistant host population, relative to the wild-type pathogen. If \({R}_{NE}\) is much lower than \({R}_{NW}\), the escape mutant spreads much less in non-resistant hosts than the wild-type pathogen, i.e., the escape mutation comes with high fitness costs (low *y*-axis value). If \({R}_{NE}\) is close to or equal to \({R}_{NW}\), differential fitness costs are low or absent and the spread of the escape mutant in non-resistant hosts is similar to the spread of the wild-type pathogen (high *y*-axis value). Consequently, the size of the invasion window increases as \({R}_{NE}\) moves closer to \({R}_{NW}\). At a certain \({R}_{NE}\), the invasion window covers the whole range of \({f}_{R}\) in which the wild-type pathogen can persist. When there is full cross-resistance (top dashed line, \(r=1\)), this point occurs when \({R}_{NE}\) is equal to \({R}_{NW}\) (= 1.5), meaning that the escape mutant spreads as well as the wild-type pathogen in non-resistant hosts. With no cross-resistance (bottom dashed line, \(r=0\)), this point occurs when \({R}_{NE}\) is equal to 1. If there is no cross-resistance, the presence or absence of the wild-type pathogen has no effect on the spread of the escape mutant and, thus, a basic reproduction ratio greater than 1 is sufficient for the mutant to persist. We can also see in Fig. 6b that the invasion window stays relatively small for lower values of \({R}_{NE}\) and only covers a range in \({f}_{R}\) of about 0.05 when \({R}_{NE}\) is close to 0. This indicates that it is extremely difficult for escape mutants to invade when they experience high fitness costs in non-resistant hosts.

Figure 6c shows the size of the invasion window as a function of the frequency of resistant hosts and of the basic reproduction ratio of the escape mutant in resistant hosts (\({R}_{RE}\)). The size of the invasion window increases with \({R}_{RE}\), but less so when cross-resistance is high. If there is some cross-resistance, the mutant can already invade if \({R}_{RE}\) is slightly less than 1 (top dashed lines for \(r=0.5\) and \(r=1\) in Fig. 6c), because \({R}_{NE}>1\) (1.1) and \({R}_{RW}<{R}_{RE}\) in that case, i.e., host resistance is more effective against the wild-type pathogen than against the escape mutant. Thus, in a very small range of \({f}_{R}\), “escape” mutants might invade even though their reproduction ratio in resistant hosts is less than 1. The increasing slope of the line \({R}_{INV}=1\) when there is no cross-resistance (bottom dashed line in Fig. 6c, \(r=0\)) mainly results from \({R}_{NE}\) being greater than 1. Figure 6b shows that at an \({R}_{NE}\) of 1.1 and no cross-resistance, the invasion window covers the full range in \({f}_{R}\) from 0 to \({{f}_{R}}_{max}\). So, in a population that consists of mainly non-resistant hosts (left side of Fig. 6), the escape mutant will be well able to invade, as explained in a previous paragraph. At a higher frequency of resistant hosts, as with the wild-type pathogen, the possibility of invasion is increasingly determined by the transmission of the escape mutant in resistant hosts, i.e., by \({R}_{RE}\). If \({R}_{RE}\) is less than 1, host resistance is not only effective against the wild-type pathogen but also against the “escape” mutant, which is evident from the increase in the lower bound of the invasion window (the line \({R}_{INV}=1\)) with increasing \({f}_{R}\) for \(r=0\) in Fig. 6c. The combination of reproduction ratios of the escape mutant in non-resistant and resistant hosts, together with cross-resistance, thus determines its opportunity to invade.