Bayesian segregation analysis of milk flow in Swiss dairy cattle using Gibbs sampling

Segregation analyses with Gibbs sampling were applied to investigate the mode of inheritance and to estimate the genetic parameters of milk flow of Swiss dairy cattle. The data consisted of 204 397, 655 989 and 40 242 lactation records of milk flow in Brown Swiss, Simmental and Holstein cattle, respectively (4 to 22 years). Separate genetic analyses of first and multiple lactations were carried out for each breed. The results show that genetic parameters, especially the polygenic variance and heritability of milk flow in the first lactation, were very similar under both the mixed inheritance (polygenes + major gene) and polygenic models. Segregation analyses yielded very low major gene variances, which favour the polygenic determinism of milk flow. Heritabilities and repeatabilities of milk flow in both Brown Swiss and Simmental were high (0.44 to 0.48 and 0.54 to 0.59, respectively). The heritability of milk flow based on scores of milking ability in Holstein was intermediate (0.25). Variance components and heritabilities in the first lactation were slightly larger than the estimates for multiple lactations. The results suggest that milk flow (the quantity of milk per minute of milking) is a relevant measurement to characterise the cow's milking ability, which is a good candidate trait to be evaluated for possible inclusion in the selection objectives in dairy cattle.


INTRODUCTION
In dairy cattle, yields of milk and milk solids are of primary economic concern; however, other traits, particularly milking ability (milk flow or milking time), longevity, disease resistance and reproductive efficiency, are also of economic importance. An increase in milk flow is associated with a decrease in milking labour time, which significantly reduces the cost of milking through lower electrical power use and less wear and tear of milking equipment.
In livestock populations, large volumes of phenotypic observations are often available at low cost and it is worthwhile to use them to look for statistical evidence of major genes or quantitative trait loci (QTL). Segregation analysis is the most powerful statistical method to identify a single gene when DNA marker information is unavailable. With segregation analysis, it is possible to determine, using only phenotypic data, whether the inheritance of a certain trait is controlled, at least in part, by a single gene with a large effect. Therefore, segregation analysis (e.g. [6] and [12]) could be performed to investigate the evidence for the presence of major genes and to determine whether costly genotyping of large numbers of animals for many DNA markers could be justified. In dairy goats, a major gene explaining about 60% of the total genetic variance of milk flow has been shown [7,13]. As mentioned earlier, milk flow is also an economically important trait in dairy cattle. Therefore, it would be useful to investigate whether there is evidence of a major gene for milk flow in dairy cattle, similar to that in dairy goats. Furthermore, genetic parameters for milk flow have never been estimated for Swiss dairy cattle.
In a typical segregation analysis using pedigreed animal populations, the complexity introduced by the existence of many (inbreeding) loops makes the exact computation of marginal densities impossible. This problem has been simplified by the development of Gibbs sampling, a Markov chain Monte Carlo (MCMC) methodology [5], and by the application of this methodology to livestock populations [19] and [8].
The main objectives of this study were to estimate genetic parameters of milk flow, and to investigate whether a segregating major gene affecting milk flow exists in three breeds of Swiss dairy cattle, using 4 to 22 years of milk flow data. A Bayesian segregation analysis using the Gibbs sampling method and a restricted maximum likelihood method were applied to achieve these objectives.

Data
The data available for this study consisted of the milk flow records of the three major dairy breeds in Switzerland (Brown Swiss, Simmental and Holstein). For both Brown Swiss and Simmental, the milking ability of a cow was measured as the quantity of milk in kg per minute of milking, at either the morning or the evening milking; only one milking was considered for the measurement, mostly during the first lactation. Cows measured for milk flow in later lactations were not chosen systematically; the choice depended on the breeder. Some breeders, who were not satisfied with the milk flow measurements of certain cows in the first lactation, repeated the test in the second or later lactations. Other breeders measured milk flow for the whole population in first and later lactations. For Holstein, based on the farmers' information on the cows' milking abilities, 15 classifiers attributed a score on a scale of 1 to 5 for the milk flow of each cow: 1 = very slow, 2 = slow, 3 = average, 4 = fast and 5 = very fast. Data sets from each breed were edited such that only herds with at least 10 lactation records were included in the analyses.
Brown Swiss: two data sets on milk flow, corresponding to the first and multiple lactations recorded for 22 years between 1980 and 2002, were obtained. After editing on herd size (≥10 records), there were 169 503 records from the first lactation and 204 397 records from multiple lactations (first, second, third and later lactations).
Simmental: data sets on the milk flow of the first lactation and multiple lactations recorded for 22 years between 1980 and 2002 were obtained. After editing on herd size (≥10 records), there were 621 376 records from the first lactation and 655 989 records from multiple lactations (first, second, third and later lactations).
Holstein: milk flow scores measured for 4 years between 1999 and 2003 were collected. There were not enough records on second or later lactations to carry out an analysis for multiple lactations. Therefore, only data from first lactations were retained and edited on herd size (≥10 records). The final data set consisted of 40 242 first lactation records.
For each breed and data set, all the pedigrees were traced as far back as possible for fitting the (mixed inheritance and polygenic) individual animal models. Descriptions of the different data sets and pedigrees for all breeds in first lactation (FL) and multiple lactations (ML) are given in Table I.

Mixed inheritance model
A mixed inheritance model with fixed effects (non-genetic effects), random effects of polygenes, and the fixed effect of a single major gene was used to investigate the presence of a major gene in the inheritance of milk flow.
The mixed inheritance models for the first lactation and multiple lactation data sets of all three breeds are described as:
First lactation: y = Xβ + Zu + ZWm + e (1)
Multiple lactations: y = Xβ + Zu + Qpe + ZWm + e (2)
where y is the vector of observations, β is a vector of non-genetic fixed effects (Brown Swiss data: herd, year-season, parity, stage of lactation; Simmental data: herd, year-season, parity, breed, stage of lactation; Holstein data: herd, year-season, season-classifier, stage of lactation), u is a random vector of polygenic effects, pe is a random vector of permanent environmental effects, W is a matrix containing the genotype of each individual, m is the vector of genotype means, e is a random vector of residual effects, and X, Z and Q are incidence matrices relating the observations to their respective effects. For the Simmental breed association, the population consisted of crossbred animals of the Simmental, Holstein and Montbéliarde breeds. The animals were then allotted to 6 breeds or sections according to their blood percentage and color. A description of the different data sets (the number of levels of all effects included in the model) and the number of animals in the pedigrees of all the breeds are given in Table I.
The major gene was modelled as an autosomal biallelic (A and B) locus with Mendelian transmission probabilities. Allele A is defined to decrease the phenotypic value, and allele B is defined to increase it. The two alleles A and B have frequencies p and q = 1 − p, where p is the estimate of the frequency of allele A in the founder population, in which Hardy-Weinberg equilibrium was assumed. Three genotypes, AA, AB (or BA) and BB, can be encountered, with genotype means m = (−a, d, a), where a is referred to as the additive effect and d as the dominance effect at the major locus.
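As an illustration of this parameterisation, the following minimal Python sketch computes the Hardy-Weinberg genotype frequencies and the variance generated by a biallelic locus with genotype means (−a, d, a). The function names and the numerical values are hypothetical and are not taken from the study or from the MAGGIC software; the closed-form expression is the standard single-locus result, rewritten for p as the frequency of the decreasing allele A.

```python
def genotype_frequencies(p):
    """Hardy-Weinberg frequencies of AA, AB and BB, with p = frequency of allele A."""
    q = 1.0 - p
    return [p * p, 2.0 * p * q, q * q]


def major_gene_variance(p, a, d):
    """Variance among genotype means (-a, d, a), computed from the definition."""
    freqs = genotype_frequencies(p)
    means = [-a, d, a]
    mu = sum(f * m for f, m in zip(freqs, means))
    return sum(f * (m - mu) ** 2 for f, m in zip(freqs, means))


def major_gene_variance_closed_form(p, a, d):
    """Same quantity as additive plus dominance variance at the locus."""
    q = 1.0 - p
    alpha = a + d * (p - q)  # average effect of substituting B for A
    return 2.0 * p * q * alpha ** 2 + (2.0 * p * q * d) ** 2


# Hypothetical values, for illustration only.
example = major_gene_variance(p=0.3, a=1.0, d=0.5)
```

With intermediate allele frequencies, a locus can only produce a variance close to zero if both a and d are close to zero, which is why the magnitude of the major gene variance is informative about the presence of a segregating major gene.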
The distributional assumption for the polygenic effects was u ∼ N(0, Aσ²_u), where A is the numerator relationship matrix. The permanent environmental effects were assumed to be distributed as pe ∼ N(0, Iσ²_pe), and the residual effects as e ∼ N(0, Iσ²_e). σ²_u, σ²_pe and σ²_e are the polygenic, permanent environmental and residual variances, respectively. The relationship matrix A of the full pedigree was used in the analyses.
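For readers unfamiliar with the numerator relationship matrix, the sketch below builds A for a small pedigree with the classical tabular method. It is a minimal illustration only: the four-animal pedigree is hypothetical, and the software used in this study may construct A (or its inverse) in a different and far more efficient way.

```python
def relationship_matrix(pedigree):
    """Numerator relationship matrix by the tabular method.

    pedigree: list of (sire, dam) index pairs, one per animal, with None for
    an unknown parent. Parents must appear before their offspring.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        # Relationship with each earlier animal: mean of its relationships
        # with the two parents of animal i.
        for j in range(i):
            a_js = A[j][sire] if sire is not None else 0.0
            a_jd = A[j][dam] if dam is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
        # Diagonal: 1 + inbreeding coefficient (half the parents' relationship).
        a_sd = A[sire][dam] if sire is not None and dam is not None else 0.0
        A[i][i] = 1.0 + 0.5 * a_sd
    return A


# Hypothetical pedigree: animals 0 and 1 are founders, animal 2 is their
# offspring, and animal 3 results from mating animal 0 to its daughter 2.
A = relationship_matrix([(None, None), (None, None), (0, 1), (0, 2)])
```

In this example animal 3 is inbred (its parents are sire and daughter), so its diagonal element exceeds 1; such loops are what make exact likelihood computations difficult in large livestock pedigrees.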
Uniform prior distributions were assumed in the range (−∞, +∞) for non-genetic effects and effects at the major locus, in the range (0, +∞) for variance components, and in the range [0, 1] for allele frequencies [8].
A Gibbs sampling algorithm with blocked sampling of genotypes was used for inference in the mixed inheritance models (1) and (2), implemented using the MAGGIC software package developed by Janss [10].
Twenty replicates of Gibbs chains of 100 000 cycles were run for each analysis, using a spacing of 50 cycles, obtaining 2000 Gibbs samples per chain and 40 000 samples in total for each analysis. A burn-in period of 1000 cycles was used to allow the Gibbs chains to reach equilibrium.
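The sample counts above follow from simple bookkeeping, sketched below in Python. The helper names are hypothetical. Because the stated 2000 samples per chain equal 100 000 / 50 exactly, the sketch assumes that the 1000 burn-in cycles are run in addition to the 100 000 sampled cycles; the text does not state this explicitly.

```python
N_CHAINS = 20      # replicated Gibbs chains
CYCLES = 100_000   # cycles per chain after burn-in (assumption, see lead-in)
SPACING = 50       # keep one draw every 50 cycles
BURN_IN = 1_000    # discarded cycles at the start of each chain


def thin(chain, spacing):
    """Keep every spacing-th draw of a chain (the last draw of each block)."""
    return chain[spacing - 1::spacing]


def run_length(cycles, burn_in):
    """Total number of cycles a chain must run under the stated assumption."""
    return burn_in + cycles


samples_per_chain = CYCLES // SPACING           # 2000
total_samples = N_CHAINS * samples_per_chain    # 40000
```

Thinning reduces the autocorrelation between stored draws, so the retained samples behave more like independent draws from the posterior distribution.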

Polygenic model
The aim of fitting a polygenic model to the first lactation data sets of the three breeds was to obtain genetic parameter estimates of milk flow and to compare them with those obtained using the mixed inheritance model (to check the mode of inheritance of this trait).
Therefore, only the first lactation data sets of the three breeds were analysed to estimate the genetic parameters by restricted maximum likelihood (REML) under an individual animal model, using the VCE software [16]. The model used was a sub-model of (1), specified as y = Xβ + Zu + e, with all the model terms as defined in model (1). The parameters of interest for statistical inference in this polygenic model were the residual variance, σ²_e, the polygenic variance, σ²_u, and the heritability, h² = σ²_u/(σ²_u + σ²_e).
Table II presents lactation-wise (1, 2 and 3+ lactations) means and standard deviations of milk flow for all breeds. The observed means for milk flow in both Brown Swiss and Simmental were higher in the first lactation than in later lactations, which is in line with the findings of Dodenhoff et al. [3]. The mean milk flow was slightly higher in Brown Swiss than in Simmental. The distribution of the Holstein milk flow score for the first lactations is plotted in Figure 1, which shows that a milk flow score of 3 was the most frequent and represented more than 60% of the total data.

Polygenic model
The genetic parameters estimated by REML for milk flow in the first lactation and for the three breeds are presented in Table III.
The heritability of milk flow was high and very similar for the Brown Swiss and Simmental breeds (0.46 and 0.48). In Holstein, the milk flow score had an intermediate heritability (0.25). This is explained by the difference in the scale on which the milk flow of a cow was measured: in the Brown Swiss and Simmental breeds, milk flow was measured as the quantity of milk in kg per minute of milking (on a continuous scale), whereas in Holstein the milk flow of a cow was a score trait, scored from 1 to 5. The genetic variance of discrete traits is usually lower in magnitude than that of continuous traits due to the loss of information when (continuous) data are truncated to a few classes or categories (e.g. Kadarmideen et al. [11]). This loss of information holds even for major gene and QTL effects on a continuous versus a discrete (probability) scale [11].
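The loss of information caused by scoring can be illustrated with a toy simulation, which is not part of the authors' analysis. The sketch below draws pairs of correlated continuous values (a stand-in for related animals), converts them to scores from 1 to 5 using hypothetical thresholds chosen so that about 60% of the values receive a score of 3, and compares the correlation on the two scales. The correlation of 0.25, the thresholds, the seed and the sample size are all assumptions made for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def to_score(value, thresholds=(-1.8, -0.84, 0.84, 1.8)):
    """Map a standard-normal value to a score from 1 to 5 (hypothetical cut-offs)."""
    return 1 + sum(value > t for t in thresholds)


def simulate(n_pairs=50_000, r=0.25, seed=1):
    """Return the correlation between pair members on both scales."""
    rng = random.Random(seed)
    shared_sd, own_sd = r ** 0.5, (1.0 - r) ** 0.5
    xs, ys = [], []
    for _ in range(n_pairs):
        shared = rng.gauss(0.0, 1.0)  # component common to both pair members
        xs.append(shared_sd * shared + own_sd * rng.gauss(0.0, 1.0))
        ys.append(shared_sd * shared + own_sd * rng.gauss(0.0, 1.0))
    continuous = pearson(xs, ys)
    scored = pearson([to_score(x) for x in xs], [to_score(y) for y in ys])
    return continuous, scored


continuous_corr, scored_corr = simulate()
```

Under these assumptions the resemblance between pair members is weaker on the score scale than on the continuous scale, which is the mechanism behind lower genetic parameter estimates for scored traits.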
The estimated heritability of milk flow for Brown Swiss and Simmental (0.46 and 0.48) was higher than the heritabilities reported by Sprengel et al. [20]. For Holstein, the heritability of the milk flow score (0.25) was in accordance with other studies based on similar subjectively scored milk flow measurements [2,14,17].

Mixed inheritance model
Posterior means and standard deviations of the parameter estimates of milk flow from the Bayesian segregation analyses implemented by Gibbs sampling are presented in Tables IV and V for the first and multiple lactations, respectively. These estimates are based on 40 000 Gibbs samples from twenty replicated chains. Tests for convergence of the Gibbs sampler showed that the Gibbs samples of all the parameter estimates reached a good stationary phase. For both analyses (first and multiple lactations), the standard deviations of the marginal posterior estimates were very small (Tabs. IV and V).
As highlighted by Walling et al. [21], since the convergence of a Gibbs chain is determined by the mixing properties of the parameters, large numbers of realisations are necessary to provide sufficient independent samples to estimate the posterior distributions of the parameters. Miyake et al. [15] reported that increasing the number of Gibbs samples is an effective way of improving the precision of the estimated parameters, although precision also depends on the size of the data. In this study, large numbers of cycles were used to ensure proper mixing and to improve the precision of the estimated parameters. The estimated genetic parameters of milk flow (in the first lactation) obtained by the Bayesian segregation (mixed inheritance model) analyses were similar to those obtained under the polygenic model by REML (Tabs. III and IV).
Highest posterior density (HPD) regions, as described by Box and Tiao [1], were obtained for all model parameters from a non-parametric density estimate using the averaged shifted histogram technique [18]. An HPD region is the smallest region of the values of a sampled parameter that contains the stated posterior probability. In all analyses, the highest posterior density regions at 95% (HPD 95%) of the additive effect (a) and of the dominance effect (d) at the major locus included zero (Tabs. IV and V). The allele frequencies p and q in the three breeds were intermediate (0.47 to 0.51 and 0.53 to 0.49, respectively). Furthermore, the estimates of the major gene variance (σ²_m) ranged from 0.0009 to 0.0011. Janss et al. [9] and Miyake et al. [15] also suggested the use of the magnitude of the major gene variance as an indicator of the existence of a segregating major gene. These results suggest that the postulated single major gene influencing milk flow has no significant effect and that the high heritability of this trait may be due only to polygenic effects. These results are, however, in contrast to dairy goats, where a major gene explaining about 60% of the total genetic variance of milk flow [7,13] was found. These differences in results could simply indicate that mixed inheritance of certain traits in livestock is species-specific or that the major gene identified in goats has been fixed in cattle.
Posterior marginal distributions of the genetic parameters of milk flow in the first lactation for the three breeds are shown in Figures 2, 3 and 4. For each parameter, a unimodal density was found with a mode very close to the corresponding marginal posterior mean (Tab. IV). Symmetric densities were observed for variance components and heritabilities, which was not the case for additive major gene effects. In the Bayesian approach with Gibbs sampling, a posterior density can be summarised by one or more statistics. If the density is approximately symmetric, the posterior mean and standard deviation are considered appropriate for describing the density. In the case of more complicated densities, the posterior mode is also a valuable third statistic to be considered.
Estimates of error variances and heritabilities in the first lactation were slightly larger than the estimates in multiple lactations (Tabs. IV and V), which is expected. This is because a part of the error variance component is captured by the permanent environmental variance in the multiple lactation models (repeatability model), and therefore the error variance decreases. However, the polygenic and the phenotypic variance components generally remain constant. In the first lactation, the error variance of the milk flow score for Holstein was 3 times larger than the polygenic variance component. This is explained in part by the appraisal error due to 15 different classifiers.
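The way the permanent environmental variance takes over part of the error variance can be sketched numerically. The Python example below uses the standard definitions of heritability and repeatability; the variance values are hypothetical round numbers chosen for illustration, not estimates from Tables IV and V, and the major gene variance is ignored.

```python
def heritability(var_u, var_e, var_pe=0.0):
    """Polygenic variance as a share of the phenotypic variance."""
    return var_u / (var_u + var_pe + var_e)


def repeatability(var_u, var_pe, var_e):
    """Share of phenotypic variance that is permanent to the animal."""
    return (var_u + var_pe) / (var_u + var_pe + var_e)


# Hypothetical first lactation model: no permanent environmental term, so all
# non-genetic variation ends up in the error variance.
h2_first = heritability(var_u=0.090, var_e=0.110)

# Hypothetical multiple lactation (repeatability) model: part of the former
# error variance is now attributed to the permanent environment.
h2_multiple = heritability(var_u=0.088, var_e=0.090, var_pe=0.022)
rep_multiple = repeatability(var_u=0.088, var_pe=0.022, var_e=0.090)
```

In this sketch the phenotypic variance is the same in both models (0.20), while the error variance drops from 0.110 to 0.090 once a permanent environmental variance of 0.022 is fitted.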

CONCLUSION
Milk flow or milking speed in three breeds of dairy cattle in Switzerland was analysed by a Bayesian method implemented via Gibbs sampling. The analysis was based on lactation records, ranging from 40 000 to nearly 3/4 million, collected over the last 4 to 22 years. Based on the results from this study, it can be concluded that there is no significant major gene involved in the mode of inheritance of milk flow and that the trait is controlled in a purely polygenic manner in dairy cattle. This is in contrast to dairy goats, where strong evidence for a major gene has been found [7,13]. The polygenic heritability of milk flow measured as the quantity of milk per minute of milking was around 0.44 for Brown Swiss and 0.47 for Simmental. In Holstein, where milk flow was subjectively scored, the heritability was 0.25. The polygenic repeatability of milk flow was 0.54 for Brown Swiss and 0.59 for Simmental. Incorporating milk flow in a selection programme could improve the selection response for milking ability. However, before milk flow is used as a selection criterion, more research is needed to study its associations with other related traits such as clinical mastitis and udder and teat characteristics. The estimates of genetic and residual parameters from this study would be useful for a possible future implementation of genetic evaluation of milking ability in Swiss dairy cattle.