Modeling heterogeneous (co)variances from adjacent-SNP groups improves genomic prediction for milk protein composition traits

Background Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using deterministic formula. Results BayesAS models improved prediction reliability for most of the traits compared to GBLUP models and this gain depended on segment size and genetic architecture of the traits. The gain in prediction reliability was especially marked for the protein composition traits β-CN, κ-CN and β-LG, for which prediction reliabilities were improved by 49 percentage points on average using the MT-BayesAS model with a 100-SNP segment size compared to the bivariate GBLUP. Prediction reliabilities were highest with the BayesAS model that uses a 100-SNP segment size. The bivariate versions of our BayesAS models resulted in extra gains of up to 6% in prediction reliability compared to the univariate versions. Conclusions Substantial improvement in prediction reliability was possible for most of the traits related to milk protein composition using our novel BayesAS models. Grouping adjacent SNPs into segments provided enhanced information to estimate parameters and allowing the segments to have different (co)variances helped disentangle heterogeneous (co)variances across the genome.

samples due to costly and time-consuming analytical techniques.
In livestock breeding, genomic selection has become a successful approach, especially for sex-limited traits, because it speeds up the selection process by reducing generation interval and enables to select new selection candidates at early ages. Accuracy of genomic prediction hinges on a number of factors including size of reference population, heritability of the trait, effective population size, marker density, and the genetic architecture of the trait (in particular, the number of loci that affect the trait and the distribution of their effects) [4][5][6]. Therefore, prediction accuracy for traits with limited records is still low. However, if the methodology used exploits information about the distribution of the loci that underlie a trait, traits that are controlled by a few quantitative trait loci (QTL) with major effects can be predicted with better accuracy than traits that have a more polygenic nature [7]. Several statistical models have been developed for genomic prediction using genome-wide single-nucleotide polymorphisms (SNPs), which include the Bayesian models (e.g. BayesA and BayesB) of Meuwissen et al. [8], the genomic best linear unbiased prediction (GBLUP) model [9] and several extensions of these models. Compared to GBLUP, the Bayesian variable selection models improve prediction reliability considerably for traits that are controlled by a few QTL with major effects [10,11]. This is mainly due to the assumption that, in the GBLUP model, the variance does not vary across the genome, i.e., it does not take heterogeneity over segments into account. Unlike GBLUP, Bayesian variable selection models allow the variance of SNP effects to differ among loci [9]. Genome-wide association studies have indicated that a few QTL regions underlie substantial proportions of the genetic variation in traits related to milk protein composition [12]. Hence, it is expected that, for traits related to milk protein composition, a model assuming SNP-specific variances in genomic prediction can result in higher prediction reliability than the GBLUP approach. However when the available dataset is small, as is the case for expensive-to-measure traits, reliable estimation of SNPspecific variances with the Bayesian approach becomes problematic since there are too many parameters to estimate relative to the information available. In such situations, Gianola et al. [13] suggested to group SNPs according to their common variance. Grouping adjacent SNPs might be advantageous for estimating variances reliably by enhancing the amount of information and reducing the number of parameters to estimate. Adjacent SNPs are very likely to be in linkage disequilibrium (LD) with the same QTL and to have the same variance, which allow us to account for heterogeneity between SNP groups. In this context, SNPs must be properly ordered and grouped such that they are realistically in LD with the same QTL while ensuring that their group size is optimum for the reliable estimation of variances.
Another option that is widely used to deal with traits with limited records is to implement multi-trait models, which simultaneously use information from related traits and individuals [14]. In multi-trait analysis, correlation structures between the traits is central to gaining any advantage in prediction reliability over single-trait predictions [15]. Milk protein traits have a low to moderate genetic correlation with routinely recorded traits such as total protein yield [2]. However, while the genome-wide correlation is generally low, specific genomic segments may display high genetic correlations between SNP effects for different traits. Therefore, modeling such heterogeneous covariance patterns may result in improved prediction reliability, when using multi-trait models.
In this study, we report genomic prediction reliabilities for traits related to milk protein composition using a relatively small set of cow data by developing novel Bayesian hierarchical models that account for heterogeneous variance structures across regions of the genome. Furthermore, we extend our novel Bayesian models to bivariate scenarios that model heterogeneous covariance structures between milk protein composition traits measured on cows and a large set of bull data with highly accurate deregressed proofs (DRP) for total protein yield.

Animals and phenotypes
Available data comprised two datasets: a relatively small set of cow data with information on traits related to milk protein composition and a large set of bull data with highly accurate total protein yields from regular milk recordings on daughters. Individuals in the two datasets were genetically related i.e., all the cows had their sires in the bull dataset.
Single morning milk samples were collected once from 650 Danish Holstein cows in 21 herds. Cows were sampled at different stages of lactation (days 9 to 481 in milk) and parity (1 to 4). Liquid chromatography/electrospray ionization-mass spectrometry (LC/ESI-MS) methods were used for profiling milk proteins. Details on the identification and relative quantification of milk proteins are in Jensen et al. [16]. We used these methods to quantify milk proteins, including α S1 -CN, α S2 -CN, β-CN, κ-CN, α-LA, and β-LG, posttranslational modifications of G-κ-CN and α S1 -CN-8P, as well as total protein percentage. In later analyses, β-CN was excluded from the genetic analysis due to very low estimates of its heritability across models (0.01 to 0.05), which made meaningful predictions difficult to obtain given the small sample size. DRP for milk protein yield were obtained from 5326 progeny-tested Danish Holstein bulls. Estimated breeding values from the Nordic genetic evaluation in January 2013 were used to derive DRP following the methodology described by Schaeffer [17].

Genotypes
Genotyping was performed using the BovineHD Illumina Beadchip for 372 cows or the BovineSNP50 Beadchip for the remaining 278 cows and all the bulls. SNPs that overlapped between these two genotyping arrays were combined and subjected to quality control. Quality parameters used to select SNPs were: (1) minimum call rates of 90% for individuals and 95% for loci and (2) exclusion of SNPs with a minor allele frequency (MAF) lower than 5%. Finally, 36,000 SNPs across the 29 bovine autosomes were available for the analyses.

Models
Hierarchical Bayesian models based on genome segments of different sizes (hereafter collectively called BayesAS models) were developed to predict genomic breeding values (GBV). Univariate and bivariate GBLUP models were used to compare performances of the novel Bayesian models.

GBLUP models
Univariate (based on cow data only) and bivariate (based on combined cow data and bull DRP) GBLUP models were implemented using DMU [18]. The general model used for the univariate analysis (ST-GBLUP) was: where y ijkl are the observations on trait i from cow l, in parity j, and herd k; μ i is the fixed mean effect for trait i; b i1 is the regression coefficient for DIM l in trait i, which is a covariate describing the effect of days in milk for each cow l; b i2 is the regression coefficient for the Wilmink adjustment (exp −0.05 * DIM l ) of days in milk for trait i; e 1ijkl is a random residual effect that is assumed to be normally distributed with e 1 ∼ N 0, I 1 σ 2 e 1 , where I 1 is an identity matrix with dimensions 650 by 650. The effect of g il is a random additive genetic effect for trait i of cow l with distribution N 0, Gσ 2 a , where G is the genomic relationship matrix between cows with dimension 650 by 650 and σ a 2 is the genetic variation in trait i.
To run a bivariate analysis (MT-GBLUP) of DRP on protein yield and each protein composition trait, DRP were modelled as: (1) where y DRP l is the DRP for bull l; and μ DRP is the corresponding fixed mean effect. g 2l is the random additive genetic effect for animal l for protein yield with distribution N 0, G 2 σ 2 a , where G 2 is the genomic relationship matrix for combined cow and bull population with dimension 5976 by 5976. Distribution of the vectors of the two animal effects in the bivariate models are as follows: where, in this case, g 1 is a vector of breeding values for all animals for one of the cow traits based on the covariance matrix G 2 unlike in Model (1); σ 1 2 is the genetic variance for each cow trait and σ 2 2 is the genetic variance for the bull DRP. The random residual effect e 2l , in Model (2), is assumed to be normally distributed with, e 2 ∼ N 0, I 2 σ 2 e 2 , where I 2 is an identity matrix with dimension 5326 by 5326 and σ 2 e 2 is the residual variation for bull DRP. In the bivariate analysis, the residual covariance for the pair of bivariate traits was set to zero because the observations came from different individuals. The distribution of the vectors of the two residual effects in the bivariate analyses can be described as: The genomic relationship matrix used in the GBLUP models was calculated as described in the first method presented by VanRaden [9].

BayesAS models
Models that were proposed initially by Janss [19] were implemented in the MCMC Bayesian framework of the Bayz program (www.bayz.biz). Adjacent SNPs were grouped into non-overlapping genomic segments and the (co)variance for each segment was estimated. Accordingly, six models were implemented, in which a genome segment was defined as: single SNPs or groups of 50, 100, or 200 adjacent SNPs, a complete chromosome or all the SNPs in the genome. The model that considers the whole genome as a segment can be considered basically as a GBLUP model implemented in a Bayesian manner.
Univariate (ST) and bivariate (MT) versions of the BayesAS models were implemented. For the ST-BayesAS models, each protein composition trait (y ijkl ) from the cow dataset was run as in the model described below: (3) Model components for fixed effects, covariates and random residual effects are in Model (1). Z is a matrix with SNP covariates (centered) with dimensions of the number of individuals (n = 650) by the number of loci (m = 36,000) and z l is a row of genotypes for animal l, a i is a vector of SNP effects for trait i, with length m and with elements a i = a ijk , such that a ijk is the effect of SNP k in SNP group j for trait i.
For the MT-BayesAS models, an additional model component was included to run the DRP on total protein yield from bulls (y DRP l ) simultaneously with each protein composition trait from cows. The following model was added to run the bivariate MT-BayesAS analyses: Z in the MT-BayesAS is a matrix with SNP covariates (centered) with dimensions of the number of individuals (n = 5976) by the number of loci (m = 36,000) and z l is a row of genotypes for animal l, a i is a vector of SNP effects for trait i, with length m and with elements a i = a ijk and the residual term (e 2l ), is as in Model (2). The index "i" here refers to both cow trait and bull DRP run in each bivariate analysis, for sake of simplicity in describing the models. SNP effects between each of the two traits in the bivariate analyses are correlated using latent variables by the following hierarchical model: where s 0 = s 0jk and s 1 = s 1jk are vectors of latent variables with length m, to model average covariance across SNP groups (s 0 ) and deviations within SNP groups (s 1 ) using nested regression; r 0i is a regression coefficient of s 0 for all SNPs and r 1ij is a regression coefficient of s 1 for each SNP group j; and a * ijk is the residual SNP effect, which is uncorrelated across traits. The latent variables in s 0 and s 1 are assumed to be normally distributed with a variance of 1: where I is an identity matrix with dimensions of number of loci (m = 36,000).
Distributional prior assumptions for the regression coefficients of s 0 and s 1 are: where U() stands for a uniform distribution across the given interval.
The residual SNP effect a ijk * is assumed to be normally distributed with a mean of 0 and SNP-group specific variance (σ 2 a * ij ) for which an inverse-Chi square distribution (4) was set with scale SC i 2 and degrees of freedom df i for all SNP effects in group j: The scale parameter SC i 2 is assumed to have a uniform distribution. The parameter df i is set so that the spread of the variances of individual SNP-groups around the common scale is controlled (here, a value of 5 was used).
Samples of the posterior distributions of the model parameters are obtained using MCMC techniques, i.e., sampling from conditional distributions. The conditional distributions for all parameters in Eqs. (3), (4) and (5) are normal and for variances are scale-inverse Chi squared. For the parameters s 0 and s 1 , which are present in the expectation for multiple SNP-effects, the bayz software automatically combines all parts of the likelihoods and combines them with the prior distribution to form the conditional posterior.
Za i from Models (3) and (4) computes the genomic values (g i ) at each MCMC cycle. The total explained genomic variance for trait i is computed as the variance of the genomic values from every MCMC cycle: The genomic covariance between the cow and bull traits can then be calculated as: where g cow is a vector of genomic values for all individuals for each cow-trait and g bull is a vector of genomic values of all individuals for total protein yield. Similarly, genetic values for the individuals at SNP group j (g ij ) was calculated at each MCMC sample based on the genotypes and estimated effects of SNPs in group j as: where Z j is a matrix of covariates for SNPs within group j, with size of number of individuals by number of SNPs at group j, and a ij is a vector of effects of SNPs at group j for trait i. Genomic variance for trait i at SNP group j was then calculated from these MCMC samples of individual genetic values as: The proportion of the genomic variance explained by segments was computed for each trait i as . The genomic covariance for each cow and bull trait at each SNP group j was then calculated as: (10) σ cowj,bullj = cov g cowj , g bullj .
Inferences were based on 500,000 Gibbs samples. The first 50,000 samples were discarded as burn-in, and every 500th sample was saved for post-Gibbs analyses. The mean of the variance and covariance terms, which are calculated in each MCMC iteration, is used later. Convergence was assessed using the R package CODA [20].
The BayesAS models presented in this study can be considered as extensions of the Bayes A model of Meuwissen et al. [8], which mainly differ in that estimates of variances are per SNP groups (segments) instead of per single SNP. In this case, taking one SNP as a segment might be considered as an approximation to the BayesA approach. However, there is still a difference in that the scale parameter of the χ −2 SC 2 i , df i prior for σ 2 a * ij is treated as unknown instead of being fixed. Moreover, the bivariate versions of BayesAS uniquely use latent variables to model covariances between traits.

Comparison of the predictive ability between models
A resampling strategy using cows in five test sets was implemented to compare models for prediction reliability. Our aim was to avoid sibling relationships between each test set and between the training and test sets. Hence, 197 cows, which had no siblings in the dataset, were selected. In each of the resampled analyses, 100 of the 197 cows were randomly taken for the test set, while the remaining 97 cows from each random sampling were included in the reference population of 550 cows. For all models, prediction reliability for cows was computed as the squared correlation between estimated GBV and the phenotype corrected for fixed effects as in Model (1), divided by heritability estimates [21] from the complete dataset of 650 cows using Model (1). Since the major practical implication of genomic prediction studies is to assess the predictive ability of models for young bulls with no phenotypic record, reliabilities of models in the MT-BayesAS analyses were computed for bulls using standard errors of predicted GBV using the following formula, as described by Mrode [22]: where SEP l is the standard error of prediction (posterior standard deviations from MCMC samples) of GBV for each bull based on its Gibbs samples for each protein composition trait; and σ i 2 is the total genomic variance calculated as in Eq. (6), which, as an approximation, was taken as the additive genetic variance. Model reliabilities were computed for all bulls, and the average was taken as the model reliability for the respective trait.
Further analyses were conducted using the Gibbs samples from the 100-SNP segment size MT-BayesAS model to assess prediction reliability when varying the proportion of segments, based on ranking of explained genomic variance. Prediction reliabilities were, accordingly, computed using the top 2% (8), 7% (26), 15% (56), 25% (93), 50% (186), or 75% (279) of all 372 genomic segments included in the analyses. Segments were ranked on estimated variance based on evaluation on the training data with all segments included. Reliabilities were computed for the test sets similarly as in the other BayesAS models and were used to compare the different scenarios. Table 1 presents heritability estimates for traits related to milk protein composition obtained with the ST-GBLUP model, their genome-wide correlations and covariances with total milk protein yield obtained with the MT-GBLUP model. Heritability estimates were high for κ-CN,

Prediction reliability of the GBLUP models
Prediction reliabilities were low for all traits (0.03 to 0.21) when using the ST-and MT-GBLUP models ( Table 2). Compared to the other protein composition traits, β-LG (0.21) and κ-CN (0.16) had the highest prediction reliabilities, whereas α S2 -CN and α S1 -CN-8P had the lowest (0.03) when using univariate analysis. There was a slight gain in prediction reliability for α S2 -CN and protein percentage when bivariate analysis was used. There was no improvement in prediction reliability for κ-CN, G-κ-CN, β-LG, or α S1 -CN-8P compared to ST-GBLUP predictions. Prediction reliability was a little lower with the MT-GBLUP model than with univariate prediction for α S1 -CN and α-LA.
Genome segment-wise variances for milk protein composition traits and covariance with total protein yield Figure 1 presents the proportion of genomic variance in milk composition traits explained by each chromosome using the ST-BayesAS model. Marked differences were observed in the proportion of genomic variance explained by genome segments across the traits. For some of the protein composition traits, a single chromosome explained up to or more than half of the genomic variance. For instance, Bos taurus (BTA) chromosome 6 explained 76, 63 and 47% of the genomic variance for κ -CN, G-κ-CN and α S2 -CN, respectively. Likewise, 40% of the genomic variance for β-LG was explained by BTA11 alone. Figure 2 shows the covariances between traits related to milk protein composition and total protein yield explained by genomic segments of 100 SNPs. Across the traits, some segments explained a large part of the covariance, whereas others accounted for nearly no covariance. Covariances between total milk protein yield and a particular trait were positive for some segments and negative for others. For G-κ-CN, κ-CN, β-LG, α S2 -CN, and protein percentage, a few segments showed peaks for the explained covariance. Segment 106, corresponding to a group of 100 adjacent SNPs on BTA6, explained a large amount of positive covariance of α S2 -CN, κ-CN, and G-κ-CN with total protein yield. Similarly, a sizable proportion of the covariance between β-LG and protein yield was explained by a single segment on BTA11. A segment on BTA14 explained a substantial part of the negative covariance between protein percentage and protein yield. The same segment showed a peak for the covariance between α S1 -CN-8P and total milk protein yield compared to the rest of the segments. Although some segments explained relatively more covariance between α S1 -CN and total protein yield and between α-LA and total protein yield compared to other segments, the actual covariance values explained by these segments were very low (note the difference in y-axis scales between plots in Fig. 2).

Prediction reliability with BayesAS models
Prediction reliabilities for cows using the BayesAS models were generally high compared to those obtained with the GBLUP models across all traits. Prediction reliabilities using both the MT- (Fig. 3) and ST-BayesAS models were generally high for most of the highly heritable traits, such as κ-CN, G-κ-CN, and β-LG, using different segment sizes. Using the 100-SNP segment size resulted in the highest prediction reliability for all studied protein composition traits in both univariate and bivariate versions of the BayesAS models. Prediction reliabilities using the 100-SNP segment size with the MT-BayesAS model were 0.76 for G-κ-CN, 0.68 for κ-CN, and 0.52 for β-LG. Expanding the segment size to include all SNPs on a chromosome or the whole genome resulted in the lowest prediction reliabilities with the BayesAS models. The performance of the whole-genome-based model was similar to that of the respective GBLUP models. With the MT-BayesAS model, improvement in prediction reliability reached 63% for G-κ-CN, 52% for κ-CN, 31% for β -LG, and 15% for α S2 -CN when using the 100-SNP-based Table 2 Prediction reliability from univariate and bivariate GBLUP models a Protein composition expressed as a fraction of the total milk protein percentage by weight wt (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms, i.e., α S1 -CN (comprises α S1 -CN 8P + 9P), α S2 -CN (comprises α S2 -CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated-κ-CN; α S1 -CN-8P = α S1 -CN with 8 phosphorylated serine groups  Fig. 1 Proportion of genomic variance explained by each chromosome. Proportion of the genomic variance in the milk protein composition traits explained by each chromosome from the ST-BayesAS model taking chromosomes as segments model compared to the whole-genome-based model. Prediction reliabilities were low for α S1 -CN, α-LA, and α S1 -CN-8P for all BayesAS models and improved minimally by using the 100-SNP-based model compared to the whole-genome approach. The 50-and 100-SNP models performed similarly well for β-LG. However, for the other proteins, the 100-SNP model outperformed both the 50-and 200-SNP based models, which generally showed comparable results. Using the single-SNP segment size resulted in lower performance compared to the 50-, 100-, and 200-SNP-based models for all traits. Prediction reliabilities computed for β-LG and protein percentage using the single-SNP-based MT-BayesAS model were better than when each chromosome (by 13 and 1 percentage points) or the whole genome was used as the segment (by 18 and 2 percentages points), respectively.
In general, slight additional gains in prediction reliability were achieved using the MT-BayesAS models compared to the univariate BayesAS model (Table 3), i.e., 6 and 5 percentage points for G-κ-CN and κ-CN  Fig. 2 Covariance between each protein composition trait with total protein yield explained by 100-SNP genomic segments using 100 SNP-segments and the average improvement with this segment size was 3 percentage points. However, improvement in prediction reliability from the MT-BayesAS models declined when the whole genome was taken as segment, which resulted basically in similar performances than the ST version except for β-LG. Table 4 shows the reliabilities of the MT-BayesAS models for bulls with segments of different sizes. Prediction reliability computed for the cow datasets was higher than that for bulls for G-κ-CN while the reverse was found for α S2 -CN. Higher model reliabilities were computed for bulls for α S2 -CN, κ-CN, G-κ-CN and β-LG with the 50-and 100-SNP segments compared to the other MT-BayesAS models. On the contrary, prediction reliability did not vary much across models for α S1 -CN, α-LA, α S1 -CN-8P and protein percentage, which had relatively low reliabilities. Prediction reliabilities obtained from the MT-GBLUP model were similar to those from the genome-based MT-BayesAS model for all traits and hence are not presented in Table 4. Fig. 3 Prediction reliability across MT-BayesAS models. Reliability of models according to segment sizes of 1, 50, 100, and 200 SNPs, chromosome, and whole genome. G-κ-CN = glycosylated-κ-CN; α S1 -CN-8P = α S1 -CN with eight phosphorylated serine groups Table 3 Prediction reliability from univariate and bivariate BayesAS models a Protein composition expressed as a fraction of the total milk protein percentage by weight wt (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms,i.e., α S1 -CN (comprises α S1 -CN 8P + 9P), α S2 -CN (comprises α S2 -CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated-κ-CN; α S1 -CN-8P = α S1 -CN with 8 phosphorylated serine groups Prediction reliabilities with selected genome segments Figure 4 shows prediction reliabilities according to the proportion of selected 100-SNP segments used in the prediction. Using fewer segments that explain large proportions of the variances resulted in higher predictive ability for G-κ-CN, κ-CN, β-LG, α S2 -CN, and protein percentage. For these traits, prediction reliability using only 2% (8) of the top-ranked segments resulted in the highest reliability, whereas prediction reliability decreased as more segments were added. In contrast, prediction reliability increased as more segments were added for α -LA, α S1 -CN, and α S2 -CN-8P, with the highest reliability obtained when all segments were used for prediction.

ST-GBLUP versus MT-GBLUP models
Using only the cow dataset with the GBLUP model resulted in low prediction reliability, due to the small Table 4 Model reliability for bulls across the MT-BayesAS models a Protein composition expressed as a fraction of the total milk protein percentage by weight wt (wt/wt), protein % expressed as percentage of the total milk yield; individual proteins comprise only the peaks identified as intact proteins and isoforms,i.e., α S1 -CN (comprises α S1 -CN 8P + 9P), α S2 -CN (comprises α S2 -CN 11P + 12P), κ-CN (comprises κ-CN G 1P + unglycosylated κ-CN 1P), where P = phosphorylated serine group. G-κ-CN = glycosylated-κ-CN; α S1 -CN-8P = α S1 -CN with 8 phosphorylated serine groups  size of the training dataset. Reference population size is a key factor that affects reliability of genomic prediction in cattle [4,7,23]. Moreover, a small sample size may not sufficiently reflect the genetic variability. For instance, considering a subset of the cow dataset used in this study, Poulsen et al. [24] showed that the genetic variation of the CSN1N1 gene was very low in Danish Holstein, with most individuals having the BB genotype, which may explain the lower prediction reliability for α S1 -CN and its sub-fraction α S1 -CN-8P. Although information on total protein yield from a large number of bulls was added when using the MT-GBLUP model, prediction reliabilities were as poor as, or even worse than, those in the univariate analysis. Thus, addition of information from total milk protein yield was not sufficient to offset the computational burdens of the bivariate analyses, due to the low genome-wide correlation between protein yield and composition traits. Among the milk proteins, the highest genome-wide correlation with total protein yield was measured for α S2 -CN (− 0.16) and protein percentage (− 0.14), for which the MT-GBLUP model resulted in slightly improved prediction reliabilities for cows and bulls. Although α-LA had a correlation of 0.15 with total protein yield, the standard error of the correlation was higher than the correlation estimate (0.16). Although the data used was limited, our findings on genome-wide correlations were comparable to results from previous studies. In the literature, genetic correlations between milk protein percentage and protein yield in different dairy cattle breeds are low, in general [25][26][27].
Moreover, all the bivariate analyses in our study involved combination of data on different scales, which may have influenced the computed reliabilities. DRP for milk protein yield were expressed on a lactation basis (305-day production), whereas protein composition traits and percentage were related to one morning milk sample. In our study, prediction reliabilities for the traits related to milk protein composition traits were expected to improve if both traits in the bivariate analyses were on a similar scale.

Predictive ability of BayesAS models
Prediction reliabilities from the resampling showed large improvements with the ST-and MT-BayesAS models compared to their GBLUP counterparts. The BayesAS models allow for different variances and covariances by SNP groups, which can deviate from the genome-wide (co)variance. This was especially important for some traits for which one or two key segments alone explained a large part of the total variance. Grouping adjacent SNPs seems to have helped to obtain more reliable estimates from a small dataset while allowing the segments to have different variances that disentangled heterogeneous (co) variance patterns and improved prediction reliability. Similarly, a simulation study by Shariati et al. [28] showed that prediction reliability based on SNP grouping was better than that obtained by SNP-BLUP methods. SNP grouping in the study of Shariati et al. [28] was based on similar effect sizes. Other grouping options also exist, e.g. depending on LD between SNPs [29]. The BayesAS models can also be used to implement such grouping strategies for which segment sizes might vary depending on LD or effect size similarity.
Prediction reliability with the BayesAS models appears to depend highly on the segment sizes considered and the genetic architecture of the traits. Comparison between the BayesAS models with different segment sizes showed that grouping 100 adjacent SNPs resulted in superior performance for all proteins. Grouping 50 SNPs was as predictive as the models based on 100-SNP segments for all traits except G-κ-CN for which prediction reliability improved by 9 percentage points with the 100-SNP segment size model. Taking each SNP as a segment resulted in lower prediction reliability than groups of 50, 100, or 200 adjacent SNPs for most traits. With our BayesAS models, prediction reliabilities decreased as segment size increased beyond 100 SNPs in both the univariate and bivariate analyses. The lowest reliabilities were obtained when considering each chromosome or the whole genome as segments. In other words, the (co)variance between segments was diluted as segment size increased beyond 100 SNPs. Similarly, Brøndum et al. [30] reported that using a segment size of 100 SNPs resulted in the highest accuracy in an across-breed genomic prediction study for protein, fat, and milk yield using 465,000 SNPs. Defining the optimal segment size, in terms of number of adjacent SNPs, is critical to achieving meaningful gains from the novel models presented here. Optimal segment size should be established for each specific situation, for instance through some resampling strategy, considering the SNP array, species, and LD in the population.
The gain in prediction reliability from using different segment sizes in the BayesAS models also varied across the traits. In both the ST-and MT-BayesAS models, differences in prediction reliability between segment sizes were very large for G-κ-CN, κ-CN, α S2 -CN, and β-LG, whereas across all models they were smaller for α S1 -CN, α S1 -CN-8P, or α-LA. These results are likely related to the genetic architecture of the protein composition traits investigated. Previous genome-wide association studies found that the proportions of κ-CN, α S2 -CN, and β -LG in milk are controlled by major QTL on BTA6 and 11 [12], which carry the casein gene cluster and the gene encoding β-LG [31], respectively. On the one hand, a single chromosome could explain a very large proportion of the variance for some protein composition traits, including G-κ-CN, κ-CN, β-LG, and α S2 -CN, which showed the largest improvement in reliability when the heterogeneity of variances across the genome segments was accounted for. On the other hand, the proportion of explained variance by each chromosome was very small for α S1 -CN and α-LA, which indicates that many segments contribute small proportions to the average variance. Similarly, Buitenhuis et al. [31] found no major region that was significantly associated with α S1 -CN in the Danish Holstein population, which could be associated to the low genetic variability of the CSN1N1 gene reported for this population by Poulsen et al. [24]. This result indicates that SNP grouping is more useful for traits that are controlled by QTL with major effects.
Comparison between the univariate and bivariate versions of our BayesAS models showed that for the most informative traits, the MT version resulted in further improvements in prediction reliability of up to 6 percentage points for segment sizes of 100 and 50. While further improvements in prediction reliability of up to 6% from the MT-BayesAS over the univariate versions are still important, it was generally lower than expected. Further investigations are required to understand the impact of genetic architecture of the indicator trait(s) on the potential advantages, over univariate analysis, of our bivariate BayesAS models.
A few segments explained a substantial proportion of the genomic variance for traits related to milk protein composition and their covariance with protein yield. Thus, we investigated the reliability of predictions based on only a few of the best-explaining 100-SNP segments. Predictions based on only 2% (8/372) of the genome segments resulted in the highest prediction reliability for G-κ -CN, κ-CN, β-LG, and α S2 -CN. For these proteins, prediction reliability decreased as more segments were added. Inclusion of more segments that explained a smaller proportion of the (co)variance added noise rather than meaningful information. Similarly, in a simulation study based on a GBLUP approach, Sarup et al. [32] demonstrated that including non-causal markers led to dilution of the effect of causal markers and reduced predictive ability. For other protein composition traits, including α S1 -CN-8P, α S1 -CN, and α-LA, prediction reliability improved as more segments were included, with the highest prediction reliability being obtained when all segments were considered. This result is in agreement with our finding on the proportions of genomic covariance explained by 100-SNP segments, where many segments across the genome contributed small proportions of the average covariance between these traits and total protein yield. In this study, we have used the same dataset to rank the top segments and do the prediction. This could lead to overestimation of reliability and introduce prediction bias. However, such bias is expected to be minimal as the SNP effects in these top segments are re-estimated for prediction with the different proportion of segments.

Conclusions
A novel BayesAS model, which allows exploring and modeling heterogeneous variance and covariance patterns across genomic regions, improved prediction reliabilities for milk protein composition traits with a small dataset compared to the GBLUP and single-SNP based Bayesian models. The number of adjacent SNPs grouped together affected prediction reliability for the BayesAS models. A segment size of 100 SNPs gave the highest prediction reliability using 36,000 SNPs spread across the genome. For the most informative traits (highest genomic reliability), a further gain in reliability was observed when using the bivariate versions of our BayesAS models compared to univariate counterparts. Our results also show that the gains in prediction reliability achieved by SNP grouping depend on the genetic architecture of the traits. A future study with simulated data would be useful to test our novel BayesAS models with larger datasets.