A novel method for the estimation of the relative importance of breeds in order to conserve the total genetic variance

The need for conservation of farm animal genetic resources is widely accepted. A key question is the choice of breeds to be conserved. For this purpose, a core set of breeds was introduced in which the total genetic variance of a hypothetical quantitative trait was maximised (MVT core set). For each breed the relative contribution to the core set was estimated and the breeds were ranked for conservation priority according to their relative contribution. The method was based on average kinships between and within breeds, which can be estimated using genetic marker data. The method was compared to a recently published core set method that maximises the variance of a hypothetical population that could be obtained by interbreeding the conserved breeds (MVO core set). The results show that the MVT core set favours breeds with a high within breed kinship that are not related to other breeds, whereas the MVO core set favours unrelated breeds with a low within breed kinship. Following this, the MVT core set method suggests conserving breeds that show a large difference in the respective population mean of a hypothetical quantitative trait. This maximises the speed of achieving selection response for this hypothetical selection direction. Additionally, bootstrap based methods for the estimation of the breed's contribution to the core sets were introduced, substantially improving the accuracy of the contribution estimates.


INTRODUCTION
On a world-wide level, there are roughly 6000 breeds of 30 domestic mammalian and bird species. Around 35% of them are classified as having a high risk of extinction and every week two breeds permanently vanish [9] resulting in an irreversible loss of animal genetic resources. The need to conserve these resources is widely accepted mainly because it can be seen as an insurance against future challenges and conditions but also for ecological and sociocultural reasons. Because funds to preserve animal genetic resources are limited, an optimal allocation of these funds is of central importance. Within this framework a key question is the choice of breeds for conservation programmes. Ruane [14] reported several criteria for this selection, such as specific adaptive features, particular traits of special interest and genetic uniqueness. This last feature is aimed at maintaining the genetic variance, an aspect upon which we will focus exclusively throughout this paper.
Pairwise genetic distances and their graphical representation in distance trees are common tools used to assess the genetic uniqueness of a particular breed within a set of breeds. Genetic distances are usually estimated from the genotypic information of a set of neutral loci. Weitzman [16,17] developed an algorithm for the estimation of the genetic diversity within a set of elements based on pairwise distances under the assumption that all elements are distinct and are obtained from a single founder population by fission. The application of this algorithm to a set of breeds makes it possible to rank the breeds for their priority for conservation according to their contribution to the Weitzman diversity. This diversity measure has a number of nice mathematical and biological properties [15,16] and was used recently in several breed genetic diversity studies [2,11,13]. However, as pointed out by Eding and Meuwissen [3] and Caballero and Toro [1], the use of the Weitzman diversity measure on a within-species breed level might be inappropriate because it ignores migration between breeds, which is unrealistic, and also ignores within breed diversity.
Instead of genetic distances, Eding and Meuwissen [3] and Caballero and Toro [1] used average kinships between and within breeds for the description of genetic diversity. Kinship is defined as the probability that two gametes randomly drawn from a population are identical by descent. Following this, the average kinship between two breeds is an estimate of the fraction of alleles that these breeds have in common. In order to prioritise breeds for conservation, Eding et al. [4] defined a core set that is built by relative contributions of the breeds under consideration in order to minimise the mean kinship in this core set. The core set maximises the variance of a hypothetical quantitative trait that can be found in a hypothetical population obtained from interbreeding the conserved breeds [4]. The breeds are ranked for their priority for conservation according to their relative contribution to the core set. Eding and Meuwissen [5] described a method that estimates average kinships between breeds using similarities of genetic marker alleles. By way of simulations, the authors demonstrated that the accuracy of the estimation of the breed's contribution to the core set from estimated kinships is only moderate, indicating that there is scope for improvement. However, the core set method of Eding et al. [4] does not explicitly consider the variance that can be found between breeds. An analogue of the core set method for prioritising breeds for conservation has been presented by Caballero and Toro [1]. Using a similar approach, Piyasatian and Kinghorn [12] defined genetic diversity as the amount of allelic variation that can be found within and between subdivided breeds. A breed is preferred for conservation if it contributes significantly to the total allelic variation. 
The conceptual difference between the core set diversity method and the approach of Piyasatian and Kinghorn [12] on the one hand and the Weitzman diversity on the other hand is that the former methods consider both the between and within breed variation and account for possible migration between breeds. Fabuel et al. [7] showed that the two approaches can produce different results.
The aim of this paper was to put forward a conservation criterion that values the differences between breeds more than the core set method of Eding et al. [4] does. For this purpose an algorithm was introduced that estimates the relative contribution of breeds to a core set in order to maximise the total additive genetic variance of a hypothetical quantitative trait. The relative contributions were used to rank the breeds for conservation priority. The method was compared to the core set method of Eding et al. [4] using simulated and real data. Additionally, bootstrap based methods were introduced, substantially improving the accuracy of contribution estimates.

Core set methods
Assume a population of $n$ animals in which each animal $i$ has a breeding value $u_i$ for a hypothetical quantitative trait. The additive genetic variance within the population is:

$$\mathrm{Var}_{within} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{var}(u_i) - \mathrm{var}(\bar{u}),$$

where $\mathrm{var}(u_i)$ is the variance of the breeding value $u_i$ of animal $i$ and $\mathrm{var}(\bar{u})$ is the variance of the population mean of the breeding values. The averaging is needed because individual animals may have different $\mathrm{var}(u_i)$. Ignoring the total additive genetic variance of the trait, $\sigma_u^2$, since it multiplies the results by a constant, yields:

$$\mathrm{Var}_{within} = \frac{1}{n}\sum_{i=1}^{n}A_{ii} - \bar{A}, \qquad (1)$$

where $\bar{A}$ denotes the average of the elements of the numerator relationship matrix, $A$, with dimension $n \times n$. $A_{ii}$ is diagonal element $i$ of $A$; this element corresponds to one plus the inbreeding coefficient of animal $i$. The elements of the numerator relationship matrix are twice the elements of the kinship matrix [8]. In the following, this outline is transferred from a single breed level to a multiple breed level. Assume a set $S$ of $N$ breeds with a known average kinship matrix, $M$, of dimension $N \times N$ as described by Eding and Meuwissen [3,5]. The off-diagonal elements of $M$ are the average kinships between breeds and correspond to the inbreeding coefficient of putative offspring from the corresponding between breed mating. The diagonal elements of $M$ are the average within breed kinships and correspond to the inbreeding coefficients of putative offspring from within breed mating. Following (1), the additive genetic variance of a hypothetical quantitative trait within set $S$ can then be described by

$$\mathrm{Var}_{total}(S) = \frac{1}{N}\sum_{i=1}^{N}(1 + M_{ii}) - 2\bar{M} = 1 + \frac{1}{N}\sum_{i=1}^{N}M_{ii} - 2\bar{M}, \qquad (2)$$

where $M_{ii}$ is the within breed kinship of breed $i$ obtained from the diagonal elements of $M$ and $\bar{M}$ denotes the mean of all the elements of $M$. A one is added to $M_{ii}$, because the $A_{ii}$ elements in (1) correspond to one plus the inbreeding coefficient, and $\bar{M}$ is multiplied by two because relationships are twice the kinships. In order to maximise the total genetic variance, a core set $S_{mvt}$ is formed by relative contributions of the $N$ breeds of $S$. This core set is termed the maximum variance total (MVT) core set in the following.
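The multi-breed variance described above can be illustrated with a few lines of Python. This is a minimal sketch, assuming that (2) has the form 1 + mean(diagonal of M) − 2 × mean(all elements of M), which is how it follows from (1) when relationships are twice the kinships. The function name `total_variance` and the example matrix are hypothetical and not taken from the paper.

```python
def total_variance(M):
    """Total additive genetic variance of a set of breeds, relative to the
    base breed, from an average kinship matrix M (list of lists).
    Assumed form of (2): 1 + mean within breed kinship - 2 * mean kinship."""
    n = len(M)
    mean_within = sum(M[i][i] for i in range(n)) / n
    mean_all = sum(sum(row) for row in M) / (n * n)
    return 1.0 + mean_within - 2.0 * mean_all


# Hypothetical two-breed example: breed 2 is more inbred than breed 1.
M_example = [[0.10, 0.02],
             [0.02, 0.30]]
var_total_example = total_variance(M_example)
```

A single non-inbred breed (M = [[0.0]]) returns 1, i.e. the variance of the base breed, which matches the interpretation of the variances as relative to the base breed.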
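The contributions of the breeds to the MVT core set are stored in a vector, $c_{mvt}$, of dimension $N$. The total genetic variance conserved by the MVT core set, $\mathrm{Var}_{total}(S_{mvt})$, is then calculated by:

$$\mathrm{Var}_{total}(S_{mvt}) = 1 + c_{mvt}'F - 2\,c_{mvt}'Mc_{mvt}, \qquad (3)$$

with $F$ being a vector of dimension $N$ that contains the within breed kinships, i.e. $F = \mathrm{diag}(M)$. In order to find the solutions for $c_{mvt}$ under the restrictions $c_{mvt}(i) \geq 0$ and $c_{mvt}'1_N = 1$, the following Lagrangian function is set up:

$$L(c_{mvt}, \lambda) = 1 + c_{mvt}'F - 2\,c_{mvt}'Mc_{mvt} - \lambda\,(c_{mvt}'1_N - 1), \qquad (4)$$

with $1_N$ a vector of dimension $N$ containing ones and $\lambda$ as the Lagrangian multiplier. In order to maximise (4), the first derivative with respect to $c_{mvt}$, i.e. $F - 4Mc_{mvt} - \lambda 1_N$, is set to zero and solved for $c_{mvt}$:

$$c_{mvt} = \tfrac{1}{4}M^{-1}(F - \lambda 1_N). \qquad (5)$$

Because $c_{mvt}'1_N = 1$ it can be written:

$$\tfrac{1}{4}\,1_N'M^{-1}(F - \lambda 1_N) = 1.$$

Solving for $\lambda$ yields:

$$\lambda = \frac{1_N'M^{-1}F - 4}{1_N'M^{-1}1_N}. \qquad (6)$$

The solution for $c_{mvt}$ is obtained by substituting (6) into (5):

$$c_{mvt} = \frac{1}{4}M^{-1}\left[F - \frac{1_N'M^{-1}F - 4}{1_N'M^{-1}1_N}\,1_N\right]. \qquad (7)$$

This core set method is compared to the core set method of Eding et al. [4], which maximises the variance in a hypothetical population that could be bred from the conserved breeds, i.e. the offspring variance. The Eding et al. [4] core set method will be termed the maximum variance offspring (MVO) core set method. The MVO method minimises the average kinship within the core set, $f_{min}(S_{mvo})$. In brief, the relative breed contribution vector, $c_{mvo}$, is calculated under the restrictions $c_{mvo}(i) \geq 0$ and $c_{mvo}'1_N = 1$ such that

$$f(S_{mvo}) = c_{mvo}'Mc_{mvo} \qquad (8)$$

is minimised. The solution for $c_{mvo}$ is:

$$c_{mvo} = \frac{M^{-1}1_N}{1_N'M^{-1}1_N}. \qquad (9)$$

The offspring genetic variance maximised by this core set method is calculated by:

$$\mathrm{Var}_{offspring}(S_{mvo}) = 1 - f_{min}(S_{mvo}) = 1 - c_{mvo}'Mc_{mvo}. \qquad (10)$$

See Eding et al. [4] for the full derivation of this core set method. Under the assumption that all breeds are descendants of a single and noninbred base breed, both genetic variances, the total variance and the offspring variance, can be seen as the variance relative to the variance of the base breed. Both core sets use the same average marker kinship matrix $M$, whose estimation is described below.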
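As a quick consistency check of the two solutions, the short derivation below may help. It is a sketch that assumes (7) and (9) take the closed forms written in the first line (which is how they follow from the Lagrangian argument described above) and that $M$ is symmetric and invertible. It shows that the MVT contributions sum to one and that the minimum average kinship of the MVO core set equals $1/(1_N'M^{-1}1_N)$.

```latex
% Assumed closed forms of (7) and (9); M symmetric and invertible.
\begin{align*}
c_{mvt} &= \tfrac{1}{4}M^{-1}\Bigl[F - \frac{1_N'M^{-1}F - 4}{1_N'M^{-1}1_N}\,1_N\Bigr],
\qquad
c_{mvo} = \frac{M^{-1}1_N}{1_N'M^{-1}1_N} \\[4pt]
1_N'c_{mvt} &= \tfrac{1}{4}\Bigl[1_N'M^{-1}F - \bigl(1_N'M^{-1}F - 4\bigr)\Bigr] = 1 \\[4pt]
c_{mvo}'Mc_{mvo} &= \frac{1_N'M^{-1}MM^{-1}1_N}{(1_N'M^{-1}1_N)^2}
                  = \frac{1}{1_N'M^{-1}1_N}
\end{align*}
```

Neither closed form enforces the non-negativity restriction on its own, so negative elements can occur and have to be handled separately.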

Estimation of the average kinship matrix M from genetic marker data
Ideally, the average kinship matrix $M$ is calculated using pedigree information, but often no appropriate pedigree information is available. In these cases $M$ can be estimated using molecular marker information [1,3,5]. Here a method of Eding and Meuwissen [5] was used and is described in the following. A similarity index between all pairs of individuals genotyped for a marker $k$ ($S_{xy,k}$) was calculated as $S_{xy,k} = \frac{1}{4}[I_{11} + I_{12} + I_{21} + I_{22}]$, where $I_{ij}$ is an indicator variable which is 1 when allele $i$ in individual $x$ and allele $j$ in individual $y$ are identical, and 0 otherwise. The average similarities between breeds $i$ and $j$ for locus $k$ ($\bar{S}_{ij,k}$) were estimated by:

$$\bar{S}_{ij,k} = \frac{1}{n_{ij}}\sum_{x \in i}\sum_{y \in j}S_{xy,k}, \qquad (11)$$

where $n_{ij}$ is the number of combinations of individuals between breeds $i$ and $j$.
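The similarity index and its between breed average can be sketched as follows. This is an illustrative example only: genotypes are represented as hypothetical pairs of allele labels, the function names are not from the paper, and the within breed case (where an individual would presumably not be paired with itself) is deliberately left out.

```python
from itertools import product


def similarity(genotype_x, genotype_y):
    """S_xy,k for one marker: the fraction of the four allele comparisons
    (I11, I12, I21, I22) in which the two alleles are identical."""
    return sum(ax == ay for ax, ay in product(genotype_x, genotype_y)) / 4.0


def average_similarity(breed_i, breed_j):
    """Average similarity between two different breeds at one marker,
    taken over all n_ij pairs of individuals."""
    pairs = [(x, y) for x in breed_i for y in breed_j]
    return sum(similarity(x, y) for x, y in pairs) / len(pairs)


# Hypothetical genotypes at a single marker (allele labels are arbitrary).
breed_a = [(1, 1), (1, 2)]
breed_b = [(1, 2), (3, 3)]
s_ab = average_similarity(breed_a, breed_b)
```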
The expected value of the average similarity is

$$E(\bar{S}_{ij,k}) = f_{ij} + (1 - f_{ij})\,s_k,$$

where $f_{ij}$ is the average kinship between breeds $i$ and $j$ and $s_k$ is the probability of the alleles being Alike In State (AIS). By subtracting both sides from one and taking the natural logarithm of this, it is possible to set up a log-linear model for the estimation of kinships from marker data [5]:

$$\ln(1 - \bar{S}_{ij,k}) = \ln(1 - f_{ij}) + \ln(1 - s_k) + e_{ij,k}, \qquad (12)$$

where $e_{ij,k}$ is the error term. In matrix notation the model becomes

$$y = X_a a + X_b b + e, \qquad (13)$$

where $y$ is a vector with the observations and $X_a$ and $X_b$ are incidence matrices relating the observations to the effects $a$ ($a_{ij} = \ln(1 - f_{ij})$) and $b$ ($b_k = \ln(1 - s_k)$) stored in the vectors $a$ and $b$, respectively, and $e$ is the vector of errors. The expected variance of the observations, referred to as (14), is given by Eding and Meuwissen [5]. Taking these variances into account, a weighted log-linear model was formulated using the following equations:

$$\begin{bmatrix} X_a'W^{-1}X_a & X_a'W^{-1}X_b \\ X_b'W^{-1}X_a & X_b'W^{-1}X_b \end{bmatrix}\begin{bmatrix} \hat{a} \\ \hat{b} \end{bmatrix} = \begin{bmatrix} X_a'W^{-1}y \\ X_b'W^{-1}y \end{bmatrix}, \qquad (15)$$

where $W$ is a diagonal matrix that contains the corresponding expected variances of the observations. Because the weights are obtained from the estimates, the final solutions were estimated iteratively (100 iterations). The design matrices showed dependencies and a generalised inverse of the coefficient matrix was used, with the consequence that the solutions were not unique. Therefore, the solutions to $a$ were restricted by setting the highest $a_{ij}$ (i.e. the smallest kinship) to zero. This was aimed at defining the breed that existed just before the first fission event as the base breed. After solving this system, the kinships were calculated by back transformation, i.e.:

$$\hat{f} = 1 - \exp(\hat{a}), \qquad (16)$$

where $\hat{f}$ is a vector (dimension $0.5N(N + 1)$) containing the kinships and the exponential is applied elementwise. These kinships were finally transferred into the average kinship matrix $\hat{M}$. See Eding and Meuwissen [5] for a more detailed description.
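The log transformation behind the model can be checked numerically. The sketch below assumes that the expected similarity has the form f + (1 − f)s, which is the form implied by the definitions a = ln(1 − f) and b = ln(1 − s); the function names and the numerical values are hypothetical. It does not fit the weighted model itself, it only shows that the transformation is additive and that the back transformation recovers the kinship.

```python
import math


def expected_similarity(f, s):
    """Assumed expected similarity: alleles are identical by descent with
    probability f, otherwise alike in state with probability s."""
    return f + (1.0 - f) * s


def back_transform(a):
    """Kinship recovered from a log-linear solution a = ln(1 - f)."""
    return 1.0 - math.exp(a)


# Hypothetical kinship and alike-in-state probability.
f_ij, s_k = 0.10, 0.30
observation = math.log(1.0 - expected_similarity(f_ij, s_k))
a_ij = math.log(1.0 - f_ij)
b_k = math.log(1.0 - s_k)
f_recovered = back_transform(a_ij)
```

Because 1 − [f + (1 − f)s] factorises into (1 − f)(1 − s), `observation` equals `a_ij + b_k` up to rounding, which is what makes the model linear on the log scale.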

Contribution vector estimation from $\hat{M}$
The breed contribution vectors to the MVT and MVO core sets (i.e. $\hat{c}_{mvt}$ and $\hat{c}_{mvo}$, respectively) were calculated from $\hat{M}$ and $\hat{F}$ ($\hat{F} = \mathrm{diag}(\hat{M})$) as outlined in (7) and (9). If breeds showed negative contributions, the most negative contribution was set to zero and the contribution vector was recalculated without the corresponding breed. This was repeated until no further negative contribution estimates were observed. We will refer to this method of contribution vector estimation as the Eding et al. method (EEA), because it is analogous to the idea of Eding et al. [4].
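The iterative removal of negative contributions can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: (7) is taken as c = ¼ M⁻¹[F − λ1] with λ = (1′M⁻¹F − 4)/(1′M⁻¹1), (9) as c = M⁻¹1/(1′M⁻¹1), and all function names, the small linear solver and the example matrix are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def mvo_contributions(M):
    """Assumed form of (9): M^-1 1 scaled to sum to one."""
    x1 = solve(M, [1.0] * len(M))
    return [v / sum(x1) for v in x1]


def mvt_contributions(M):
    """Assumed form of (7), with F = diag(M)."""
    n = len(M)
    x1 = solve(M, [1.0] * n)                     # M^-1 1
    xF = solve(M, [M[i][i] for i in range(n)])   # M^-1 F
    lam = (sum(xF) - 4.0) / sum(x1)
    return [0.25 * (xF[i] - lam * x1[i]) for i in range(n)]


def drop_negative(M, rule):
    """Remove the breed with the most negative contribution and recompute,
    until no negative contribution remains; removed breeds get zero."""
    active = list(range(len(M)))
    while True:
        sub = [[M[i][j] for j in active] for i in active]
        c = rule(sub)
        worst = min(range(len(c)), key=lambda k: c[k])
        if c[worst] >= -1e-12:
            break
        del active[worst]
    full = [0.0] * len(M)
    for k, i in enumerate(active):
        full[i] = c[k]
    return full


# Hypothetical kinship matrix: breed 3 is inbred and related to breeds 1 and 2.
M_example = [[0.10, 0.02, 0.09],
             [0.02, 0.10, 0.09],
             [0.09, 0.09, 0.30]]
c_mvo = drop_negative(M_example, mvo_contributions)
c_mvt = drop_negative(M_example, mvt_contributions)
```

In this constructed example the unrestricted MVO solution is negative for the third breed, so that breed is dropped and the remaining two share the core set equally, whereas the MVT solution keeps a positive contribution for it.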

Contribution vector estimation from $\hat{M}$ by a bootstrap approach
As stated in the introduction, the accuracy of $\hat{c}_{mvo}$ is only moderate when estimated by EEA [5]. In order to obtain a higher accuracy of the contribution vectors for both core set methods ($\hat{c}_{mvt}$ and $\hat{c}_{mvo}$, respectively), a nonparametric bootstrap method was tested as described in the following. A bootstrap sample was generated by performing two sampling steps. In one step, $N$ individuals were sampled with replacement out of the pool of $N$ original individuals within breed. In the second step, $L$ marker loci were sampled with replacement out of the pool of $L$ original marker loci across breeds. Hence, a bootstrap sample consisted of sampled individuals and sampled marker loci. Throughout the study 100 bootstrap samples were generated ($B = 100$). For each bootstrap sample, average similarities were calculated using (11) and the log-linear model was set up using (12) and (13) to obtain the kinships of the bootstrap sample. However, it was not possible to run the weighted log-linear model with the expected variances obtained from (14), because around 100 iterations would have to be performed for each bootstrap sample, which would be computationally too demanding. The similarities from the bootstrap samples were therefore analysed using two different methods.
The first method used model (13) and the equations of (15) but calculated the solutions without weights (i.e. omitting matrix $W$ from the equations in (15)). The solutions were restricted such that the highest $a$ was zero (a restriction is needed in the fixed effects equations (15)) and the kinships were estimated from the solution vector using (16) and stored in the kinship matrix. The breed contribution vectors of the bootstrap sample to both core set methods were obtained using (7) and (9). As for the EEA method, negative contributions were subsequently set to zero, starting with the most negative one, and the contribution vectors were recalculated without the corresponding breed. The final solutions of the vectors $\hat{c}_{mvt}$ and $\hat{c}_{mvo}$ were obtained from the mean of the solutions from the $B$ bootstrap samples. This method of contribution vector estimation is termed the bootstrap method (BM).
The second method again used model (13) and the equations of (15) but used the reciprocal value of the empirical variance of the observations as weights. This variance was estimated once by bootstrapping as shown in the appendix, and was then used for the analysis of all bootstrap samples. The final solutions of the vectors $\hat{c}_{mvt}$ and $\hat{c}_{mvo}$ were obtained as described for the BM method. This method of contribution vector estimation is termed the weighted bootstrap method (WBM).
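The two-step resampling and the averaging over bootstrap samples can be sketched as follows. The data layout (a dictionary mapping a breed name to a list of individuals, each a list of per-locus genotypes) and the function names are assumptions made for this illustration; the kinship estimation and contribution calculation that would be applied to each sample are not shown.

```python
import random


def bootstrap_sample(genotypes, rng):
    """Draw one bootstrap sample: individuals are resampled with replacement
    within each breed, and loci are resampled with replacement once, so that
    the same set of loci is used across all breeds."""
    n_loci = len(next(iter(genotypes.values()))[0])
    loci = [rng.randrange(n_loci) for _ in range(n_loci)]
    sample = {}
    for breed, animals in genotypes.items():
        drawn = [rng.choice(animals) for _ in animals]
        sample[breed] = [[animal[k] for k in loci] for animal in drawn]
    return sample


def mean_contributions(vectors):
    """Final estimate: elementwise mean of the B contribution vectors."""
    b = len(vectors)
    return [sum(v[i] for v in vectors) / b for i in range(len(vectors[0]))]


# Hypothetical data: two breeds, two loci, genotypes as allele pairs.
data = {
    "A": [[(1, 2), (3, 3)], [(1, 1), (3, 4)]],
    "B": [[(2, 2), (4, 4)]],
}
sample = bootstrap_sample(data, random.Random(1))
```

Sampling the loci once per bootstrap sample, rather than separately per breed, keeps the similarities between breeds defined on the same set of markers.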

Application to simulated data
To test and to compare the methods outlined above, two series of Monte Carlo simulations were carried out. The number of replicates was always 10 (test simulations showed that this number is sufficient to obtain reliable results). In the first series, a phylogenetic situation was simulated over 50 generations. A base breed was simulated that consisted of 50 individuals, the size of which was kept constant throughout the simulation. For each individual, L unlinked genetic marker loci were assumed (L = 10 and 20, respectively) and the alleles were randomly assigned to the individuals. The number of alleles per locus in the base breed was randomly chosen from the interval [25, 50]. From a number of test simulations it was found that this interval would produce realistic numbers of alleles still segregating after 50 generations, although in exceptional cases fixation occurred. Each next generation was generated by randomly assigning sires and dams from the current generation as parents of the individuals of the next generation. New breeds were randomly generated by base breed fission between generations 10 and 49 by sampling sires and dams randomly as parents of the individuals of the new breed. The effective size of the new breed was randomly chosen from the interval [24, 76] and was kept constant throughout the simulation. For each breed the male/female ratio was one, and breed sizes were therefore restricted to even numbers. In total, 15 breeds were simulated for each replicate, one base breed and 14 breeds that were formed by fission from the base breed.
During the second series of simulations the phylogenetic structure was fixed. A base breed of size 50 was bred for 50 generations as described above (L = 20). From this, four new breeds were generated at generation 10 by fission as outlined above. In order to obtain different within breed kinships after 50 generations, the effective size of the breed was chosen to be 10, 20, 30 and 40, respectively. The male/female ratio was one.
In generation 50 the genotypic data of the breeds were used for the estimation of the breed contribution vectors for both core sets by use of the EEA, BM and WBM methods as outlined above. In addition, the full pedigree information was recorded during the simulation and this was used to estimate the true average kinships between the breeds using path analysis [8]. The calculated true kinships were corrected in order to define the breed that existed just before the first fission as the base breed [5]:

$$f^{*}_{ij} = \frac{f_{ij} - f_{min}}{1 - f_{min}}, \qquad (17)$$

where $f_{min}$ is the smallest true average kinship, so that the smallest corrected kinship is zero. This transformation was done in order to make these kinships comparable to the marker estimated kinships obtained from (15) and (16). From the true average kinship matrix, the true breed contributions were obtained using (7) and (9). The correlation between the estimated and the true contribution vectors and the mean square error of the estimated contributions were calculated, serving as an empirical measure of the ability of EEA, BM and WBM to estimate correct breed core set contributions. The total variance and the offspring variance within the two core sets were estimated using (3) and (10) and using the correct kinships. Additionally, for the second series of simulations the individual contributions of the five breeds to the core set, averaged over all replicates, were recorded. This revealed the differences between the MVT and the MVO core set methods regarding the breed contributions as a function of their within breed kinships and regarding the variance conserved, respectively.
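The two accuracy measures used to compare estimated and true contributions can be written down in a few lines. This sketch assumes that the correlation is an ordinary Pearson correlation (the text does not name the type) and uses hypothetical contribution vectors; the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_square_error(estimated, true):
    """Mean squared difference between estimated and true contributions."""
    return sum((e - t) ** 2 for e, t in zip(estimated, true)) / len(true)


# Hypothetical true and estimated contribution vectors for three breeds.
c_true = [0.5, 0.3, 0.2]
c_est = [0.4, 0.4, 0.2]
corr = pearson(c_est, c_true)
mse = mean_square_error(c_est, c_true)
```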

Application to real data
To compare the two core set methods using a real case, the field data of Eding and Meuwissen [5] were reanalysed. The data set consisted of 10 Dutch breeds that were genotyped for 11 microsatellite markers and are described in detail by Eding and Meuwissen [5]. Unfortunately only allele frequencies were available. Therefore, the similarity between populations $i$ and $j$ at locus $k$ with $n$ alleles was calculated as described by Eding and Meuwissen [3]:

$$S_{ij,k} = \sum_{l=1}^{n} p_{ik,l}\,p_{jk,l}, \qquad (18)$$

where $p_{ik,l}$ is the allele frequency of allele $l$ at locus $k$ in population $i$. The breed contribution vectors for both core set methods were calculated by EEA and BM. The bootstrap samples could only be generated by sampling the loci (instead of sampling the loci and the animals), because no information regarding the animals was available. For this reason it was also not possible to apply the WBM. Note that the estimation of the empirical variance of the observations required the bootstrap sampling of the individuals (Appendix). To show the relationship between the breeds, the genetic distance between breeds $i$ and $j$ was computed as $\hat{d}_{ij} = \hat{f}_{ii} + \hat{f}_{jj} - 2\hat{f}_{ij}$ [3]. These distances can be interpreted as twice the Nei minimum distance corrected for the allele frequency in the base breed. They were represented graphically in a dendrogram using the Neighbour-Joining algorithm of the Phylip software [10]. For comparison purposes, the Weitzman diversity was estimated using the genetic distances obtained and the software of Cañón and García as applied in [2]. The breeds were ranked for conservation priority using the estimated contributions to the total Weitzman diversity measure.
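When only allele frequencies are available, the similarity and the kinship-based distance reduce to simple sums. The sketch below assumes that the similarity is the sum over alleles of the product of the two allele frequencies and uses the distance formula given in the text; the function names and all numbers are hypothetical.

```python
def similarity_from_frequencies(p_i, p_j):
    """Similarity between two populations at one locus: probability that
    two alleles drawn at random, one per population, are alike in state."""
    return sum(a * b for a, b in zip(p_i, p_j))


def kinship_distance(f, i, j):
    """Distance between breeds i and j from a kinship matrix f:
    d_ij = f_ii + f_jj - 2 f_ij."""
    return f[i][i] + f[j][j] - 2.0 * f[i][j]


# Hypothetical allele frequencies at one locus with three alleles.
p_breed_1 = [0.5, 0.5, 0.0]
p_breed_2 = [0.2, 0.3, 0.5]
s_12 = similarity_from_frequencies(p_breed_1, p_breed_2)

# Hypothetical estimated kinship matrix for two breeds.
f_hat = [[0.10, 0.02],
         [0.02, 0.30]]
d_12 = kinship_distance(f_hat, 0, 1)
```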

Results from the simulations
As expected, when comparing the variance conserved by the MVT and by the MVO core set methods, respectively, the MVT method conserved more total variance and the MVO method more offspring variance. However, the differences between the two core set methods are only minimal (Tabs. I and II). This is regardless of the phylogenetic situation simulated.
The accuracy of the WBM contribution vector is the highest in both core set methods. These contribution estimates showed the highest average correlations with the true contributions and on average the lowest mean square error (MSE of the WBM estimates roughly 0.01 lower compared to the BM and 0.02 lower compared to the EEA estimates, Tabs. I and II). Most total variance in the MVT core set and most offspring variance in the MVO core set were conserved, respectively, when the contribution vectors were estimated by WBM. Generally, as expected, the accuracy of the contribution vector estimates was a function of the number of loci included in the analysis. The WBM produced estimates with a significantly higher accuracy in phylogenetic situations with 10 loci than the EEA with 20 loci (Tab. I). The accuracies of the contribution vector estimates are higher in the MVT core set compared to the corresponding estimates of the MVO core set (Tabs. I and II).
All breeds contributed to the MVO core set (results from the true contributions, Tab. I), but when the contributions were estimated with EEA, many breeds erroneously showed a zero contribution estimate. This was also reported by Eding et al. [4]. This problem was less marked when the BM method was used and did not occur when the WBM method was applied. In contrast, a substantial fraction of breeds did not contribute to the MVT core set (results from the true contributions, Tab. I), but nearly all breeds showed a contribution estimate greater than zero, when using BM and WBM. However, these estimates were in general only very small.
In Table II, the within breed kinships of the five simulated breeds (results from the second series of simulation, fixed phylogenetic situation and fixed and unequal population size) are given as an average over all replicates. Note that all between breed kinships were very similar and low (in general < 0.01), because of the correction in (17). The kinships between breeds become constant after fission [3], and in this series of simulations all breeds were formed by fission at the same time. The breeds with low kinships contributed more to the MVO core set. This was expected, because this core set method is aimed at keeping the average kinship of the set as small as possible [4]. In contrast, the true contributions of the breeds with a higher kinship are slightly larger in the MVT core set. This could not be observed from the estimated contributions, probably due to sampling errors.
In the two initial series of simulations, we sampled both the animals and the loci during the bootstrap approach. In order to investigate which sampling is responsible for the improved performance of the bootstrap approach, an additional simulation was carried out. The method is the same as that used in the first two series of simulations with the exception that either animals only or loci only were bootstrapped. The results are presented in Table III and show that the improvement stems mainly from the sampling of the loci. It seems that sampling of the animals only did not generate enough bootstrap variance to obtain significantly improved contribution estimates.

Table I. Correlation (Corr) between the estimated and the true core set contributions, mean square error (MSE) of contribution estimates and actual conserved total genetic variance (Var_total) and offspring genetic variance (Var_offspring) as functions of the core set, of the number of simulated loci (L) and of the method of contribution estimation. Results from the first series of simulation, averaged over all replicates. (1) n_zero: number of breeds that showed zero contribution to the core set; (2) Var_total and Var_offspring: actual conserved total genetic variance and offspring genetic variance estimated by the use of the contribution vectors and the true kinship matrix.

Table II. Mean square error (MSE) of contribution estimates, actual conserved total genetic variance (Var_total) and offspring genetic variance (Var_offspring) as functions of the core set and of the method for contribution estimation, and contribution estimates as a function of the within breed kinship (f_i).

Results from field data
From the estimated kinship matrix from the field data (Tab. IV) it can be seen that the Heck breed shows the highest level of within breed kinship followed by Galloway, Dutch Black Belted and Improved Red Pied. This is also visualised in the dendrogram of the genetic distances obtained from the kinships (Fig. 1), where these breeds show the longest branch length. The highest mean kinships were also observed for these three breeds. The Holstein Friesian and Limousine breeds contributed considerably to the MVT core set, despite their low within breed kinships and thus their short branch length. This was due to their comparatively low mean kinship, and consequently these two breeds were also the major contributors to the MVO core set (Tabs. IV and V). The third breed of the top three contributors was different in the two core sets. The Heck breed showed the largest contribution to the MVT core set but only a small contribution to the MVO core set. The high MVT contribution of the Heck breed is due to its high within breed kinship and its comparatively low mean kinship. The opposite was true for the Dutch Red Pied (low contribution to MVT and high contribution to MVO core set). As a result of the estimation of the Weitzman diversity, the conservation priorities of the breeds according to their contribution to the Weitzman diversity are given in Figure 1. As expected, the priority ranking was in agreement with the branch lengths of the corresponding breed.
Four breeds showed zero contributions to the MVT core set using EEA and only one using BM (Tab. V). However, the EEA zero contribution breeds received only a very small (between 0.002 and 0.03) value using BM. A similar pattern can be observed in the results of the MVO core set (Tab. V). EEA estimated five, but BM no zero contributions. However, in this core set the EEA zero contribution breeds received considerable contributions when using BM (between 0.02 and 0.05). When the breeds were ranked according to their contributions estimated either by EEA or by BM, there was a general agreement in the order within the core set. In both core sets only the fourth and fifth breeds were changed in their order due to small differences in the contribution estimates estimated by EEA and BM, respectively.

Figure 1. Neighbour-Joining dendrogram of the 10 cattle breeds obtained from genetic distances derived from the between breed kinships. In parentheses the ranking number for prioritisation for conservation obtained by the total variance core set diversity (MVT), by the offspring variance core set diversity (MVO) and by the distance based Weitzman diversity (W). The results of the BM bootstrap approach (Tab. V) were used for the two core set rankings.

Table IV. Estimated kinships within and between the 10 cattle breeds using a weighted log-linear model.

DISCUSSION
This study introduces the MVT core set method for prioritising breeds for conservation in order to optimally maintain total additive genetic variance of a hypothetical quantitative trait. The method uses kinships between and within breeds and for their estimation some of the ideas of Eding and Meuwissen [3,5] were implemented. The following discussion focuses on the differences and similarities of the MVT core set and the MVO core set proposed by Eding et al. [4] and Caballero and Toro [1] and on the applied methods for the breed core set contribution estimation.

MVT versus MVO core set
The differences between the two core set methods are demonstrated by the following simple hypothetical example. Suppose that two unrelated breeds A and B and a breed AB are available for conservation, where AB is obtained by crossbreeding A and B. The MVO method will not give extra value to A and B when AB is already conserved, whilst the MVT method would give extra value to A and B. The latter is because, although A and B contain no alleles that are not present in AB, they may contain genotype and allele combinations that are not present in AB, because in AB the alleles of A and B are mixed. Thus, it may be easier to find particular genotype and allele combinations when A and B are conserved instead of AB, i.e. the genetic variance is more accessible in a set containing A and B compared to a set containing only AB.
The differences between the two core sets become obvious when focussing on the individual breed contributions. From the optimisation term (formula (3)) it follows that the MVT core set prioritises breeds with a high within breed kinship and a low average between breed kinship. In contrast, the MVO core set method favours breeds that show a low average kinship both within and between breeds. This was shown by the results of the second series of simulations (Tab. II). Considering the breed contributions to the two core sets from all simulated configurations and all replicates, the Spearman rank correlation was around 0.7 (averaged over all replicates), indicating that the two core sets prioritise different breeds only to some extent. From formulae (3) and (10), it follows that this extent is also a function of the differences of the within breed kinships of the breeds.
Assuming that the dendrogram in Figure 1 corresponds to a phylogenetic tree, it then appears that the MVO core set method prioritises breeds that are as close as possible to the base breed whereas the MVT core set approach prefers breeds that have drifted further away from the base breed. In terms of allele frequencies, the MVO core set avoids extreme frequencies and therefore maximises the possible directions of selection. The MVT core set prioritises breeds with more extreme frequencies and thus prioritises breeds that already show different combinations of genotypes. Hence, the MVT core set method aims to conserve breeds that show large differences in the respective population mean of a hypothetical quantitative trait. This makes the MVT core set method attractive, because the efficiency of upgrading a breed by introducing genetics from another breed is a function of the difference in the respective population means. The MVT core set thus enables a faster reaction to putatively changed conditions compared to the MVO core set.
The conceptual difference between the two core set diversity methods on the one hand and the Weitzman diversity on the other hand is that both core set methods account for the within breed variance and for migration between breeds. The neglect of these two aspects was seen as the main point of criticism of the application of Weitzman diversity on a within-species breed level [1,3,4], but in the conservation of farm animal genetic resources we are acting on this level. The numerical results of the field data analysis demonstrate the differences between the two core set diversity methods and the Weitzman diversity (Fig. 1). Whereas the latter focuses exclusively on the between breed variance, and hence prioritises breeds for conservation that are most distant from the other breeds, the MVT core set method also accounts for the within breed variance. Indeed, this is the reason why the MVT core set, but not the Weitzman diversity, gives substantial weights to the Holstein Friesian and the Limousine breeds (Fig. 1). Nevertheless, Eding et al. [4] showed that the nice mathematical and biological properties of the Weitzman diversity [15,16] also hold in the MVO core set, if the 'monotonicity in distance property' (i.e. diversity should increase as any of the distances increase) is replaced by a kinship argument: the offspring variance in a set of breeds increases when the kinship between or within breeds decreases. Obviously, if the inbreeding in any of the breeds increases (i.e. distances increase), the total variance in the MVT core set method increases. Accordingly, the MVT core set method fulfils all of the original Weitzman criteria.
Simianer et al. [15] developed formulae for the estimation of marginal diversities of breeds using the Weitzman unit of diversity together with breed extinction probabilities. The marginal diversity of breed i is defined as the change of the conserved diversity when the extinction probability of breed i is changed by one unit [15]. With the use of marginal diversities it is possible to compare different conservation strategies. The authors state that it is also possible to use other diversity units as long as these fulfil the Weitzman criteria [15]. Therefore, the estimation of the marginal diversities is also possible in the core set methods, given that estimates for the extinction probabilities are available. In this case, marginal diversities could replace the relative breed contributions as criteria for conservation priority. The advantage of this prioritisation is that the extinction dynamics of the breeds would be considered. Furthermore, Simianer et al. [15] suggested an approach for combining diversity with other conservation criteria, e.g. special trait characteristics, in order to generate a more comprehensive conservation criterion. Their outline also seems to be valid for the diversity units considered by the two core set methods. Further work is needed to demonstrate the extension of the presented core set methods to the ideas proposed by Simianer et al. [15].
The diversity approach of Piyasatian and Kinghorn [12] uses allelic variation as a criterion for conservation. Allelic variation does not account for the extra variance due to inbreeding (i.e. due to a combination of alleles), and therefore this method is closer to the MVO core set method than to the MVT core set method. However, Piyasatian and Kinghorn [12] recognise that a higher between breed variance is valuable for conservation and therefore give a five times higher weight to the between breed variance compared to the within breed variance.
Throughout this study we focussed on the selection of breeds for a conservation plan rather than on the management of conserved breeds in order to maintain genetic variance. Generally, this management is a question of the selection of individuals as parents for the next generation and a question of the mating scheme of these individuals. It seems advisable to mate the breeds selected for the conservation plan only within breed in order to maintain the combinations of alleles and genotypes within the breeds, and hence, to conserve the between breed variance.

Accuracy of the contribution vector estimation
All three methods for the contribution vector estimation used marker based kinships obtained by a log-linear model as proposed by Eding and Meuwissen [5]. The advantage of this model is the simultaneous use of all available information to correct the average similarities for the AIS probabilities, and hence to obtain the kinships. It was shown that this model produces kinship estimates that show a high accuracy with a small upward bias [5]. Nevertheless, in the same study, the authors found that the accuracy of the contribution vector estimates is only moderate, which is due to the sampling errors in the kinship matrix, despite the high accuracy of the kinship estimates. In order to obtain contribution vectors with a higher accuracy, two bootstrap methods were tested and the results of the simulations (Tabs. I and II) clearly show the superiority of the bootstrap based methods (BM and WBM, respectively) over EEA in both core sets. Not only were the correlations higher and the mean square errors lower but the number of breeds with an estimated zero contribution was also closer to the real value. Remember that the true zero contribution values in the MVT core set received in general only a very small estimate when using the bootstrap based methods; from a practical point of view these values are negligible. See for example the contributions of the Belgian Blue and of the Dutch Friesian breed to the MVT core set (Tab. V). The large fraction of false zero contribution estimates was one of the main problems reported by Eding et al. [4] in contribution vector estimation. It now seems that the use of the bootstrap based methods provides a solution to this problem. The reason is that for each contribution estimate a distribution was generated by bootstrapping, and because of the recursive nature of the optimisation algorithms (i.e. the most negative estimate was set to zero and the contribution vector estimation was repeated without the corresponding breed), the lower bound of the bootstrap distribution was zero. Thus, if at least one contribution estimate of the distribution showed a positive value, the final contribution estimate was small but positive. The significantly better performance of WBM over BM emphasised the advantage of weighting the observations in the log-linear model accordingly. Eding and Meuwissen [5] found that the benefit from the use of the weighted log-linear model is due to the fact that less informative loci receive a lower weight. The present study shows that instead of using the expected variance (formula (14)), it is also possible to use the empirical variance estimated by bootstrapping as shown in the appendix.
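The effect of the zero lower bound can be illustrated with a minimal sketch. It assumes the MVO solution c = K^-1 1 / (1' K^-1 1), which minimises the average kinship c'Kc subject to the contributions summing to one, combined with the recursive removal of the breed with the most negative estimate. The two kinship matrices are hand-made stand-ins for estimates from two bootstrap samples; the resampling of the marker data and the log-linear kinship estimation are not shown, and all names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def mvo_contributions(K):
    """Contributions minimising c'Kc; negative estimates are set to zero
    one at a time and the estimation is repeated without that breed."""
    active = list(range(len(K)))
    while True:
        sub = [[K[i][j] for j in active] for i in active]
        x = solve(sub, [1.0] * len(active))
        total = sum(x)
        c = [v / total for v in x]
        worst = min(range(len(c)), key=lambda k: c[k])
        if c[worst] >= 0.0:
            break
        del active[worst]  # drop the breed with the most negative estimate
    full = [0.0] * len(K)
    for idx, value in zip(active, c):
        full[idx] = value
    return full


def mean_contributions(kinship_replicates):
    """Average the contribution vectors over bootstrap replicates."""
    vectors = [mvo_contributions(K) for K in kinship_replicates]
    n = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]


# Stand-ins for kinship estimates from two bootstrap samples (three breeds).
K_REP_A = [[0.20, 0.05, 0.15], [0.05, 0.20, 0.15], [0.15, 0.15, 0.30]]
K_REP_B = [[0.20, 0.05, 0.10], [0.05, 0.20, 0.10], [0.10, 0.10, 0.30]]

print(mvo_contributions(K_REP_A))               # third breed is set to zero
print(mean_contributions([K_REP_A, K_REP_B]))   # third breed small but positive
```

In this sketch the third breed receives a zero contribution in one replicate and a positive one in the other, so that its averaged contribution is small but positive.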
The higher accuracy of the MVT contribution estimates compared to the corresponding MVO estimates indicates that the MVT core set algorithm tends to be less sensitive to sampling errors in the estimated kinship matrix. The optimisation function of the MVT core set (formula (3)) probably shows a more pronounced extremum than the corresponding function of the MVO core set (formula (8)), which makes this extremum easier to find.

CONCLUSION
The introduced core set method (MVT) prioritises breeds for conservation by maximising the total genetic variance for a hypothetical quantitative trait in the core set. It was shown that the numeric results of this core set approach and of the MVO core set approach of Eding et al. [4] are to some extent similar. The differences were most clearly shown by the results of the field data analysis. The MVT core set approach suggests the conservation of breeds that show comparatively large differences in the respective population mean of a hypothetical quantitative trait. This maximises the speed of achieving selection response for a putative changed breeding objective, which makes the MVT core set method attractive. For the estimation of the core set contribution vectors from molecular marker data, we recommend the use of the weighted bootstrap approach (WBM), because this method produced the most accurate estimates, regardless of the core set method.
the Netherlands, for the permission to use the field data, Herwin Eding for preparing the field data and Javier Cañón and David García for the Weitzman algorithm software. This study was prepared while J.B. was on leave at the Agriculture University of Norway in Aas. He was supported by a grant from the German Academic Exchange Service (Deutscher Akademischer Austauschdienst, DAAD).