Managing the risk of comparing estimated breeding values across flocks or herds through connectedness: a review and application

Comparing predicted breeding values (BV) among animals in different management units (e.g. flocks, herds) is challenging if units have different genetic means. Unbiased estimates of differences in BV may be obtained by assigning base animals to genetic groups according to their unit of origin, but units must be connected to estimate group effects. If many small groups exist, error of BV prediction may be increased. Alternatively, genetic groups can be excluded from the statistical model, which may bias BV predictions. If adequate genetic connections exist among units, bias is reduced. Several measures of connectedness have been proposed, but their relationships to potential bias in BV predictions are not well defined. This study compares alternative strategies to connect small units and assesses the ability of different connectedness statistics to quantify potential bias in BV prediction. Connections established using common sires across units were most effective in reducing bias. The coefficient of determination of the mean difference in predicted BV was a perfect indicator of potential bias remaining when comparing individuals in separate units. However, this measure is difficult to calculate; correlated measures such as prediction errors of differences in unit means and correlations among prediction errors are suggested as practical alternatives.


INTRODUCTION
Best linear unbiased prediction (BLUP) can be used to partition records of animal performance into genetic and environmental components [11]. Environmental influences including effects of management unit (flock or herd) are accounted for by fitting them as fixed effects, potentially allowing genetic merit of animals born in different management units to be equitably compared. However, additive genetic differences among management units will also be attributed to environmental differences unless there are sufficient genetic connections among the units. Connectedness, in a statistical sense, relates to the estimability of contrasts involving model fixed effects [37]; a data set is connected if all contrasts among fixed effects are estimable. However, connectedness is not required in order to predict random breeding values [4], and disconnected subsets of records do not lead to biased predictions of breeding values so long as breeding values of base animals (i.e., the animals present at the start of performance recording) are randomly and identically distributed across the entire population [41]. This assumption is violated, however, if selection or genetic drift occurs before pedigree and performance recording begin and causes genetic means of the units to differ.
The likelihood of observing differences in genetic means among units depends on the extent of gene flow. Sheep and beef cattle populations consist of many subpopulations, partially isolated by geographical distance and sources of purchased seedstock. Analyses of genetic differences among herds and flocks have suggested that genetic variance among units may be as large as that within units [3,24]. Similar results were reported in dairy cattle herds [39] before widespread use of artificial insemination (AI). Gene flow via AI can create extensive genetic connections among units, but AI is currently not widely used in sheep; fewer than 2% of ewes in the USA are bred using AI [28]. At the very least, there is greater risk of bias when comparisons are made across flocks and herds when connectedness is poor.
In order to manage bias when comparing animals across units, either the source of the bias must be incorporated into the genetic evaluation model or tools to quantify the risk of comparing animals across units must be established and used. Biases arising from differences in unit genetic means can be eliminated by predicting breeding values using models that include genetic groups [31]. These models estimate mean breeding values for base animals from each management unit, which accounts for mean differences among units. Genetic connections among units are still required, however, in order to simultaneously estimate genetic group and environmental unit effects with acceptable accuracy [15,40].
Several statistics have been developed to assess the quality of across-unit connections [6,17,20]. An ideal statistic would provide insights into the potential risk of incorrect selection decisions associated with biased breeding value predictions. Connectedness statistics could also be used to design breeding programs which would effectively link management units.
The objectives of this article are to: (1) review the incorporation of genetic groups into breeding value prediction models and discuss some of the problems with their implementation; (2) examine the importance of connectedness and weigh the merits of various statistics to quantify connectedness; and, (3) suggest methods that could be used to increase connectedness and, thereby, reduce bias when comparing animals from poorly connected subunits. The last objective is addressed using small sire-model examples, with particular reference to sheep breeding programs.

Genetic groups model
Genetic groups were initially used in sire evaluation to account for differences in mean breeding values of bulls owned by different AI bull studs when relationships among bulls owned by the different studs were not available [16]. Pollak and Quaas [30] added genetic groups of base animals to the mixed model equations for sire models [32] and calculated sire effects as a weighted sum of genetic group effects plus the animals' genetic deviations from this expectation. This method of grouping base animals was extended to animal models by Robinson [33] and Westell et al. [44].
The general linear mixed model including genetic groups [31] is:
$$y = Xb + ZQg + Zu + e \tag{1}$$
where $y$ is a vector of phenotypes, $b$ is a vector of fixed effects, $g$ is a vector of fixed base animal genetic group effects (due to unit of origin), $u$ is a vector of random genetic effects expressed as a deviation from the expectation of each animal's genetic group, and $e$ is a vector of residuals. Incidence matrices $X$ and $Z$ relate phenotypes to specific combinations of fixed and random genetic effects, respectively, and $Q$ specifies the expected proportion of genes in each animal arising from the various genetic groups. In $Q$, base animals have a 1 in the column corresponding to the group in which they originated and 0 otherwise; descendants of base animals have coefficients which sum to 1.0 and describe the fractional contribution of each genetic group to their ancestry. The assumed distribution of random effects in this model is:
$$\begin{bmatrix} u \\ e \end{bmatrix} \sim \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} A\sigma_a^2 & 0 \\ 0 & I\sigma_e^2 \end{bmatrix} \right) \tag{2}$$
where $A$ is the numerator relationship matrix, $\sigma_a^2$ is the additive genetic variance, and $\sigma_e^2$ is the residual variance. Estimates of $g$ and predictions of $u$ are obtained as solutions to the resulting mixed model equations:
$$\begin{bmatrix} Q'Z'MZQ & Q'Z'MZ \\ Z'MZQ & Z'MZ + A^{-1}\lambda \end{bmatrix} \begin{bmatrix} \hat{g} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} Q'Z'My \\ Z'My \end{bmatrix} \tag{3}$$
with $M = I - X(X'X)^{-}X'$ to adjust for (or absorb) fixed effects included in $b$, and where $\lambda$ is the variance ratio $\sigma_e^2/\sigma_a^2$. Predicted breeding values ($\hat{u}_G$) are functions of estimated genetic group effects and random predictions of breeding value deviations, so $\hat{u}_G = Q\hat{g} + \hat{u}$. Estimability of group differences depends on whether genetic groups are connected across levels of other fixed factors such as management unit, year or season.
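The construction of Q described above can be sketched in a few lines. This is a minimal illustration only: the pedigree, the animal labels and the function name `group_proportions` are hypothetical, and the sketch assumes every animal has either both parents known or both unknown (it does not handle the phantom-parent grouping needed when only one parent is missing).

```python
def group_proportions(pedigree, base_group, n_groups):
    """Return each animal's row of Q as a list of genetic group fractions.

    pedigree   : list of (animal, sire, dam), parents listed before offspring;
                 base animals have sire and dam set to None.
    base_group : dict mapping each base animal to its group index.
    """
    q = {}
    for animal, sire, dam in pedigree:
        if sire is None and dam is None:
            row = [0.0] * n_groups
            row[base_group[animal]] = 1.0  # base animal: all genes from one group
        else:
            # Descendant: average of the parents' group fractions.
            row = [0.5 * (s + d) for s, d in zip(q[sire], q[dam])]
        q[animal] = row
    return q


# Hypothetical pedigree: A originates in group 0; B and C originate in group 1.
pedigree = [("A", None, None), ("B", None, None), ("C", None, None),
            ("D", "A", "B"), ("E", "D", "C")]
Q = group_proportions(pedigree, {"A": 0, "B": 1, "C": 1}, n_groups=2)
print(Q["D"], Q["E"])
```

In this toy pedigree, D receives half of its genes from each group and E receives one quarter from group 0 and three quarters from group 1; every row sums to 1.0, as the text requires.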
Many genetic evaluations utilize the simpler linear mixed model:
$$y = Xb + Zu_r + e$$
where $y$, $b$, $e$, $X$, and $Z$ are as previously defined and the random breeding values, $u_r$, are no longer expressed as deviations from genetic group means. The distribution of random effects for this model is:
$$\begin{bmatrix} u_r \\ e \end{bmatrix} \sim \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} A\sigma_a^2 & 0 \\ 0 & I\sigma_e^2 \end{bmatrix} \right)$$
This model assumes that base animals are randomly sampled from a common population. The mixed model equations used to predict breeding values in this reduced model are:
$$(Z'MZ + A^{-1}\lambda)\,\hat{u}_r = Z'My \tag{4}$$
and do not account for fixed genetic differences in unit means. The error variance of genetic predictions in models that include or exclude genetic groups is of particular importance. In a model without groups, the prediction error variance (PEV) of predicted breeding values is a function of the inverse of the coefficient matrix:
$$\mathrm{PEV}(\hat{u}_r) = \mathrm{Var}(\hat{u}_r - u_r) = (Z'MZ + A^{-1}\lambda)^{-1}\sigma_e^2 = C^{uu}\sigma_e^2 \tag{5}$$
In a model with groups, breeding values are a function of both fixed genetic groups and random genetic deviations, and the PEV of breeding values is:
$$\mathrm{PEV}(\hat{u}_G) = \mathrm{Var}(Q\hat{g} + \hat{u} - Qg - u) \tag{6}$$
If
$$C = \begin{bmatrix} C_{11} & C_{12} \\ C_{12}' & C_{22} \end{bmatrix} \tag{7}$$
is a generalized inverse of the partitioned coefficient matrix in (3), then from [12] the PEV in (6) is:
$$\mathrm{PEV}(\hat{u}_G) = (QC_{11}Q' + QC_{12} + C_{12}'Q' + C_{22})\,\sigma_e^2 \tag{8}$$
The accuracy of estimation of fixed group effects can have a large impact on the accuracy of genetic evaluation. If the PEV of random breeding values as a deviation from group means ($C_{22}\sigma_e^2$) is of similar magnitude to the PEV of breeding values in a model excluding genetic groups (5), the PEV of the breeding values in a groups model is increased by $(QC_{11}Q' + C_{12}'Q' + QC_{12})\sigma_e^2$, where $C_{11}\sigma_e^2$ is the error variance of fixed group effects. Unlike the situation for random genetic deviations in which PEV must be less than the additive variance, the error variance of fixed genetic group effects may approach $\sigma_e^2$ when the number of observations per group is small or connections are poor. Accuracy of evaluation under this model thus strongly depends on the number of animals in each group and the connectedness among groups.

Comparison of models with and without genetic groups
In order to compare alternative models, Kennedy [15] derived the expectation of bias in sire evaluation when a model without genetic groups was used but group differences exist. The mean square error (MSE) from the model without genetic groups equals the sum of the squared bias and the PEV of the breeding values and could be compared to the MSE from a groups model (which is equal to the PEV since there is no bias) to determine which model is preferred for a given data set. When there were only two genetic groups, the model without genetic groups had lower MSE as long as the true genetic difference between groups was less than the standard error of the difference in group solutions from the groups model. These results cannot be extrapolated to data sets with more than two genetic groups, but this comparison can be used as a criterion for whether or not to include groups in genetic evaluation models.
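The MSE comparison described above reduces to simple arithmetic. The sketch below is illustrative only: the function names and the numerical values are hypothetical, and in practice the bias term is unknown because the true group difference is unknown.

```python
def mse_without_groups(bias: float, pev: float) -> float:
    """MSE of a prediction from the model without groups: squared bias plus PEV."""
    return bias ** 2 + pev


def preferred_model(bias: float, pev_without: float, pev_with: float) -> str:
    """Pick the model with the smaller MSE; the groups model is unbiased, so MSE = PEV."""
    mse_without = mse_without_groups(bias, pev_without)
    mse_with = pev_with
    return "without groups" if mse_without < mse_with else "with groups"


# Hypothetical values on the trait variance scale.
print(preferred_model(bias=0.1, pev_without=0.5, pev_with=0.6))  # small bias
print(preferred_model(bias=0.5, pev_without=0.5, pev_with=0.6))  # large bias
```

With a small bias the extra prediction error from estimating group effects outweighs the squared bias, so the model without groups has the lower MSE; with a large bias the ordering reverses.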
The bias from fitting a model without genetic groups when fixed genetic group effects exist was derived by Foulley et al. [6] for an animal model [44]. Using expectations in (2) and the vector of breeding values ($\hat{u}_r$) from (4):
$$E(\hat{u}_r) = (Z'MZ + A^{-1}\lambda)^{-1}Z'MZQg = C^{uu}Z'MZQg$$
The bias in $\hat{u}_r$ is then:
$$E(\hat{u}_r) - Qg = (C^{uu}Z'MZ - I)\,Qg = -\lambda C^{uu}A^{-1}Qg \tag{9}$$
The magnitude of this bias is a primary consideration when deciding whether to fit a groups model. If genetic groups are not fit, connectedness among units will reduce bias in EBV differences between animals born in different units, even if the units originally had different genetic means. The bias in EBV differences between animals in separate units will always be less than or equal (when units remain disconnected) to the original difference in unit genetic means. Fitting genetic groups eliminates bias but increases the prediction error of breeding values (8). Therefore, if the squared bias from fitting a model without groups is lower than the increased prediction error from fitting a model with groups, a model without genetic groups may be preferred. If minimization of MSE is the goal, both Tong et al. [40] and Kennedy [15] imply that connections between subunits or regions are critical to lower standard errors of group solutions or potential bias when genetic groups are not included in the model. Both studies recommend that genetic groups not be fit until connections have been established and suggest reciprocal semen exchange involving 25 to 50% of the matings in two management units for best results in a single generation. As stated by Foulley et al. [6] relative to genetic evaluation without groups, "the bias removal ability of a model cannot be discussed irrespective of the degree of connection".
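A small numerical sketch may help make the bias expression concrete. It assumes the bias of predictions from the model without groups can be written as minus lambda times the inverse coefficient matrix times the inverse relationship matrix times Qg, with lambda = (4 - h2)/h2 for a sire model. The design (two flocks, three unrelated sires, eight progeny), the group effects and the function name are all hypothetical.

```python
import numpy as np


def bias_without_groups(X, Z, A, Qg, lam):
    """Expected bias E(u_hat_r) - Qg when true group effects Qg are left out of the model."""
    n = X.shape[0]
    M = np.eye(n) - X @ np.linalg.pinv(X.T @ X) @ X.T  # absorbs the fixed unit effects
    A_inv = np.linalg.inv(A)
    C_uu = np.linalg.inv(Z.T @ M @ Z + lam * A_inv)    # inverse coefficient matrix
    return -lam * C_uu @ A_inv @ Qg


# Hypothetical records: sires 0 and 1 are homebred, sire 2 is used in both flocks.
flock = [0, 0, 0, 0, 1, 1, 1, 1]
sire = [0, 0, 2, 2, 1, 1, 2, 2]
X = np.eye(2)[flock]            # record-by-flock incidence matrix
Z = np.eye(3)[sire]             # record-by-sire incidence matrix
A = np.eye(3)                   # sires assumed unrelated
h2 = 0.25
lam = (4.0 - h2) / h2           # sire-model variance ratio (assumption)
Qg = np.array([0.5, -0.5, 0.0])  # group means on the sire-effect scale, one unit apart

bias = bias_without_groups(X, Z, A, Qg, lam)
bias_in_difference = bias[0] - bias[1]
print(bias_in_difference)
```

For this toy design the true difference between the two homebred sires' group means is one unit and the expected bias in their predicted difference is about -0.94 units, so very little of the difference is recovered with only two progeny per homebred sire.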
Bias and PEV of prediction under alternative models are both sensible criteria in choosing whether or not to fit genetic group effects, as long as base animals can properly be assigned to groups. Unfortunately, specifying the origin of base animals can be difficult, especially under extensive management with mainly natural-service matings. For instance, sheep flocks that participate in genetic evaluation programs often purchase sires from flocks that do not participate in the program, but in insufficient numbers to allow accurate estimation of genetic group effects for these flocks. Base animals purchased from other flocks may likewise be mistakenly assigned to the flock in which they first appear, and complex grouping strategies may lead to unsuspected confounding with other fixed effects such as birth year [31]. Given this issue, and the fact that connectedness is relatively low in most sheep flocks, genetic grouping by flock of origin is probably not a currently viable option in sheep evaluation in many countries. Instead, unbiased prediction of breeding values relies on the capacity of genetic connections among units to properly account for effects of flock of origin.
Similar complications in defining genetic groups can be envisioned in beef cattle evaluations; the extent would vary among breeds depending on the history of recording and the extent of AI. However, if programs for genetic evaluation in commercial herds [42] and across breeds [8] using data from crossbred animals expand, grouping issues will become more important. Current use of genetic groups in beef cattle evaluation generally consists of grouping base animals by time of entry into the data set to accommodate breed-wide genetic trends (Simmental; R.L. Quaas, personal communication) or grouping base animals according to their breed.

Comparing and using animals across subpopulations
Schemes to facilitate comparisons of animals from different management units generally involve either: (1) direct comparisons in central testing stations; or (2) use of reference sires across management units with subsequent statistical elimination of unit effects. In the USA, numbers of test stations have declined, and those that remain do not have capacity or opportunity to compare animals from all possible production units. Due to cost, differing environmental factors among stations (with possibility of genotype by environment interactions), and non-random selection of candidates, test stations alone cannot provide accurate comparisons of animals with different origins [1], and programs that promote germplasm exchange will be required.
A balance in progeny numbers among sires must be achieved when utilizing reference sires to provide connections among units. Reference sires must have enough progeny to permit accurate comparisons of animals across units, yet sires born within units must have enough progeny to allow an accurate progeny test. Large units have an advantage in that both of these goals can be more easily achieved [14]. If lowering the PEV of young sires produced within units is the main objective [14,36], use of reference sires in the breeding program may be counterproductive and will necessarily increase the PEV of the young sires [36].
Individual animal PEV is not a sufficient measure of risk in comparing animals across units and does not reflect potential bias in models that exclude genetic groups or increased error associated with fitting genetic groups. A better criterion to optimize numbers of progeny for reference and unproven sires in group breeding schemes is the PEV of comparisons between animals (or groups of animals) from different units. When the PEV of differences between units is used to indicate connectedness, large numbers of reference sire progeny (20 to 45% of the total number of progeny produced) were required to accurately compare animals across units [5,26].
Connectedness among flocks or herds allows producers to identify animals that are potentially better than their own. However, producers are often hesitant to purchase seedstock from other sources in order to establish connectedness, even though several studies have documented the benefits of increasing connectedness, primarily through group breeding schemes such as sire referencing schemes. These schemes involve an agreement by breeders to mate a predetermined portion of their females to a common set of selected males. Simulation studies have shown that cooperative sire referencing schemes can improve genetic gain by 30 to 35% compared to within-unit selection programs while also improving accuracy of comparisons between units and slowing inbreeding [10,23,34]. Genetic differences among units are not required for increased gain if the number of breeding females is low in some of the member units [23] because the likelihood of producing extreme individuals is smaller in small (< 100 female) units. Also, selection intensity can be dramatically increased by selecting animals across all flocks in the scheme.
Miraei Ashtiani and James [27] and Hanocq et al. [10] both showed that if management units differ in average genetic merit, the rate of genetic change increased most rapidly when the units first became connected as units with lower mean breeding values increased use of animals from units with higher mean breeding values. As a result, overall genetic gain in the system improved, with higher average gains in units with low initial genetic merit. Both studies suggest that average genetic gain across all units will subsequently slow over time as units become homogenized. However, if differences between units were minor and the units were large (> 300 progeny/yr), rates of gain were relatively unchanged by implementation of sire referencing schemes [27].
Smith and Banos [38] analytically predicted genetic responses from combined selection across and within units. Their results were in agreement with conclusions of simulation studies. If units are small, combined selection increases potential for genetic gain. If units differ in initial genetic mean, poorer units will catch up with better units after a few generations due to homogenization. These results assume that producers adopt a common breeding objective and continue to participate in the scheme once it becomes clear which flocks or herds are superior.
Table I. Examples of establishing whether 2-factor (management unit by sire) data designs are connected using a pathway method (boxes). Each cell contains a count of the number of progeny (n) of a sire i in unit j.

Measuring connectedness
For fixed effects, determination of connectedness involves assessing the estimability of linear functions of fixed effects in n-way crossclassifications [4,29,43]. For two factors, this may be achieved by "tracing" a perpendicular path between nonzero cells in a two-way table (e.g., the $X'Z$ matrix), as demonstrated in Table I. In the connected set, sire 3 is connected to sire 1 because each is directly compared to sire 2. However, in the disconnected set, sires 1 and 2 are neither directly nor indirectly compared to sires 3 and 4.
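The path-tracing idea can be automated by merging all sires that share a nonzero cell in the same unit. The sketch below uses a union-find structure, which is a substitute for tracing paths by eye rather than a method taken from the cited papers; the function name and the progeny counts are hypothetical.

```python
def connected_subsets(counts):
    """Group sires into connected subsets of a unit-by-sire table.

    counts[j][i] is the number of progeny of sire i in unit j.
    Returns a sorted list of subsets, using 1-based sire labels.
    """
    n_sires = len(counts[0])
    parent = list(range(n_sires))

    def find(i):
        # Follow parent links to the subset representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for row in counts:
        used = [i for i, n in enumerate(row) if n > 0]
        for i in used[1:]:
            parent[find(i)] = find(used[0])  # sires in the same unit are directly compared

    subsets = {}
    for i in range(n_sires):
        subsets.setdefault(find(i), []).append(i + 1)
    return sorted(subsets.values())


# Hypothetical designs: rows are units, columns are sires.
connected = [[5, 5, 0],
             [0, 5, 5]]
disconnected = [[5, 5, 0, 0],
                [0, 0, 5, 5]]
print(connected_subsets(connected))
print(connected_subsets(disconnected))
```

The first design yields a single subset because sires 1 and 3 are both compared with sire 2; the second yields two subsets, so contrasts between them are not estimable.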
Foulley et al. [6] were the first to develop a continuous, quantitative measure of connectedness. The authors' goal was to develop a measure ranging from 0 to 1 in which the two extremes represent either a completely balanced data set or one with at least two disconnected subsets. To measure connectedness in a vector of contrast coefficients, $x$, they proposed a connectedness index (IC; [22]):
$$\mathrm{IC}(x) = \frac{x'C_R x}{x'C_F x}$$
where $C_R$ is a portion of the inverse coefficient matrix in some "reduced model" (e.g., $C^{uu}$ in (5)) and $C_F$ is a portion of the inverse coefficient matrix for the same data from some "full model" (e.g., $C_{22}$ in (7)). The reduced model would be formed by removing some set of factors (e.g., groups) from the full model. Foulley et al. [6] hypothesized that more connected data sets are more nearly orthogonal. If two factors are orthogonal to each other, no bias is introduced by removing one of the factors from the model. If a random factor is completely orthogonal to a fixed factor removed from a reduced model, PEV is the same for the random effects in both the full and the reduced models, quadratic forms based on the full and reduced inverse coefficient matrices will be equivalent, and IC(x) will equal one. This statistic does not account for the amount of information (i.e., number of progeny records) in the analysis. In addition to IC(x), Foulley et al. [6,7] developed a statistic ($\gamma$) to measure connectedness for an entire design:
$$\gamma = \left( \frac{|C_R|}{|C_F|} \right)^{1/n}$$
where $n$ is the column rank of incidence matrices for $C_R$ and $C_F$. This ratio of determinants of inverse coefficient matrices of reduced and full models was developed using the Kullback-Leibler [19] distance between the joint density of the maximum likelihood estimators of all the effects in the full model and the product of the marginal densities of the effects removed from, and remaining in, the reduced model. 
If the marginal densities were orthonormal to one another, their product would be equal to the joint density of both sets of effects, and the Kullback-Leibler distance would be zero. Like IC(x), $\gamma$ equals 1 if effects removed from the full model are orthogonal to effects remaining in the reduced model. Foulley et al. [6,7] suggest evaluating $\gamma$ using the inverse coefficient matrix of genetic group effects in models with and without some set of nongenetic fixed effects (e.g., herd). This measure is undefined if some genetic group differences are not estimable since $C_F$ cannot be calculated. The value of $\gamma$ increases as cross-classification between groups and other fixed effects improves. While orthogonality of data is desirable, Laloë [20] argued that a measure of precision was more appropriate in determining whether animals could be compared across different units and proposed the coefficient of determination (CD) for a breeding value contrast vector ($x$) as a measure of precision:
$$\mathrm{CD}(x) = \frac{x'(A - \lambda C^{uu})x}{x'Ax} = 1 - \lambda\,\frac{x'C^{uu}x}{x'Ax} \tag{10}$$
where $C^{uu}$ is the random effects portion of the inverse coefficient matrix for a model without genetic groups (5). The CD of a contrast between animals or sets of animals in different management units would then provide a measure of their connectedness. Laloë [20] also developed two overall measures of connectedness using the ratio of quadratic forms in (10) and relating them to eigenvalues ($\mu_i$) and eigenvectors ($c_i$) resulting from the solutions of:
$$[(A - \lambda C^{uu}) - \mu_i A]\,c_i = 0$$
The number of eigenvalues is equal to the number of breeding values being predicted. The smallest eigenvalue will always be zero; other eigenvalues correspond to all possible independent contrasts. The proposed statistics, $\rho_1$ and $\rho_2$ [20,22], are functions of these eigenvalues. Like IC and $\gamma$, these statistics range from 0 to 1 with low values indicating low precision in comparing animals across fixed-effect classes. If more than one eigenvalue is zero, indicating that at least one contrast is uninformative, $\rho_2$ will be zero. 
These statistics have generally been applied to models without genetic groups [9,22]. The authors argue that as contrasts between animals in different units become more precise, the genetic mean difference between the units is better estimated. The vector of contrasts used in CD(x) could be the average of the breeding values in one unit minus the average of the breeding values in another unit; CD will be zero if the mean difference is not estimable. A third connectedness statistic based on the coefficient matrix was proposed by Kennedy and Trus [17]. They contended that the MSE of prediction of differences between candidates for selection was the most logical measure of connectedness. This MSE could be calculated from both a genetic groups model (PEV) and a model without groups if differences between genetic groups (PEV plus squared bias) are known, as discussed earlier in this article. If genetic group effects are negligible and therefore excluded from the model, the PEV of a contrast ($x$) is: $\mathrm{PEV}(x) = x'C^{uu}x\,\sigma_e^2$. Unlike other proposed measures, PEV(x) is not restricted in range, but is closely related to CD [20]. However, CD may be easier to interpret because it is restricted to a range of 0 to 1 and scaled by the change in the additive variance of the true breeding values ($x'Ax$) in the contrast due to relationships among animals involved in the contrast.
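As a hedged illustration of how CD(x) and PEV(x) relate, the sketch below evaluates both for a contrast between two homebred sires. It assumes a sire model with lambda = (4 - h2)/h2, CD(x) computed as one minus lambda times the ratio of the quadratic forms in the inverse coefficient matrix and in A, and a residual variance of 1. The progeny counts and the function name are hypothetical.

```python
import numpy as np

# Hypothetical progeny counts: rows are units, columns are sires (sire 3 is used in both units).
N = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0]])
A = np.eye(3)                    # sires assumed unrelated
h2 = 0.25
lam = (4.0 - h2) / h2            # sire-model variance ratio (assumption)

# Z'MZ after absorbing unit effects: sire totals minus the unit-adjusted cross-products.
ZMZ = np.diag(N.sum(axis=0)) - N.T @ np.diag(1.0 / N.sum(axis=1)) @ N
C_uu = np.linalg.inv(ZMZ + lam * np.linalg.inv(A))


def cd_and_pev(x, C_uu, A, lam, sigma2_e=1.0):
    """Coefficient of determination and prediction error variance of the contrast x."""
    q = float(x @ C_uu @ x)
    return 1.0 - lam * q / float(x @ A @ x), q * sigma2_e


x = np.array([1.0, -1.0, 0.0])   # homebred sire 1 minus homebred sire 2
cd, pev = cd_and_pev(x, C_uu, A, lam)
print(cd, pev)
```

Because the sires are unrelated here, the quadratic form in A is a constant and CD(x) is a linear function of PEV(x); with related animals in the contrast the two measures can diverge.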
These three sets of measures (IC or γ; CD or ρ 1 and ρ 2 ; and PEV(x)) are the primary theoretical connectedness statistics from the literature.
Laloë et al. [22] evaluated the merits of all three approaches for a model without genetic groups using analytical criteria to determine which had the most favorable properties for evaluating connectedness. Properties of IC and γ were evaluated when the reduced model included random genetic effects only and the full model included fixed effects of management unit. The authors defined connectedness between random effects by stating that a random factor is disconnected when at least one contrast between its levels (i.e., animal breeding values) has a null CD. Using a small example, they show that neither PEV(x) nor IC can exhibit complete disconnectedness. The PEV(x) approach gives different results than the CD method because the reduced variability in true breeding values due to relationships is not accounted for in the contrast. The authors state that PEV(x) can be thought of as a measure to test the null hypothesis that the contrast (x) is zero, while the CD measures the power in testing whether the contrast is different than zero. The γ statistic is never null when calculated as in this study. Both C R and C F are always positive definite because random effects are always estimable. The authors show that under certain data structures, γ is highest when there is a minimal amount of data and decreases as the number of progeny per sire increases. Values of γ and IC equal one when data are perfectly balanced. This situation may be desirable in early stages of genetic evaluation, but it is impossible to make genetic progress and maintain this balanced condition since every sire would have to be equally represented in every contemporary group. The CD measures, on the other hand, account for both the amount of information in the data and its structure. The authors caution that designing programs to increase connectedness by increasing CD or IC, or by lowering PEV(x), can decrease genetic progress due to lower selection intensity.
The conclusions of Laloë et al. [22] are helpful in evaluating connectedness statistics. The IC and γ statistics would necessarily indicate decreasing connectedness as selection occurs within the system since only a sample of individuals will be chosen as parents. Both favor balanced data, and may be useful in early stages of genetic evaluation when the objective is to compare genetic means of different management units by exchanging sires. The PEV(x) and CD methods can give different results when comparing animals between units if there are related animals in both units, but will probably lead to the same general conclusions regarding connectedness. In fact, all of these connectedness measures have been shown to be highly correlated in field data [13]. The overall connectedness measures (γ or ρ 1 and ρ 2 ) may be useful for group leaders or scientists overseeing genetic evaluation programs, but are of little use to individual producers who are trying to increase connectedness to other units in the system. Calculation of these statistics requires all elements of the inverse coefficient matrices and thereby requires extensive computing time for large-scale genetic evaluations.
Several alternative statistics have been proposed to decrease the computing time required to assess connectedness. Kennedy and Trus [17] suggested using the variance of differences between estimates of the environmental effects of management units (e.g., herds), which was highly correlated to the average PEV of differences between animals in different herds in a small example data set. Bunter and Macbeth [2] developed this idea further by evaluating the variance of estimated differences in genetic group effects when fitting a model that included genetic groups. Their extension to genetic groups is sensible given the relationship of the variance in group differences to the MSE mentioned by Kennedy [15], but it relies on fitting a model with genetic groups, which may be problematic given difficulties in assigning base animals to genetic groups.
Recognizing that the prediction error covariance (PEC) between two animals' predicted breeding values would be zero if they were not connected, Lewis et al. [24] proposed the correlation of breeding value prediction errors ($r_{ij}$) as a pairwise connectedness statistic:
$$r_{ij} = \frac{\mathrm{PEC}(\hat{u}_i, \hat{u}_j)}{\sqrt{\mathrm{PEV}(\hat{u}_i)\,\mathrm{PEV}(\hat{u}_j)}} \tag{11}$$
where $\hat{u}_i$ is the estimated breeding value of the ith animal. They suggested averaging this statistic for all pairs of animals in different management units to evaluate connectedness between units. Mathur et al. [25] proposed a similar correlation statistic, the connectedness rating (CR), to measure connectedness but replaced prediction error (co)variances of breeding values in (11) with error (co)variances of management unit estimates ($\hat{h}_i$):
$$\mathrm{CR}_{ij} = \frac{\mathrm{Cov}(\hat{h}_i, \hat{h}_j)}{\sqrt{\mathrm{Var}(\hat{h}_i)\,\mathrm{Var}(\hat{h}_j)}} \tag{12}$$
This measure was less dependent on herd size than the variance of the difference in herd effects of Kennedy and Trus [17]. Other connectedness measures involving counts of direct links between test station groups [35] or management units [41] have also been suggested, but the statistical properties of these measures are strongly dependent on data structure. It is difficult to determine which connectedness measure is most easily understood and useful to individual producers. In general, no level of sufficiency has been determined for connectedness statistics. The MSE [15] quantifies risk, but cannot be calculated in practice under a non-groups model because the bias due to potential differences in genetic means among units is unknown. Yet, it is the risk of bias due to these differences in genetic means that must be addressed when making across-unit selection decisions. This risk is a function of the magnitude of genetic differences among units and the capacity of the breeding design and analytical model to account for these differences. Thus, an optimal connectedness measure would allow producers to quantify the proportion of potential bias present in comparisons of predicted breeding values between sets of animals.
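Both correlation statistics can be read off the inverse of the full mixed model coefficient matrix when unit effects are kept in the equations rather than absorbed. The sketch below assumes a sire model with unrelated sires and lambda = (4 - h2)/h2; the progeny counts and the helper name `error_correlation` are hypothetical.

```python
import numpy as np

# Hypothetical progeny counts: rows are units, columns are sires (sire 3 is used in both units).
N = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0]])
n_units, n_sires = N.shape
lam = (4.0 - 0.25) / 0.25        # sire-model variance ratio for h2 = 0.25 (assumption)
A_inv = np.eye(n_sires)          # sires assumed unrelated

# Full coefficient matrix: unit equations first, then sire equations.
lhs = np.block([[np.diag(N.sum(axis=1)), N],
                [N.T, np.diag(N.sum(axis=0)) + lam * A_inv]])
C = np.linalg.inv(lhs)
C_hh = C[:n_units, :n_units]     # error (co)variances of unit estimates, up to sigma2_e
C_uu = C[n_units:, n_units:]     # prediction error (co)variances of sires, up to sigma2_e


def error_correlation(C_block, i, j):
    """Correlation between the errors of effects i and j from a block of the inverse."""
    return C_block[i, j] / np.sqrt(C_block[i, i] * C_block[j, j])


r_12 = error_correlation(C_uu, 0, 1)    # prediction error correlation of the homebred sires
cr_12 = error_correlation(C_hh, 0, 1)   # connectedness rating of the two units
print(r_12, cr_12)
```

The residual variance cancels in both ratios, so neither statistic depends on the scale of the trait. Both values are small in this toy design because each sire has very few progeny.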

METHODS
In order to identify strategies to establish connectedness and reduce potential biases in breeding value predictions between animals from different flocks (or herds), several small scenarios were developed. Each scenario involved two fixed management units with the goal of determining the best strategy to compare breeding values of sires originating within each group ("homebred" sires) and minimize bias in prediction of differences in their breeding values. Obviously, the most accurate comparison would be to progeny test homebred sires from each flock in the same contemporary group. However, producers may be more hesitant to exchange rams of unknown relative merit than to use some agreed-upon linking sire(s) in both units. Therefore, either a common reference sire or a pair of related sires was used to link flocks in a single generation. Each scenario was examined with two heritabilities (0.25, 0.5) and three different proportions of progeny produced from linking sires (50, 33, or 20%). The scenarios were: (1) Reference sire model with three sires (RS3) such that each flock used one homebred sire with a reference sire used in both flocks; (2) Reference sire model with five sires (RS5) such that each flock used two homebred sires (in equal proportion) with, again, a reference sire used in both flocks; (3) Full-sib model (FULL) with four sires where each flock used one homebred sire and one of a pair of full-sibling sires, and with the other full-sibling sire used exclusively in the other flock; (4) Half-sib model (HALF) with four sires, which is the same as FULL except that linking sires were half-sibs.
Although each scenario was compared with 50, 33, or 20% of progeny in each flock born from the linking sires, individual linking sires in the FULL and HALF strategies produced progeny in only one flock and had half as many total progeny across the two flocks as the linking reference sires in RS3 and RS5.
In each scenario, the number of progeny per flock varied from 0 (where all statistics were calculated as though there were no information through progeny records) to 100, inclusive of all possible values in between. To analytically derive expected bias and connectedness statistics, relevant matrices for each scenario were set up using mixed model equations under a sire model. Within each flock, all progeny were assumed to be evaluated in a single contemporary group. Bias associated with predictions of differences in average breeding values of homebred sires was calculated as a percentage using equation (9), adjusted for a sire model. Each of the two flocks was assumed to represent a different genetic group; the true breeding values of homebred sires thus included their flock's genetic group effect. Values for flock genetic group effects were chosen such that the bias when comparing sires across flocks was one unit when no progeny information was available from connecting sires. The reduction in bias with increasing progeny information could thus be expressed as a percentage decrease in bias. Sires were not assigned breeding values since no phenotypic information is required to derive bias in (9). Linking sires came from a third genetic group; empirically, the magnitude of the differences between this third group and the genetic groups of the homebred sires has no effect on bias in comparing homebred sires.
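The calculation described above can be sketched numerically. The following is a minimal illustration for scenario RS3, assuming a sire model with variance ratio λ = (4 − h²)/h², unrelated sires, and flock effects absorbed into the sire equations; it is not necessarily identical to equation (9). The function name, the argument names and the genetic group values are hypothetical and chosen only for illustration.

```python
import numpy as np

def remaining_bias_rs3(n, h2, p):
    """Fraction of the between-flock genetic difference left as bias (RS3).

    n  : progeny per flock, h2 : heritability,
    p  : proportion of progeny in each flock sired by the reference sire.
    """
    lam = (4.0 - h2) / h2  # assumed sire-model ratio sigma_e^2 / sigma_s^2
    # Columns: homebred sire of flock 1, homebred sire of flock 2, reference sire.
    counts = np.array([[(1 - p) * n, 0.0, p * n],    # progeny per sire, flock 1
                       [0.0, (1 - p) * n, p * n]])   # progeny per sire, flock 2
    # Z'MZ: sire information after absorbing the fixed flock effects.
    zmz = np.diag(counts.sum(axis=0))
    for row in counts:
        if row.sum() > 0:
            zmz = zmz - np.outer(row, row) / row.sum()
    c_uu = np.linalg.inv(zmz + lam * np.eye(3))      # sires unrelated, so A = I
    # Illustrative genetic group means: flock 1, flock 2, third group (reference sire).
    g = np.array([1.0, 0.0, 5.0])
    expected_ebv = c_uu @ zmz @ g   # E(u_hat) when group effects are not fitted
    x = np.array([1.0, -1.0, 0.0])  # contrast between the two homebred sires
    return 1.0 - (x @ expected_ebv) / (x @ g)
```

Under these assumptions, `remaining_bias_rs3(0, 0.25, 0.5)` returns 1 (no progeny information, all bias remains) and `remaining_bias_rs3(100, 0.25, 0.5)` returns 0.375. Changing the third entry of `g` leaves the result unchanged, consistent with the observation that the genetic group of the linking sire does not affect bias.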
Connectedness statistics for each scenario included the CD (10) of the mean difference in predicted breeding values of homebred sires, the connectedness correlation ($r_{ij}$) (11) of these mean differences, and the connectedness rating ($CR_{ij}$) (12) of the flock solutions. Homebred sires were unrelated, so PEV(x) was directly proportional to CD and was therefore not calculated. Connectedness measures were plotted against the percentage of remaining bias, as measured by equation (9), with respect to the number of progeny per flock. All these measures have the same range (0 to 1) and could therefore be compared on the same scale.
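As a rough illustration of how the three statistics can be read from the inverse of the mixed model coefficient matrix, the sketch below builds the full equations for scenario RS3 (flock effects fixed, sires random). It assumes CD(x) = 1 − λ x'C^{uu}x / x'Ax with λ = (4 − h²)/h², and takes the correlations from the sire and flock blocks of the inverse; the function and variable names are hypothetical.

```python
import numpy as np

def connectedness_rs3(n, h2, p):
    """Return (CD, r_ij, CR_ij) for RS3 with n > 0 progeny per flock."""
    lam = (4.0 - h2) / h2  # assumed sire-model ratio sigma_e^2 / sigma_s^2
    home, ref = (1 - p) * n, p * n
    # X'Z: rows are flocks, columns are sires (homebred 1, homebred 2, reference).
    xz = np.array([[home, 0.0, ref],
                   [0.0, home, ref]])
    xx = np.diag(xz.sum(axis=1))   # X'X: records per flock
    zz = np.diag(xz.sum(axis=0))   # Z'Z: records per sire
    a = np.eye(3)                  # sires unrelated
    lhs = np.block([[xx, xz],
                    [xz.T, zz + lam * np.linalg.inv(a)]])
    c = np.linalg.inv(lhs)         # error (co)variances in units of sigma_e^2
    c_bb, c_uu = c[:2, :2], c[2:, 2:]
    x = np.array([1.0, -1.0, 0.0])  # contrast between homebred sires
    cd = 1.0 - lam * (x @ c_uu @ x) / (x @ a @ x)
    r_ij = c_uu[0, 1] / np.sqrt(c_uu[0, 0] * c_uu[1, 1])   # correlation of prediction errors
    cr_ij = c_bb[0, 1] / np.sqrt(c_bb[0, 0] * c_bb[1, 1])  # correlation of flock-effect errors
    return cd, r_ij, cr_ij
```

For example, `connectedness_rs3(100, 0.25, 0.5)` gives a CD of 0.625 under these assumptions, with both correlations falling between 0 and 1. The full coefficient matrix is singular when a flock has no records, so this sketch requires n > 0.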

RESULTS
As expected, the percentage of bias remaining between breeding value predictions for homebred sires decreased at a decreasing rate as the number of progeny per group increased (Fig. 1). Across scenarios, higher heritability was associated with less bias at a fixed number of progeny.
Allocation of more progeny to reference sires also reduced the bias of comparisons between homebred sires (Tab. II). Although reductions in bias were markedly lower with 20% linking sire progeny, differences in bias reductions when 50 vs. 33% of progeny were from linking sires were minor. Reducing the number of progeny from homebred sires would, however, correspondingly increase the PEV of predicted differences in their breeding values. The design that maximizes the overall accuracy of comparison of homebred sires thus depends on the size of genetic differences between flocks.
For a single homebred sire in each flock, use of a common reference sire was more advantageous than use of related sires (Fig. 1). With infinite numbers of progeny, linkages arising from use of a single full-sibling (FULL) or half-sibling (HALF) pair of sires will reduce bias by at most 66.7 and 57.1%, respectively. In contrast, use of a common reference sire will result in eventual complete removal of bias, but only with large numbers of progeny. If the lines from Figure 1 are extended, using the same methodology (Eq. (9)), for scenarios when 50% of progeny are from reference sires (RS3), 540 and 252 progeny per flock are required to reduce bias by 90% at heritabilities of 0.25 and 0.50, respectively. Scenario RS5 required exactly twice as many progeny to reach the same level of bias reduction as RS3.
Figure 2 shows the relationship of each of the connectedness measures to the proportion of bias explained for scenarios RS3, FULL, and HALF. The number of progeny per flock was increased in units of 10, with 50% of progeny from linking sires for a trait with heritability 0.25. Within a scenario, connectedness measures were highly correlated (correlations greater than 0.95) with the proportional reduction in bias. However, only CD maintained this relationship with bias across scenarios, regardless of the heritability or the proportion of progeny from linking sires in each contemporary group. The same relationship also held for scenario RS5 as long as the CD was based on the mean difference between homebred sires in each flock (rather than the differences between individual homebred sires). The CD was a direct measure of the proportional reduction in bias; one minus CD equaled the proportional amount of the difference in genetic groups that persists as bias in EBV differences. In contrast, values of $r_{ij}$ and $CR_{ij}$ varied depending on the type of connection. In scenario HALF, $r_{ij}$ and $CR_{ij}$ were approximately 60 and 75% less than in RS3 at an equivalent level of bias.
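The limits and progeny numbers quoted above can be checked with a short numerical sketch. It assumes a sire model with λ = (4 − h²)/h², flock effects absorbed, and CD(x) = 1 − λ x'C^{uu}x / x'Ax, and it uses the reported one-to-one relationship between CD and bias reduction. The function names and the `a_link` argument (the additive relationship between the two sibling sires) are hypothetical.

```python
import numpy as np

def cd_two_flocks(n, h2, p, a_link=None):
    """CD of the homebred-sire contrast. a_link=None gives RS3;
    a_link=0.5 gives FULL and a_link=0.25 gives HALF."""
    lam = (4.0 - h2) / h2  # assumed sire-model ratio sigma_e^2 / sigma_s^2
    home, link = (1 - p) * n, p * n
    if a_link is None:     # one reference sire used in both flocks
        counts = np.array([[home, 0.0, link], [0.0, home, link]])
        a = np.eye(3)
        x = np.array([1.0, -1.0, 0.0])
    else:                  # one sibling sire used in each flock
        counts = np.array([[home, 0.0, link, 0.0], [0.0, home, 0.0, link]])
        a = np.eye(4)
        a[2, 3] = a[3, 2] = a_link
        x = np.array([1.0, -1.0, 0.0, 0.0])
    zmz = np.diag(counts.sum(axis=0))  # Z'Z, then absorb flock effects
    for row in counts:
        zmz = zmz - np.outer(row, row) / row.sum()
    c_uu = np.linalg.inv(zmz + lam * np.linalg.inv(a))
    return 1.0 - lam * (x @ c_uu @ x) / (x @ a @ x)

def progeny_needed(target, h2, p):
    """Smallest whole number of progeny per flock at which CD reaches target (RS3)."""
    n = 1
    while cd_two_flocks(n, h2, p) < target - 1e-9:  # tolerance for rounding error
        n += 1
    return n
```

With a very large flock size (for instance `cd_two_flocks(1e7, 0.25, 0.5, a_link=0.5)`), CD approaches 2/3 for FULL and 4/7 for HALF, in line with the 66.7 and 57.1% ceilings. Under the same assumptions, `progeny_needed(0.9, 0.25, 0.5)` and `progeny_needed(0.9, 0.5, 0.5)` return 540 and 252.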

DISCUSSION
In advising producers on methods to link their units in ways that will minimize risk associated with potentially different genetic means among units, our results clearly show that mating at least one-third of breeding females to common linking sires results in nearly optimum levels of bias reduction; allocating one-half of the flock to linking sires is probably not necessary. Results are somewhat sobering, however, in that even with 100 progeny per unit, bias remains at 37.5% or higher for a moderately heritable trait (0.25) with 50% of progeny from a common reference sire. Thus small producers would likely have to maintain the linking process across several years to be successful [18].
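The figures in this paragraph can be approximated by hand. As a sketch only, assume a sire model with unrelated homebred sires, a single reference sire producing a proportion $p$ of the $n$ progeny in each flock, and $\lambda = (4 - h^2)/h^2$ (notation introduced here for illustration). The contrast between the two homebred sires is then an eigenvector of the absorbed coefficient matrix, which gives

\[ \mathrm{CD} = \frac{n\,p(1-p)}{n\,p(1-p) + \lambda}, \qquad \text{bias remaining} = 1 - \mathrm{CD} = \frac{\lambda}{n\,p(1-p) + \lambda}. \]

For $h^2 = 0.25$ ($\lambda = 15$) and $n = 100$:

\[ p = \tfrac{1}{2}:\ \frac{15}{25 + 15} = 37.5\%, \qquad p = \tfrac{1}{3}:\ \frac{15}{22.2 + 15} \approx 40.3\%, \qquad p = \tfrac{1}{5}:\ \frac{15}{16 + 15} \approx 48.4\%. \]

Under these assumptions, moving from one-half to one-third of progeny from the reference sire costs little, whereas dropping to one-fifth leaves noticeably more bias.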
The scenarios FULL and HALF performed surprisingly well given the relatively low relationships between linking sires (0.50 and 0.25, respectively). At least part of this result was due to the siblings originating in the same genetic group; this assumption essentially increased their relationship beyond that predicted from their within-flock probability of identity by descent. Linking units through use of sibling sires is an option for very small flocks or herds, but if the total number of progeny is greater than 20 to 30, use of common reference sires has a distinct advantage. Also, when flocks are small, reduction in potential bias between flocks is difficult regardless of the linking strategy. Although the results were not shown, use of multiple sets of siblings as linking sires improved the quality of connections between units. This option may be practical in some larger sheep flocks where AI and sire transport are limiting. A reference sire approach would still be preferable. Multiple reference sires offered no advantage in bias reduction over a single reference sire (other than reducing the risk of an infertile male). Although we considered only sire models, dam relationships and retained female progeny would also enhance connectedness but likely cannot substitute for direct sire linkages.
If producers wish to use multiple homebred sires (RS5 rather than RS3), the quality of the connections established by linking sires will suffer, and less potential bias in differences between means of homebred sires produced in different units will be explained. This somewhat unexpected result is probably caused by lower numbers of progeny from individual sires. Only half the number of progeny per homebred sire was produced in the RS5 scenarios relative to the RS3 scenarios. Predicted breeding values of individual homebred sires were therefore less accurate, resulting in less reduction in bias between animals from the two genetic groups of interest (flocks). Allocating higher proportions of females to linking sires and, if possible, increasing flock size are important when several homebred sires are to be compared among flocks. For a constant number of progeny per homebred sire, this relationship tends to remove effects of flock size on bias reduction since large flocks generally attempt to evaluate more homebred sires. In Figure 1, the bias remaining under the RS5 scenario (two homebred sires per flock) at a flock size of n is the same as that remaining under the RS3 scenario (one homebred sire per flock) at a flock size of 0.5n.
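The equivalence between RS5 at flock size n and RS3 at flock size 0.5n can be illustrated with a design-generic sketch in which any two-flock layout is described by a table of progeny counts. It assumes unrelated sires, a sire model with λ = (4 − h²)/h², a heritability of 0.25, and 50% of progeny from the reference sire; all names are hypothetical.

```python
import numpy as np

def cd_contrast(counts, x, h2):
    """CD of contrast x among unrelated sires.
    counts[f][s] is the number of progeny of sire s recorded in flock f."""
    lam = (4.0 - h2) / h2  # assumed sire-model ratio sigma_e^2 / sigma_s^2
    counts = np.asarray(counts, dtype=float)
    x = np.asarray(x, dtype=float)
    zmz = np.diag(counts.sum(axis=0))  # Z'Z, then absorb flock effects
    for row in counts:
        zmz = zmz - np.outer(row, row) / row.sum()
    c_uu = np.linalg.inv(zmz + lam * np.eye(counts.shape[1]))
    return 1.0 - lam * (x @ c_uu @ x) / (x @ x)  # A = I, so x'Ax = x'x

def cd_rs3(n, h2=0.25):
    # One homebred sire per flock, reference sire (last column) in both flocks.
    counts = [[n / 2, 0, n / 2],
              [0, n / 2, n / 2]]
    return cd_contrast(counts, [1, -1, 0], h2)

def cd_rs5(n, h2=0.25):
    # Two homebred sires per flock in equal proportion, plus the reference sire.
    counts = [[n / 4, n / 4, 0, 0, n / 2],
              [0, 0, n / 4, n / 4, n / 2]]
    # Contrast between the mean breeding values of homebred sires in each flock.
    return cd_contrast(counts, [0.5, 0.5, -0.5, -0.5, 0], h2)
```

Under these assumptions `cd_rs5(100)` equals `cd_rs3(50)`, and `cd_rs5(100)` is lower than `cd_rs3(100)`, matching the pattern described for Figure 1.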
When group effects are not fitted but fixed genetic differences are present, only CD had a consistent relationship with bias reduction across all scenarios tested. These results agree with the theoretical derivation of Laloë and Phocas [21]. The one-to-one relationship of CD with bias reduction is very desirable and relatively easy to explain to producers. Within a scenario, both $r_{ij}$ and $CR_{ij}$ were highly correlated with level of bias and increased monotonically as bias was reduced, but the values of these statistics associated with a given level of bias differed for different linking strategies.
The CD is difficult to calculate for routine genetic evaluation because of the storage and processing time required to obtain the inverse of the coefficient matrix and the (non-inverted) relationship matrix, so further development of connectedness measures should focus on measures that are highly correlated with CD. The PEV of a contrast of mean differences can be obtained using matrix absorption [24], has a strong relationship with CD, and is thus a potential alternative connectedness measure. The connectedness correlation ($r_{ij}$) varied proportionally less among scenarios than $CR_{ij}$ and also warrants further consideration to describe within-system changes in connectedness (for instance, over a period of years [18]).
Application of these results relies on producers' willingness to connect their flocks or herds in order to take advantage of their combined genetic resources. If producers establish links with other units but do not take advantage of the results, there is no value to establishing connections. Establishing connections takes effort, lowers accuracy (increases PEV) of comparisons of homebred sires, and potentially reduces selection intensity within individual units. However, increasing connectedness can result in several benefits to both the commercial and seedstock portions of the industry. In the short term, commercial and smaller seedstock producers can identify which flocks or herds produce animals with the highest merit, without management practices masking the differences. Overall, the whole industry can make more rapid change as a result. In the long term, if seedstock producers are willing to cooperate under a common breeding objective, they can achieve higher overall gains and overcome any losses in selection accuracy and intensity that may result from their establishing strong connectedness.