Inversion of a part of the numerator relationship matrix using pedigree information
- Pierre Faux^{1}Email author and
- Nicolas Gengler^{1}
https://doi.org/10.1186/1297-9686-45-45
© Faux and Gengler; licensee BioMed Central Ltd. 2013
Received: 25 March 2013
Accepted: 29 October 2013
Published: 6 December 2013
Abstract
Background
In recent theoretical developments, the information available (e.g. genotypes) divides the original population into two groups: animals with this information (selected animals) and animals without this information (excluded animals). These developments require inversion of the part of the pedigree-based numerator relationship matrix that describes the genetic covariance between selected animals (A_{22}). Our main objective was to propose and evaluate methodology that takes advantage of any potential sparsity in the inverse of A_{22} in order to reduce the computing time required for its inversion. This potential sparsity is brought out by searching the pedigree for dependencies between the selected animals. Jointly, we expected distant ancestors to provide relationship ties that increase the density of matrix A_{22} but that their effect on ${\mathbf{A}}_{{}_{22}}^{-1}$
might be minor. This hypothesis was also tested.
Methods
The inverse of A_{22} can be computed from the inverse of the triangular factor (T^{-1}) obtained by Cholesky root-free decomposition of A_{22}. We propose an algorithm that sets up the sparsity pattern of T^{-1} using pedigree information. This algorithm provides positions of the elements of T^{-1} worth to be computed (i.e. different from zero). A recursive computation of ${\mathbf{A}}_{{}_{22}}^{-1}$ is then achieved with or without information on the sparsity pattern and time required for each computation was recorded. For three numbers of selected animals (4000; 8000 and 12 000), A_{22} was computed using different pedigree extractions and the closeness of the resulting ${\mathbf{A}}_{{}_{22}}^{-1}$ to the inverse computed using the fully extracted pedigree was measured by an appropriate norm.
Results
The use of prior information on the sparsity of T^{-1} decreased the computing time for inversion by a factor of 1.73 on average. Computational issues and practical uses of the different algorithms were discussed. Cases involving more than 12 000 selected animals were considered. Inclusion of 10 generations was determined to be sufficient when computing A_{22}.
Conclusions
Depending on the size and structure of the selected sub-population, gains in time to compute ${\mathbf{A}}_{{}_{22}}^{-1}$ are possible and these gains may increase as the number of selected animals increases. Given the sequential nature of most computational steps, the proposed algorithm can benefit from optimization and may be convenient for genomic evaluations.
Keywords
Background
- (1)
a _{ ij } is the numerator relationship coefficient between two animals i and j among n, as defined by Wright [1];
- (2)
diagonal element a _{ ij } is equal to 1 + F _{ i }, where F _{ i } is the inbreeding coefficient [1] of animal i;
- (3)
A is non-singular and symmetric: for two animals i and j among n, a _{ ij } = a _{ ji }.
Because the numerator relationship matrix describes the additive similarity between animals, it is an important element explaining genetic (co)variances between animals and has numerous applications in the field of animal genetics, the most important one being its use in setting up the mixed model equations for estimation of breeding values [2].
- (1)
a sub-population composed of animals called “excluded” hereafter;
- (2)
a sub-population composed of animals called “selected” hereafter.
The four blocks include the relationships between excluded animals (A_{11}), between excluded and selected animals (A_{12} and A_{21}) and between selected animals (A_{22}).
Recent methodological developments in animal breeding require inversion of A_{22}, for example for genotyped animals in the context of genomic prediction using a single-step procedure [3–5]. Another example concerns external animals when integrating foreign information into a local genetic evaluation [6]. It is also noteworthy that the pedigree-based relationship matrix A_{22} and the genomic relationship matrix (G, [7]) show structural similarities: both matrices express polygenic/genomic similarities among animals inherited from ancestors that are not represented in these matrices. Thus, the present research on A_{22} can be extended to genomic relationships in G.
Based on the original work by Henderson [8] on inversion of A, a general framework for the inversion of relationship matrices follows (see Appendix 1). Henderson outlined a method that is based on the root-free factorization of A and showed the high sparsity of the inverse triangular factor of A. An efficient use of this sparsity then allows direct computation of A^{-1} as a sum of individual contributions based on a chronological reading of the pedigree. Applying partitioned matrix theory, van Arendonk et al. [9] gave a general expression for the sum of individual contributions outlined by Henderson [8]: an additional row/column in A leads to updating its inverse by increasing the order by 1 and by summing the square of a very sparse vector to A^{-1}. The very sparse vector is the corresponding row (below the diagonal) of the inverse triangular factor. All details on these developments are given in Appendix 1.
- (1)
inversion of A _{22} with an algorithm that uses the inverse triangular factor;
- (2)
development of an algorithm that uses pedigree information to find the positions of the non-zero elements (sparsity pattern) in the inverse triangular factor of A _{22};
- (3)
inversion of A _{22} with the algorithm of step (1) but restricting computations to the non-zero elements identified by the algorithm in step (2);
- (4)
assessment of the time reduction when computing the inverse as in step (3) instead of as in step (1);
- (5)
and evaluation of the effect of the number of generations in the pedigree used to compute A _{22}, in order to reduce density of the inverse triangular factor.
Methods
Blockwise inversion of A_{22}
where s is a scalar equal to m - y′Z^{- 1}y.
Similarly, as for A (see Appendix 1), there is a link between vector b and the root-free Cholesky factorization of A_{22} (A_{22} = TDT′), in that –b′ corresponds to the last row of the inverse triangular factor of A_{22} (T^{-1}).
Equation (3) shows that ${\mathbf{A}}_{22}^{-1}$ can be constructed recursively by adding a vector product to the previous result (Z^{-1}). This recursive construction of ${\mathbf{A}}_{22}^{-1}$ will be called “Algorithm A” and implies, from the second row to the last row, the computation of the whole vector b.
If an animal and its parents are all selected, vector b is as sparse as in the case of A, i.e. the only non-zero elements of b correspond to parents. Restricting computations to these elements, i.e. discarding computations involving elements that we know equal 0, results in saving computing time. Such a case is, however, highly trivial. In the next sections, we propose a method to deal with more complex cases.
Contribution of selected animals to relationships in A_{22}: characterizing the sparsity pattern of T^{-1}
Matrix A _{ 22 } for the example of Figure 1
C | F | G | I | J | K | L | |
---|---|---|---|---|---|---|---|
C | 1.00 | 0.25 | 0.25 | 0.25 | |||
F | 1.00 | 0.50 | 0.50 | ||||
G | 0.25 | 0.50 | 1.00 | 0.06 | 0.06 | 0.25 | 0.25 |
I | 0.06 | 1.00 | 0.06 | ||||
J | 0.25 | 0.06 | 1.00 | 0.06 | |||
K | 0.50 | 0.25 | 1.00 | ||||
L | 0.25 | 0.25 | 0.06 | 0.06 | 1.00 |
Inverse of the triangular factor (T ^{-1} ) of A _{ 22 } for the example of Figure 1
C | F | G | I | J | K | L | |
---|---|---|---|---|---|---|---|
C | 1.00 | ||||||
F | 1.00 | ||||||
G | -0.25 | -0.50 | 1.00 | ||||
I | 0.02 | 0.05 | -0.09 | 1.00 | |||
J | -0.25 | 1.00 | |||||
K | -0.50 | 1.00 | |||||
L | -0.18 | 0.13 | -0.27 | -0.05 | 1.00 |
- (i)
animal G has two known parents, E and F. Animal E is excluded; its parent C (grandparent of G) is thus required (T ^{- 1} _{ GC } ≠ 0) to explain the relationship between C and G (A _{22;CG} = 0.25).
- (ii)
animal K has one known parent, F, that is also selected. Any relationship that K shares with other selected animals is necessarily and only explained by F ($\forall X\ne F,{\mathbf{T}}_{\mathit{KX}}^{-1}=0$).
- (iii)
animal L has one known parent, E, that is excluded. Its selected halfsib (G) and the selected parent of G (F, which is unrelated to L) are required, among others, to explain any relationship that L shares with other selected animals.
Animals that are required to explain relationships of a given selected animal with other selected animals will hereafter be denoted as the contributors of this selected animal. Contributors of a selected animal can be found by an exhaustive search of selected animals that replace any excluded parent of the selected animal. Their determination uses the pedigree and returns which elements of b (and thereby of T^{-1}) are worth computing because they are expected to be non-zero. By subtraction, we obtain which elements are zeros, which is referred to as the “sparsity pattern” of T^{-1} in the following. In the next sub-section, we propose a heuristic algorithm that streamlines the determination of the sparsity pattern of T^{-1}. Similar methodologies [10, 11] have been developed for the triangular factor of a symmetric-positive definite matrix rather than the inverse of the triangular factor.
An algorithm to set up the sparsity pattern
Our proposed heuristic algorithm to set up the sparsity pattern of the inverse triangular factor of A_{22} (see pseudo-code below) requires two inputs: the pedigree (of length n_{0}, renumbered and ordered: parents precede progeny) and the subpopulation to which any animal belongs: excluded (population status is 1) or selected (population status is 2). The purpose of the algorithm is to complete two vectors of variable length for any animal i. The first vector (r_{(i)}) contains references to excluded parents of animal i. The second vector (c_{(i)}) contains selected contributors of animal i. The positions of non-zeros in the i-th row in T^{-1} (sparsity pattern) includes any position of the i-th row that is listed in c_{(i)}.
Initialize a vector x as the integer sequence from 1 to n_{0}.
- (0)
initialize two vectors c _{(i)} and r _{(i)} as empty vectors
- (1)
if the status of animal i is 2, then append element i to c _{(i)}; or else if the status of animal i is 1, append element i to r _{(i)}
- (2)
if the sire s of animal i is known and its status is 2, then append element s to c _{(i)}; or else if s is known but its status is 1, append vector r _{(s)} to r _{(i)}
- (3)
if the dam d of animal i is known and its status is 2, then append element d to c _{(i)}; or else if d is known but its status is 1, append vector r _{(d)} to r _{(i)}
- (4)if the status of animal i is 2 and the vector r _{(i)} is not empty, then:or else if the status of animal i is 1 or if the vector r _{(i)} is empty, do nothing.
- a.
Select all elements of x that are at positions given in r _{(i)}, remove duplicates and gather them in a temporary list t
- b.
for any element k in list t,
- i.
Append to c _{(i)} the elements of vector c _{(k)} not yet in c _{(i)}
- ii.
Select elements of x that are equal to k and replace them by i;
- a.
If the whole population was selected (i.e. A_{22} = A, every animal has status 2), it can be easily deduced from the algorithm that only the animal itself (in step (1)) and its known sire and dam (in steps (2) and (3)) would enter vector c_{(i)}. The corresponding T^{-1} would be highly sparse, as it is for A. This also means that if numerous parents are selected, then this algorithm is expected to run very fast.
An example of the use of this algorithm is given in the Results section.
Use of the sparsity pattern in blockwise inversion of A_{22}
The algorithm for blockwise inversion of A_{22} (Algorithm A, summarized in equation (3)) is modified to account for sparsity and will be called Algorithm B. For simplicity, we still consider the last selected animal (animal n). Algorithm B reduces computations to obtain b from y = Zb (equations 2 and 3) by three procedures, depending on the number (k) of elements in the corresponding vector c_{(n)} and the length of b (n-1).
The first procedure (called EMPTY) is used when k = 0 (c_{(n)} is empty). If so, only α is added to element ${\mathbf{A}}_{22,\mathit{nn}}^{-1}$. The value of α is just the inverse of A_{22,nn}.
Since (n-1) is the number of elements in b and k the number of elements in b_{ u }, k dot products (of (n-1)-long vectors) would be performed instead of (n-1) dot products (of (n-1)-long vectors).
Vector b_{ u } can be expressed in terms of b_{ v } (${\mathbf{b}}_{u}={\mathbf{Z}}_{\mathit{uu}}^{-1}{\mathbf{y}}_{u}-{\mathbf{Z}}_{\mathit{uu}}^{-1}{\mathbf{Z}}_{\mathit{uv}}{\mathbf{b}}_{v}$) and, since b_{ v } is a vector of zeros, it comes that computing b_{ u } shrinks to compute only ${\mathbf{Z}}_{\mathit{uu}}^{-1}{\mathbf{y}}_{u}$. In other words, the linear system Zb = y is replaced by a linear system of lower size Z_{ uu }b_{ u } = y_{ u }, and solving it is valuable only if the number of operations required to solve it is lower than the number of operations to achieve the product in procedure PROD. We chose the less expensive procedure (PROD or LS) by estimation of the number of expected floating-point multiplications.
Experimental design for tests on real populations
Statistics of the population used (dairy cows from Luxembourg)
Total number of animals | 387 499 |
Number of cows | 366 773 |
Number of bulls | 20 726 |
Number of animals by birth year class: | |
Before 1950 | 5441 |
From 1950 to 1974 | 24 577 |
From 1975 to 1999 | 229 016 |
From 2000 to 2012 | 128 465 |
Maximum number of generations of pedigree | 39 |
Average number of generations ^{ 1 } for animals in different birth year classes: | |
Before 1950 | 3.28 |
From 1950 to 1974 | 6.49 |
From 1975 to 1999 | 19.03 |
From 2000 to 2012 | 25.11 |
Pedigree completeness: number of animals with (% of the pedigree): | |
Both parents unknown: | 70 167 (18.1%) |
Dam known, sire unknown | 69 721 (18.0%) |
Sire known, dam unknown | 17 141 (4.4%) |
Both parents known | 230 470 (59.5%) |
Selected sub-populations of three sizes (4000, 8000 and 12 000 animals) were designed and are identified hereafter as the three size scenarios S4k, S8k and S12k. Animals of the selected sub-populations were randomly chosen from a pool of animals born after 1999 (128 465 animals) on the assumption that only recent animals could be of interest (those being genotyped or in production).
Deviations from the real inverse were measured by the following norm: $N=\raisebox{1ex}{$\mathit{tr}\left({\left({\mathbf{A}}_{22}^{\left(g\right)}-{\mathbf{A}}_{22}^{\left(f\right)}\right)}^{\prime}\left({\mathbf{A}}_{22}^{\left(g\right)}-{\mathbf{A}}_{22}^{\left(f\right)}\right)\right)$}\!\left/ \!\raisebox{-1ex}{$\mathit{tr}\left({\left({\mathbf{A}}_{22}^{\left(f\right)}\right)}^{\prime}{\mathbf{A}}_{22}^{\left(f\right)}\right)$}\right.$, where ${\mathbf{A}}_{22}^{\left(g\right)}$ is the inverse of A_{22} computed using g extracted generations and ${\mathbf{A}}_{22}^{\left(f\right)}$ is the real inverse. This norm can be interpreted as the average difference between the value of any element of ${\mathbf{A}}_{22}^{\left(g\right)}$ and its corresponding value in ${\mathbf{A}}_{22}^{\left(f\right)}$. The two matrices are equal when N is equal to 0.
Matrix A_{22} was computed in two steps. Inbreeding coefficients were first computed for each size scenario and number of extracted generations. The average inbreeding coefficient was never greater than 1.23 % and the greatest inbreeding coefficient was 44.53%. Matrix A_{22} was then computed using the method of Colleau [12].
Two test software programs
In order to evaluate potential gains in time when using Algorithm B instead of Algorithm A to invert A_{22}, we developed two test programs in Fortran 95. The programs were neither optimized for speed, nor parallelized. Therefore, all comparisons have to be interpreted as relative figures.
The first program applies the recursive construction of the inverse, as outlined in Algorithm A (equations (2) and (3)). Potential null entries in y are checked to avoid useless computations when performing product Z^{-1}y.
The second program restricts the same recursive construction of the inverse to non-zero elements by procedures EMPTY, PROD and LS. Potential null entries in y are also taken into account when performing the product Z^{ u }y (procedure PROD). The linear system Z_{ uu }b_{ u } = y_{ u } (procedure LS) is solved by factorization and by backward and forward substitutions.
For both programs, computing time was recorded using Fortran intrinsic subroutine CPU_TIME. For the program that uses Algorithm B, computing time includes the time required to determine the sparsity pattern. All computations and file storage were performed using double precision (15 digits). Each job was repeated 20 times on an Intel® Xeon® 64-bit processor (RAM: 8 Gb, cache size: 6 Mb, clock speed: 3 GHz).
Results
Characterizing the sparsity pattern: a numerical example
Renumbered pedigree for the example of Figure 1
Number | Sire | Dam | Status | |
---|---|---|---|---|
A | 1 | - | - | 1 |
B | 2 | - | - | 1 |
C | 3 | 1 | 2 | 2 |
D | 4 | - | - | 1 |
E | 5 | 3 | 4 | 1 |
F | 6 | - | - | 2 |
G | 7 | 6 | 5 | 2 |
H | 8 | - | 4 | 1 |
I | 9 | 8 | - | 2 |
J | 10 | 1 | - | 2 |
K | 11 | 6 | - | 2 |
L | 12 | - | 5 | 2 |
Population status of the animal is given in a 4th column: 1 for excluded, 2 for selected.
Animal 1. Status 1 and unknown parents. Thus, r_{(1)} = [1], c_{(1)} = [-] and x = x.
Animal 2. Status 1 and unknown parents. Thus, r_{(2)} = [2], c_{(2)} = [-] and x = x.
Animal 3. Status 2 and known parents (1 and 2; both status 1). Thus, c_{(3)} = [3] and r_{(3)} = [1, 2]. The list of elements of x that match r_{(3)} is [1, 2]. Then, c_{(3)} = [3, c_{(1)}, c_{(2)}] = [3] and any element of x equal to 1 or 2 is replaced by 3, returning x = [3, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12].
Animal 4. Status 1 and unknown parents. Thus, r_{(4)} = [4], c_{(4)} = [-] and x = x.
Animal 5. Status 1 and known parents (status 1 and 2). Thus, c_{(5)} = [3] and r_{(5)} = [5, r_{(4)}] = [5, 4]. No list to set up because animal has status 1; x = x.
Animal 6. Status 2 and unknown parents. Thus, r_{(6)} = [-], c_{(6)} = [6] and x = x.
Animal 7. Status 2 and known parents (status 1 and 2). Thus, c_{(7)} = [7, 6] and r_{(7)} = [r_{(5)}] = [5, 4]. The list of elements of x that match r_{(7)} is [4, 5]. Then, c_{(7)} = [7, 6, c_{(5)}, c_{(4)}] = [7, 6, 3] and any element of x equal to 5 or 4 is replaced by 7, returning x = [3, 3, 3, 7, 7, 6, 7, 8, 9, 10, 11, 12].
Animal 8. Status 1 and one known parent (status 1). Thus, r_{(8)} = [8, r_{(4)}] = [8, 4], c_{(8)} = [-] and x = x.
Animal 9. Status 2 and one known parent (status 1). Thus, c_{(9)} = [9] and r_{(9)} = [r_{(8)}] = [8, 4]. The list of elements of x that match r_{(9)} is [7, 8]. Then, c_{(9)} = [9, c_{(8)}, c_{(7)}] = [9, 7, 6, 3] and any element of x equal to 8 or 7 is replaced by 9, returning x = [3, 3, 3, 9, 9, 6, 9, 9, 9, 10, 11, 12]
Animal 10. Status 2 and one known parent (status 1). Thus, c_{(10)} = [10] and r_{(10)} = [r_{(1)}] = [1]. The list of elements of x that match r_{(10)} is [3]. Then, c_{(10)} = [10, c_{(3)}] = [10, 3] and any element of x equal to 3 is replaced by 10, returning x = [10, 10, 10, 9, 9, 6, 9, 9, 9, 10, 11, 12]
Animal 11. Animal has status 2 and has one known parent (status 2). Thus, c_{(11)} = [11, 6] and r_{(11)} = [-]. No list to set up because r_{(11)} is empty; x = x
Animal 12. Status 2 and one known parent (status 1). Thus, c_{(12)} = [12] and r_{(12)} = [r_{(5)}] = [5, 4]. The list of elements of x that match r_{(12)} is [9]. Then, c_{(12)} = [12, c_{(9)}] = [12, 9, 7, 6, 3] and any element of x equal to 9 is replaced by 12, returning x = [10, 10, 10, 12, 12, 12, 12, 12, 12, 10, 11, 12]
Effect of accounting for sparsity on CPU time for inversion of A_{22}
Effect of the number of extracted generations on accuracy of${\mathbf{A}}_{22}^{-1}$
Discussion
Computation time required by the algorithm to characterize the sparsity pattern
For the fully extracted population (after 23, 24 and 24 generations for scenarios S4k, S8k and S12k, respectively), there was a close-to-linear relationship between the size of the selected population and the elapsed CPU time (approximately 6 seconds for 4000 additional animals in the selected sub-population). The effective computational complexity of this algorithm is difficult to establish, however, because it mostly depends, first, on how the population was split (for instance, a selected sub-population that includes mainly a few lines or families would not contain that many excluded parents) and, secondly, on how the population is structured (depth of the pedigree, effective size of the population, average inbreeding). The embedded loop in the algorithm (step (4b) in the pseudo-code) is the main computational bottleneck and performs k iterations. In a population of n_{0} animals, if k is related to the two factors mentioned above (i.e. splitting and structure of the population), then the computing time required by the algorithm would behave as n_{0} · k, where k would be a case-specific factor. This agrees with the observations in Figure 5.
Memory requirements of the algorithm to characterize the sparsity pattern
For a population of n_{0} animals with n selected animals, vectors c_{(i)} and y_{(i)} have the greatest RAM requirements. In our implementation, vector y_{(i)} stores few elements (positions of excluded ancestors) for all animals (thus ~ n_{0} integers). For selected animals, vector c_{(i)} stores non-zero positions and includes approximately $n\xb7\left(n\xb7\overline{d}\right)$ integers, where $\overline{d}$ is the average density of T^{-1} (number of non-zeros in the lower part of T^{-1} averaged by line). For excluded animals, c_{(i)} accounts for potential selected ancestors, therefore including approximately $\left({n}_{0}-n\right)\xb7\overline{a}$ integers, where $\overline{a}$ is the average number of selected ancestors per excluded animal. Memory would thus be allocated for approximately ${n}^{2}\overline{d}+\left({n}_{0}-n\right)\xb7\overline{a}$ integers. None of these integers may be declared as 3-byte integers when n_{0} is lower than 2^{24} (i.e. when pedigree contains less than 16.77 millions of animals).
Use of the algorithm to characterize the sparsity pattern on greater populations
Populations extracted for different sets of selected animals
Number of selected animals | Size of the extracted population | Proportion of selected animals in the extracted population (%) |
---|---|---|
4 000 | 40 196 | 9.95 |
8 000 | 59 120 | 13.53 |
12 000 | 73 864 | 16.25 |
16 000 | 87 237 | 18.34 |
32 000 | 127 809 | 25.04 |
48 000 | 159 259 | 30.14 |
64 000 | 183 750 | 34.83 |
80 000 | 204 637 | 39.09 |
96 000 | 222 546 | 43.14 |
112 000 | 238 130 | 47.03 |
128 000 | 252 147 | 50.76 |
Computation time required by the algorithm for inversion of A_{22} using the sparsity pattern (Algorithm B)
Estimated computational complexity ^{ 1 } of Algorithm B
Procedure | Complexity for line i | Proportion | Complexity on n lines |
---|---|---|---|
EMPTY | 1 | p _{E} | p_{ E } · O(n) |
LS | O(k^{3}) + O(k^{2}) + O(k) | p _{L} | p_{ P } · [O(n · k^{3}) + O(n · k^{2}) + O(n · k)] |
PROD | O(k^{2}) + O(k · i) | p _{P} | p_{ P } · [O(n · k^{2}) + O(n^{2} · k)] |
Total | $\begin{array}{l}O\left({k}^{3}\right)+O\left({k}^{2}\right)\\ +O\left(k\xb7i\right)+O\left(k\right)\end{array}$ | 1 | $\begin{array}{l}{p}_{P}\xb7O\left({n}^{2}\xb7k\right)+{p}_{L}\xb7O(n\xb7{k}^{3})+({p}_{L}+{p}_{P})\xb7O(n\xb7{k}^{2})\\ +{p}_{L}\xb7O\left(n\xb7k\right)+{p}_{E}\xb7O\left(n\right)\end{array}$ |
The issue of numerical stability was also addressed. When using procedure PROD, the result of the previous iteration was used in the current iteration through α and b. Accumulating errors could lead to instabilities and/or divergences. However, in LS procedure, the result of the previous iteration does not affect the b that is computed. Choosing the LS procedure at regular intervals among iterations using the PROD procedure (see Figure 8) stops the accumulation of errors that could have resulted from continuously choosing the PROD procedure. Therefore, interlacing choices for both procedures is a good way to prevent numerical instability. Independence between iterations also allows procedure LS to parallelized.
Memory requirements of the algorithm for inversion of A_{22} using sparsity pattern (Algorithm B)
Algorithm B requires allocation of more than twice the RAM than Algorithm A because it cannot store the results of the inversion in the input matrix. This is due to procedure LS working on different parts of A_{22}. However, since elements that are required for LS are identified when determining the sparsity pattern, they could be stored separately in order to reduce the amount of RAM required. For that reason, sparsity patterns should be established prior to computation of A_{22} to determine which relationships are worth being computed.
Number of generations to extract
The depth of the pedigree to be used for instance in genetic evaluations, is still a question of debate, and often moderately deep pedigrees are used, especially when only recent data is analyzed.
Results in Figure 4 suggest that pedigree from a limited number of generations (5 to 10) is sufficient to compute ${\mathbf{A}}_{22}^{-1}$ with reasonable accuracy. The explanation is that distant ancestors do not greatly enhance a relationship. For instance, a common ancestor to animals i and j that enters the pedigree after g extracted generations and that is older than any selected animal, can only add up to 2^{- 2g} to the value of the relationship between i and j. In generation g, i and j can have a maximum of 2^{ g } common ancestors. Therefore, extracting an additional generation can increase the relationship between i and j by only up to δ = 2^{- g}. Regardless of the number of animals added to the pedigree when extracting generation 10, the maximum change brought to any relationship reduces to less than 0.001, which would have a minor effect on the inverse scale, as confirmed by Figure 4.
However, computing time required for determination of the sparsity pattern increases linearly after 10 generations (Figure 5). Thus, limiting extraction of pedigree to 10 generations appears to be a good balance between taking into account relationships due to distant ancestors and computing time. Applying a similar study to pedigree extractions for routine genetic evaluations would be meaningful and may lead us to consider extracting a number of generations instead of a birth year limit, which is current common practice.
Practical use in a genomic background
- (1)
Animals have to be listed by generation order and only animals younger than those already genotyped can be added because older animals may cause changes in the sparsity pattern. This could be easily implemented in a cattle breed such as Holstein, where only few animals are key ancestors of the breed.
- (2)
The resulting file may be large but this could be reduced by sparse storage approaches.
Meyer et al. [13] recently applied a similar methodology for computation of the inverse of the genomic relationship matrix (G): their methodology also updates the previous inverse of G, necessitating its storage on disk from an evaluation to the next one.
Secondly, when using a pedigree of only one extracted generation, which contains genotyped animals and their ungenotyped parents, inversion of A_{22} is even faster (Figure 6) and the inverse seems to be a reasonable approximation of ${\mathbf{A}}_{22}^{-1}$ computed with a full extracted pedigree (see Figure 4 and Discussion here above). Such a fair approximation of ${\mathbf{A}}_{22}^{-1}$ may be useful as a preconditioner to solve A_{22}x = v, for instance, as required in the iterative solution of MME of single-step genomic BLUP (best linear unbiased prediction) proposed by Legarra and Ducrocq [14].
Current limits
The algorithm to determine the sparsity pattern of the inverse triangular factor of A_{22} is obviously useful only in inversion algorithms that use the inverse triangular factor. For other inversion algorithms, the algorithm to determine the sparsity pattern should not be useful.
Inversion algorithms that use the inverse triangular factor are useful in certain cases (e.g., for updating an inverted matrix or for obtaining quick approximations), but they would be less efficient, in terms of computing time, for the single purpose of inversion. The time required by Algorithms A and B was compared with the time required by subroutine “dkmxhf.f90” (K. Meyer, University of New England, Australia), which is a regular and efficient inversion algorithm. For inversion of the three different orders of A_{22} (4000, 8000 and 12 000), computing times of dkmxhf.f90 were lower than computing times obtained with Algorithm A and similar to those obtained with Algorithm B (accounting for sparsity). For small numbers of extracted generations, computing times were slightly lower for Algorithm B than dkmxhf.f90, but were greater when greater numbers of generations were extracted. However, the computing speed of Algorithm B can benefit from several optimizations (e.g., parallelization of the LS procedure and use of specific libraries for matrix products).
For computational ease, a small population (less than 1 million animals) was used in this study. Gains in computing time have to be tested for other sizes of population. This study was also restricted to only one population by size scenario and used repetitions (20) of the algorithm on the same data. Use of a Holstein population may also be criticized because although the average computed inbreeding was never greater than 1.23%, such a population has few key ancestors. Having the key ancestors in the selected sub-population might avoid density, because they would be contributors of many other selected animals.
Conclusions
The determination of the sparsity pattern of T^{-1} using pedigree information is a prior step that allows gains in computing time for inversion based on the use of T^{-1}. This allowed the computing time for inversion of matrices of three different sizes (4000, 8000 and 12 000 selected animals) to be reduced by a factor 1.73 on average. Gains in computing time are expected to be higher if the number of selected animals exceeds 80 000. Memory requirements for inversion of such a matrix would increase and the algorithm would become numerically more stable, since the LS procedure would become more important than the PROD procedure. Moreover, computation of the inverse by a recursive method may be very helpful in the case of genomic prediction, where a new batch of younger selected animals at each upcoming evaluation must be added to the previous inverse matrix already computed.
The results on the number of pedigree generations required for the selected animals suggest that no more than 14 generations should be extracted. If the working precision is less than 15 digits, this can even be reduced. A good balance between computing time for determination of the sparsity pattern and accuracy may be achieved with 10 extracted generations.
Appendix
Appendix 1: Inversion of the numerator relationship matrix using the inverse triangular factor
- (1)
the positions and values of non-zero elements in b _{(i)}, i.e. the structure of T ^{-1};
- (2)
some elements of the original matrix, to compute d _{ ii } as ${a}_{\mathit{ii}}-{\mathbf{b}}_{\left(i\right)}^{\mathbf{\prime}}{\mathbf{A}}_{\left(i-1\right)}{\mathbf{b}}_{\left(i\right)}$.
After meeting these requirements (determination of the structure of the inverse triangular factor and computation of some elements of the original matrix), the same framework was extended to the inversion of other relationship matrices used in animal breeding (e.g. gametic relationship matrix [17], dominance [18] and epistasis [19] effects or covariance matrix of marked QTL effects [20]).
Declarations
Acknowledgements
Luxembourg breeders association CONVIS is gratefully acknowledged for providing the pedigree data used in this study. PF acknowledges the following peoples and institution: Fonds National de la Recherche Luxembourg (FNR) for funding through AFR grant, I Misztal for hosting at Animal and Dairy Sciences Department of University of Georgia and for providing computing facilities, G Gorjanc for hosting in Animal Science Department of University of Ljubljana, J Vandenplas for helpful discussions, C Charles and L Lecharlier for mathematical advising, C Bastin, JFDumasy and D. TeCroney for editing help, and several members of the Numerical Genetics, Genomics and Modeling group of Gembloux Agro-Bio Tech for testing the algorithm. Comments and edits by two anonymous reviewers and by editors are gratefully acknowledged. The authors acknowledge the financial support of the Ministry of Agriculture of the Walloon Region of Belgium through the project “DairySNP” (D31-1274/S1 and D31-1274/S2) and the collaboration with Walloon Breeding Association.
Authors’ Affiliations
References
- Wright S: Coefficients of inbreeding and relationship. American Nat. 1922, 56: 330-338. 10.1086/279872.View ArticleGoogle Scholar
- Henderson CR: Sire evaluation and genetic trends. Proceedings of the Animal Breeding and Genetics Symposium in Honour of JL Lush. Edited by: American Society of Animal Science and American Dairy Science Association Champaign. 1973, Champaign, IL: , 10-41.Google Scholar
- Misztal I, Legarra A, Aguilar I: Computing procedures for genetic evaluation including phenotypic, full pedigree, and genomic information. J Dairy Sci. 2009, 92: 4648-4655. 10.3168/jds.2009-2064.View ArticlePubMedGoogle Scholar
- Christensen OF, Lund MS: Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010, 42: 2-10.1186/1297-9686-42-2.PubMed CentralView ArticlePubMedGoogle Scholar
- Gengler N, Nieuwhof G, Konstantinov K, Goddard M: Alternative single-step type genomic prediction equations. Book of Abstracts of the 63rd Annual Meeting of the European Association of Animal Production: 27-31 August 2012. 2012, Bratislava Wageningen: Wageningen Academic Publishers, 131-Google Scholar
- Vandenplas J, Gengler N: Comparisons and improvements of different Bayesian procedures to integrate external information into genetic evaluations. J Dairy Sci. 2012, 95: 1513-1526. 10.3168/jds.2011-4322.View ArticlePubMedGoogle Scholar
- VanRaden PM: Efficient methods to compute genomic predictions. J Dairy Sci. 2008, 91: 4414-4423. 10.3168/jds.2007-0980.View ArticlePubMedGoogle Scholar
- Henderson CR: A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics. 1976, 32: 69-83. 10.2307/2529339.View ArticleGoogle Scholar
- van Arendonk JAV, Tier B, Kinghorn BP: Use of multiple genetic markers in prediction of breeding values. Genetics. 1994, 137: 319-329.PubMed CentralPubMedGoogle Scholar
- Gilbert JR: Predicting structure in sparse matrix computations. SIAM J Matrix Anal Appl. 1994, 15: 62-79. 10.1137/S0895479887139455.View ArticleGoogle Scholar
- George A, Liu JWH: An optimal algorithm for symbolic factorization of symmetric matrices. SIAM J Comput. 1980, 9: 583-593. 10.1137/0209044.View ArticleGoogle Scholar
- Colleau JJ: An indirect approach to the extensive calculation of relationship coefficients. Genet Sel Evol. 2002, 34: 409-421. 10.1186/1297-9686-34-4-409.PubMed CentralView ArticlePubMedGoogle Scholar
- Meyer K, Tier B, Graser HU: Technical note: updating the inverse of the genomic relationship matrix. J Anim Sci. 2013, 91: 2583-2586. 10.2527/jas.2012-6056.View ArticlePubMedGoogle Scholar
- Legarra A, Ducrocq V: Computational strategies for national integration of phenotypic, genomic and pedigree data in a single-step best linear unbiased prediction. J Dairy Sci. 2012, 95: 4629-4645. 10.3168/jds.2011-4982.View ArticlePubMedGoogle Scholar
- Quaas RL: Computing the diagonal elements and inverse of a large numerator relationship matrix. Biometrics. 1976, 32: 949-953. 10.2307/2529279.View ArticleGoogle Scholar
- Emik LO, Terril CE: Systematic procedures for calculating inbreeding coefficients. J Hered. 1949, 40: 51-55.PubMedGoogle Scholar
- Schaeffer LR, Kennedy BW, Gibson JP: The inverse of the gametic relationship matrix. J Dairy Sci. 1989, 72: 1266-1272. 10.3168/jds.S0022-0302(89)79231-6.View ArticleGoogle Scholar
- Hoeschele I, VanRaden PM: Rapid inversion of dominance relationship matrices for noninbred populations by including sire by dam subclass effects. J Dairy Sci. 1991, 74: 557-569. 10.3168/jds.S0022-0302(91)78203-9.View ArticlePubMedGoogle Scholar
- VanRaden PM, Hoeschele I: Rapid inversion of additive by additive relationship matrices by including sire-dam combination effects. J Dairy Sci. 1991, 74: 570-579. 10.3168/jds.S0022-0302(91)78204-0.View ArticlePubMedGoogle Scholar
- Fernando RL, Grossman M: Marker assisted selection using best linear unbiased prediction. Genet Sel Evol. 1989, 21: 467-477. 10.1186/1297-9686-21-4-467.PubMed CentralView ArticleGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.