A meta-analytic assessment of a Thyroglobulin marker for marbling in beef cattle

A meta-analysis was undertaken reporting on the association between a polymorphism in the Thyroglobulin gene (TG5) and marbling in beef cattle. A Bayesian hierarchical model was adopted, with alternative representations assessed through sensitivity analysis. Based on the overall posterior means and posterior probabilities, there is substantial support for an additive association between the TG5 marker and marbling. The marker effect was also assessed across various breed groups, with each group displaying a high probability of positive association between the T allele and marbling. The WinBUGS program code used to simulate the model is included as an Appendix available online at .


INTRODUCTION
Marbling is the fat that is deposited between individual muscle fibres of the M. longissimus dorsi. Marbling and the distribution of intramuscular fat are economically important factors with respect to beef quality. A polymorphism in the 5' promoter region of the bovine Thyroglobulin gene (TG5) has been reported to be associated with variation in marbling [3]. The polymorphism is a C/T transition whereby cattle that are either homozygous or heterozygous for the Thymine (T) allele (i.e. TT or CT genotypes) appear to have higher marbling scores than cattle that are homozygous for the Cytosine (C) allele (i.e. CC genotypes).
A search of published and available unpublished literature revealed 14 independent studies that provide 19 estimates of the association between TG5 and marbling. The study-specific results generally support an association, but not all are individually convincing. Each study was based on a relatively small number of animals and four different measurements of marbling were used. It is then of interest to assess whether the accumulation of evidence provides stronger support for this association.
Meta-analysis provides a means of statistically combining study results [8]. A meta-analysis of comparable study estimates was undertaken using a Bayesian hierarchical model. Although meta-analysis has been used in human genetics [1,6,7,17], less attention has been paid to meta-analysis in the context of livestock genomics. Goffinet and Gerber [12] and Khatkar et al. [13] have presented methods for combining QTL results from independent studies, based on a modified Akaike criterion.
The dataset used for the meta-analysis is described in Section 2.1 and the statistical model is detailed in Section 2.2. Computational issues are addressed in Section 2.3 and a sensitivity analysis is described in Section 2.4. Results are reported in Section 3 and discussed in Section 5. Studies which could not be included in the meta-analysis are described in Section 4.

Dataset for meta-analysis
A literature review and communication with research groups identified 14 studies which provide estimates of the association between TG5 and marbling. In most studies the results were either reported directly as contrasts with corresponding standard errors or could be calculated in that form, so contrasts were the measure used in the meta-analysis. Details of the individual studies are summarised in Table I. Three studies reported measures of association which could not be represented as contrasts and so were excluded from the meta-analysis. These studies are discussed separately in Section 4. Study 1 was undertaken using a subgroup of the Angus cattle from a large dataset described in detail in [19]. The data for Study 6 were first analysed in [4]. The excluded studies were described in [22] and [5]. All other studies were unpublished technical reports provided by Genetic Solutions Pty. Ltd.
Each study assessed the degree of marbling using one of four methods. Three methods (AUS-MEAT, MSA, USDA) involved trained assessors scoring the chilled carcass against specific grading standards [21]. AUS-MEAT marble scores range from 0 to 6 in steps of 1 [2], Meat Standards Australia (MSA) marble scores range from 0.0 to 6.9 in increments of 0.1 [16] and United States Department of Agriculture (USDA) marble scores range from 100 to 1000 in increments of 10 [14]. The fourth method involves physically measuring the percentage of intramuscular fat (IMF) [18]. These four measurements of marbling are hereafter called traits. For each trait, a higher score indicates more marbling.
The studies included a variety of breeds, as listed in Table I. For the purposes of the meta-analysis, four distinct breed groups were defined: British/European, Wagyu Cross, and two Composites. This classification groups breeds with similar genetic origin and TG5 allele frequencies.
The composites were divided into two breed groups because the Alexandria Composite animals (study 4) were all offspring of heterozygous sires.
Two experimental herds (studies 1 and 4) provided the basis for seven of the 14 estimates, with different subsets of animals used to measure each trait. Because of the lack of specific information about the overlap or composition of these subsets, it was not feasible to incorporate corresponding covariances into the model.
For the purposes of the meta-analysis, phenotypic data were standardised to residual standard deviation units for each study. The reported estimates comprised genotypic information (effects and standard error estimates) and phenotypic information (marble scores and IMF). In order to combine the estimates in the meta-analysis, the results were represented as deviations of the observed CC and TT effects from the heterozygous (CT) effect, centred around zero. If the T allele is positively associated with marbling, we would expect to see a negative effect for CC versus CT and a positive effect for TT versus CT and for TT versus CC. The standardised data from all 11 studies used in the meta-analysis are summarised in Table I.
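The re-expression of study results as standardised contrasts can be sketched as follows. This is a minimal illustration that assumes each study supplies genotype means and a residual standard deviation; the function name and the input numbers are hypothetical and are not taken from Table I.

```python
def standardised_contrasts(mean_cc, mean_ct, mean_tt, residual_sd):
    """Express genotype means as contrasts in residual-SD units.

    CC/CT and TT/CT are deviations of the homozygotes from the
    heterozygote; TT/CC is the difference between the two homozygotes.
    """
    if residual_sd <= 0:
        raise ValueError("residual_sd must be positive")
    return {
        "CC/CT": (mean_cc - mean_ct) / residual_sd,
        "TT/CT": (mean_tt - mean_ct) / residual_sd,
        "TT/CC": (mean_tt - mean_cc) / residual_sd,
    }


# Hypothetical marble-score means and residual SD (illustration only).
example = standardised_contrasts(mean_cc=1.8, mean_ct=2.0, mean_tt=2.3, residual_sd=0.8)
for contrast, value in example.items():
    print(f"{contrast}: {value:+.3f}")
```

With these made-up inputs the CC/CT contrast is negative and the TT/CT and TT/CC contrasts are positive, which is the sign pattern expected if the T allele is positively associated with marbling.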

Meta-analysis model
A Bayesian hierarchical model was adopted for the meta-analysis. Separate models were fit for the effects CC/CT, TT/CT and TT/CC.
The effects are each taken to be normally distributed with corresponding parameters $(\mu_c, \xi_c)$, where the subscript $c$ can be $sb$, $b$ or $0$, meaning that the parameter describes the study within breed, breed or overall distribution, respectively. $\xi = 1/\sigma^2$ denotes the precision. Estimates $s_{isb}$ of the standard error $\sigma_{isb}$ are given in Table I.
In the absence of information about the covariance structure between study- and breed-specific estimates, the observed and prior precision matrices at each level of the hierarchy were assumed to be diagonal. The consequences of this assumption are discussed in Section 5. Each $\xi_c$ is assumed to have a chi-square distribution with degrees of freedom $\nu$ broadly set in line with the corresponding sample sizes. So for the effects CC/CT, TT/CT and TT/CC, the respective values are $\nu_{sb} = (200, 100, 130)$, $\nu_b = 10$, $\nu_0 = 3$.
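To see what a chi-square prior on a precision implies for the corresponding standard deviation, the short simulation below may help. It is a sketch only: it ignores the weights and scaling factor introduced next, and the function name, number of draws and seed are assumptions made for illustration. It uses the fact that a chi-square distribution with $\nu$ degrees of freedom is a gamma distribution with shape $\nu/2$ and scale 2.

```python
import random


def implied_sd_quantiles(df, n_draws=20000, seed=1):
    """Draw precisions from chi-square(df) and summarise sd = 1/sqrt(precision).

    Returns the 2.5%, 50% and 97.5% empirical quantiles of the implied sd.
    """
    rng = random.Random(seed)
    # chi-square(df) is Gamma(shape=df/2, scale=2)
    sds = sorted(1.0 / rng.gammavariate(df / 2.0, 2.0) ** 0.5 for _ in range(n_draws))

    def pick(q):
        return sds[int(q * (n_draws - 1))]

    return pick(0.025), pick(0.5), pick(0.975)


for df in (3, 10, 200):
    low, median, high = implied_sd_quantiles(df)
    print(f"df={df:>3}: sd 2.5%={low:.3f}, median={median:.3f}, 97.5%={high:.3f}")
```

Smaller degrees of freedom give a much more diffuse implied prior on the standard deviation, which is one way to read the choice of small $\nu$ at the upper levels of the hierarchy.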
The study-specific priors were adjusted to account for the informativeness of each study population. Assuming an additive gene action (no dominance), prior weights $\omega_s$ were computed for each study as the ratio of the genetic variance to that of a maximally informative population (one in which the C and T alleles have the same frequency, i.e. $p_T = 0.5$) [9]. Each study population was assumed to be in Hardy-Weinberg equilibrium. This yielded a study-specific weight vector $\omega_s = (0.751, 0.991, 0.923, 0.891, 0.875, 0.613, 0.805, 0.967, 0.216, 0.998, 0.911)$. In principle, similar weights could be used with the priors for the breed group effects, but in the absence of additional information on the breed groups, all breed group weights $\omega_b$ were set to 1. To preserve scale at each level of the hierarchical model, the weights at the study and breed levels were multiplied by $\alpha = 100$.
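One way to compute such a weight is sketched below. It assumes the standard result that, under Hardy-Weinberg equilibrium and purely additive gene action, the genetic variance is proportional to $2p(1-p)$, so the ratio to the maximum at $p = 0.5$ is $4p(1-p)$. The text does not give the exact formula used, so this form and the function names are assumptions.

```python
import math


def informativeness_weight(p_t):
    """Additive genetic variance at T-allele frequency p_t relative to p_t = 0.5.

    Assumes Hardy-Weinberg equilibrium and no dominance, so the variance is
    proportional to 2 * p * (1 - p) and the ratio reduces to 4 * p * (1 - p).
    """
    if not 0.0 <= p_t <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 4.0 * p_t * (1.0 - p_t)


def minor_allele_frequency(weight):
    """Invert the weight to recover the smaller of the two allele frequencies."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - math.sqrt(1.0 - weight)) / 2.0


print(informativeness_weight(0.25))   # a study with p_T = 0.25 (hypothetical)
print(minor_allele_frequency(0.216))  # frequency implied by the smallest listed weight
```

Under this assumed formula, the smallest weight in the vector (0.216) would correspond to a minor allele frequency of roughly 0.06, that is, a population in which one allele is rare.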
The full model is thus represented as follows: Figure 1 illustrates the structure of this hierarchical model.

Computation
The analysis was performed using Markov chain Monte Carlo (MCMC) through the Bayesian computation software WinBUGS [15,20]. The code for the main model, which assumes additive gene action, is listed in the Appendix (available online at www.edpsciences.org/gse). The first one hundred thousand iterations were discarded as burn-in and the following two hundred thousand iterations were used for parameter estimation. Convergence of the simulated Markov chains to the target posterior distributions was assessed using the diagnostics in WinBUGS and judged to be satisfactory.

Sensitivity analysis
A sensitivity analysis was performed to investigate the impact of the assumptions made in developing the model described in Section 2.2.
The first issue was the choice of prior distributions. In the absence of other distributional information, the adoption of normal prior distributions for the location parameters appears satisfactory. However, the use of chi-squared or more general gamma distributions to describe scale parameters has recently been called into question [10]. A suggested alternative is to use uniform priors on the standard deviations. In order to assess the impact of this particular choice, the model was re-analysed, replacing the chi-squared priors on the precisions with over-dispersed but proper uniform distributions on the standard deviations.
Another issue in prior modelling was the effect of using prior weights on studies which depend on the animals' genetic variance. To assess the effect of ignoring the genetic variance within studies, these weights were all made equal to unity.
In addition to the additive model, a second model was considered that assumes the T allele is recessive in its effect on marbling.
As described above, the AUS-MEAT, MSA and USDA marbling scores are all based on visual judgement of the amount of marbling present in the carcass, whereas the IMF score is based on a physical analysis. For this reason, the meta-analysis was repeated without the IMF-based estimates (numbered 13 and 14 in Tab. I).
The impact of small numbers of TT animals in some studies was also considered. Estimates 7 and 12 (see Tab. I) are based on just five and three TT animals, respectively. The meta-analysis was repeated without these two estimates. Estimate 12 was the only estimate from study 9 and the Composite 2 breed group, so both the study and the breed group were omitted from this analysis. The study-specific weight vector was consequently recalculated as $\omega_s = (0.751, 0.991, 0.923, 0.891, 0.875, 0.613, 0.805, 0.967, 0.998, 0.911)$.
Finally, the goodness of fit of the meta-analysis model was assessed using posterior predictive checks. Following Gelman et al. [11], the minimum, maximum, mean and standard deviation of the 14 effect estimates were compared against the model posterior densities of the same statistics. The model is considered adequate if the observed statistic lies within the body of the corresponding posterior predictive distribution.
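The mechanics of such a check can be sketched as follows. In the actual analysis the replicated data sets would be generated from the fitted model's MCMC draws; here they are simulated from a fixed normal distribution purely for illustration, and the observed values, function names and seed are all hypothetical.

```python
import random
import statistics


def summary_statistics(estimates):
    """Return the four test statistics used in the check."""
    return {
        "min": min(estimates),
        "max": max(estimates),
        "mean": statistics.mean(estimates),
        "sd": statistics.stdev(estimates),
    }


def tail_proportions(observed, replicated_sets):
    """Proportion of replicated data sets whose statistic is >= the observed one."""
    observed_stats = summary_statistics(observed)
    replicated_stats = [summary_statistics(rep) for rep in replicated_sets]
    return {
        name: sum(rep[name] >= observed_stats[name] for rep in replicated_stats)
        / len(replicated_stats)
        for name in observed_stats
    }


# Hypothetical observed contrasts and stand-in replicated data sets.
rng = random.Random(42)
observed = [-0.31, -0.22, -0.15, -0.12, -0.08, -0.02, 0.03, 0.10]
replicated_sets = [[rng.gauss(-0.1, 0.15) for _ in observed] for _ in range(2000)]

for name, proportion in tail_proportions(observed, replicated_sets).items():
    print(f"{name}: {proportion:.3f}")
```

Tail proportions close to 0 or 1 would indicate that the observed statistic lies in the extreme tail of its predictive distribution; values away from the extremes correspond to the observed statistic falling in the body of the distribution.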

RESULTS
As described in Section 2, for the purposes of comparability the CC/CT and TT/CT estimates are represented as zero-centred deviations of the homozygous (CC and TT) effects from the heterozygous (CT) effect. The posterior distributions of these effects for the additive case are depicted in Figure 2.
Posterior means, standard deviations and 95% credible intervals for the breed-specific and overall effects are shown in Table II for the additive case. The posterior means (s.d.) for the overall CC/CT, TT/CT and TT/CC effects were -0.117 (0.079), 0.091 (0.093) and 0.198 (0.100), respectively. The posterior probability that the overall CC/CT effect is less than zero was 0.935, and that the overall TT/CT and TT/CC effects are greater than zero were 0.854 and 0.973, respectively.
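As a rough plausibility check on these probabilities, one can approximate each posterior by a normal distribution with the reported mean and standard deviation. This is only an approximation introduced here for illustration: the reported probabilities come from the MCMC draws, and the posteriors need not be exactly normal.

```python
from statistics import NormalDist


def prob_below_zero(mean, sd):
    """P(effect < 0) under a normal approximation to the posterior."""
    return NormalDist(mean, sd).cdf(0.0)


def prob_above_zero(mean, sd):
    """P(effect > 0) under a normal approximation to the posterior."""
    return 1.0 - NormalDist(mean, sd).cdf(0.0)


# Posterior means and standard deviations reported in the text.
approx = {
    "CC/CT < 0": prob_below_zero(-0.117, 0.079),
    "TT/CT > 0": prob_above_zero(0.091, 0.093),
    "TT/CC > 0": prob_above_zero(0.198, 0.100),
}
for label, probability in approx.items():
    print(f"P({label}) ~ {probability:.3f}")
```

The approximation gives values of about 0.93, 0.84 and 0.98, close to but not identical with the reported 0.935, 0.854 and 0.973, as would be expected if the posteriors are only approximately normal.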
The Wagyu Cross (WC) breed group shows the greatest association between TG5 and marbling, giving a posterior probability of association of 0.998 for the TT/CC effect. A high degree of association was also found for the
The shrinkage of estimates at each level of the hierarchy is depicted in Figure 3, for each of the effects CC/CT, TT/CT and TT/CC. Figure 3 also shows the posterior 95% credible interval for the overall effect ($\mu_0$).
The results were very similar when a recessive model was assumed (see Tab. III). Since the overall posterior probability of an association was similar (ranging from 0.85 to 0.94) for both the CC/CT and TT/CT contrasts under both models, the results are consistent with an additive effect of the T allele.
As described in Section 2.4, the influence of various modelling decisions was assessed by analysing alternative models. Table IV shows the influence of this sensitivity analysis on the overall posterior mean, standard deviation, 95% credible interval and probability of positive association between marbling and the number of copies of the T allele for each breed group and overall. Comparison with the analogous values in Tables II and III shows that the alternative sensitivity models gave very similar results and that the largest change to any of the overall posterior probabilities of association was 6%.
The choice of distributional form for the priors on the scale parameters was not found to be influential since the alternative uniform priors induced negligible change in the posterior estimates of effects and precisions.
Finally, the overall goodness of fit of the model was assessed and, as demonstrated in Figure 4, found to be satisfactory. For the CC/CT, TT/CT and TT/CC contrasts, the posterior predictive distributions of the four test statistics all included the observed value of the statistic (indicated by a vertical line) in areas of reasonable probability.

OTHER STUDIES
Three studies reported the association between TG5 and marbling as least squares means. For reasons of comparability these were not included in this meta-analysis and are instead summarised and discussed below.
Thaller et al. [22] investigated the association between TG5 and IMF in 28 German Holsteins and separately in 27 Charolais animals. They reported significantly higher IMF values for TT genotypes against CC genotypes in German Holsteins. Their results suggested a recessive effect of the T allele on marbling, also found by Barendse et al. [4], but were based on small studies with just three German Holsteins and one Charolais having TT genotype.
Casas et al. [5] investigated the association between the TG5 marker and USDA marble score in a sample of 467 Brahman (Bos indicus) cattle. They found no association, but the sample included only 18 CT and 7 TT animals.

DISCUSSION
The meta-analysis of eleven independent association studies provides increased support for an association between the TG5 marker and marbling in beef cattle. The posterior means (s.d.) for the overall CC/CT, TT/CT and TT/CC effects were -0.117 (0.079), 0.091 (0.093) and 0.198 (0.100), respectively. The consistency of the sign of these effects under various assumptions further supports an association between TG5 and marbling.
Moreover, the corresponding probabilities that the effects are real (i.e. that the CC/CT effect is less than zero and the TT/CT and TT/CC effects are greater than zero) were 0.935, 0.854 and 0.973, respectively. These are sufficiently large to propose selecting animals based on their TG5 genotype to improve marbling in beef cattle. This association warrants further analysis, particularly of large samples of Bos indicus breeds, such as Brahman, which have low frequencies of the favourable T allele.
The sensitivity analysis showed that the posterior estimates were consistent despite changes in the assumptions underlying the model. Removing the estimates based on small numbers of animals of one genotype, or those measured on the IMF trait, tended to increase the posterior probability of association. For example, the posterior probability of association for the TT/CC effect with the Composite 1 breed group rose from 0.939 to 0.953 without the IMF estimates and to 0.952 without the small-sample estimates.
The graphical posterior predictive checks provided further confidence that the data do not contradict the model. The limitations of these checks in confirming the model are acknowledged, in that other reasonable models may also provide equally good fits and lead to different conclusions. Similarly, other representations of the data might be considered, such as vector descriptions of the four traits for each study. The corresponding multivariate analysis would require the estimation of substantial missing data; although this is straightforward in a Bayesian MCMC approach, the gain in interpretation is not immediately clear.
A Bayesian analysis allows one to make a variety of probabilistic statements. For example, a threshold value could be set for the overall effect below which there is no practical effect. The posterior probability of the effect being below this threshold can then be calculated. It is also easy to examine other posterior functions of interest, such as the probability distribution of the study means, the ranking and comparison of studies and breed groups, or the distribution of breed group ranks. See [20] for examples of these types of posterior summaries computed using WinBUGS.
The model presented here is sufficiently flexible to allow structural changes such as non-normal distributional assumptions, proportional representation of breed groups, and additional subgroups. Such changes can be accommodated through the distribution of the likelihood or priors, the weights $\omega_s$ and $\omega_b$, and the hierarchical structure, respectively. In the present analysis, there was insufficient information available to allow the pursuit of these features.
The breed groups were formed on the basis of average frequency of the T allele, but the breeds could be grouped in other ways. Similar groups could also be formed by considering typical time on feed for these breeds, which also affects marbling. Wagyu crosses are typically long-fed, British and European breeds are fed for less time and the Santa Gertrudis and Alexandria composite breeds are fed very briefly. Breed group weights could also be derived from factors such as the typical environment or finishing system (e.g. grass-fed or grain-fed) for the breeds included.
Although the model described in Section 2.2 conceptually allows for correlation between estimates within studies and breeds, in light of the lack of information about the size or strength of these correlations, the estimates were taken to be independent. A positive correlation structure would lead to an overstatement of the observed effects, but the degree of overstatement is difficult to assess without further data.
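A simple two-estimate case illustrates why ignoring positive correlation matters. The symbols $y_1$, $y_2$, $\sigma_1$, $\sigma_2$ and $\rho$ are introduced here for illustration only and do not appear in the model above. For two estimates with standard errors $\sigma_1$ and $\sigma_2$ and correlation $\rho$, the variance of their simple average is

```latex
\operatorname{Var}\!\left(\frac{y_1 + y_2}{2}\right)
  = \frac{\sigma_1^2 + \sigma_2^2 + 2\rho\,\sigma_1\sigma_2}{4}.
```

Treating the estimates as independent sets $\rho = 0$, so when the true $\rho$ is positive the combined estimate is assigned a variance that is too small and the evidence appears stronger than it is.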
One could consider a fixed effects analysis rather than the random effects approach described here. In the context of the present study, the random effects model seemed appropriate from both genetic and statistical perspectives. Alternatively, as discussed above, different models could be contemplated. For example, one could explicitly describe the probability of no effect via a mixture distribution, with one component being a Dirac delta function placed on the origin and the other a normal distribution.
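Such a mixture prior could be written, for a generic effect $\theta$, as below. The symbols $\theta$, $\pi$, $m$ and $\tau$ are assumptions introduced for this sketch: $\pi$ is the prior probability of no effect, $\delta_0$ is the point mass at zero, and $m$ and $\tau$ are the mean and precision of the normal component.

```latex
p(\theta) = \pi\,\delta_0(\theta) + (1 - \pi)\,\mathcal{N}\!\left(\theta \mid m,\ \tau^{-1}\right),
\qquad 0 \le \pi \le 1.
```

Under a prior of this form, the posterior probability of no effect is the posterior mass placed on $\theta = 0$, which could be estimated from MCMC output as the proportion of draws assigned to the point-mass component.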