Multivariate Bayesian analysis of Gaussian, right censored Gaussian, ordered categorical and binary traits using Gibbs sampling

A fully Bayesian analysis using Gibbs sampling and data augmentation in a multivariate model of Gaussian, right censored, and grouped Gaussian traits is described. The grouped Gaussian traits are either ordered categorical traits (with more than two categories) or binary traits, where the grouping is determined via thresholds on the underlying Gaussian scale, the liability scale. Allowances are made for unequal models, unknown covariance matrices and missing data. Having outlined the theory, strategies for implementation are reviewed. These include joint sampling of location parameters; efficient sampling from the fully conditional posterior distribution of augmented data, a multivariate truncated normal distribution; and sampling from the conditional inverse Wishart distribution, the fully conditional posterior distribution of the residual covariance matrix. Finally, a simulated dataset was analysed to illustrate the methodology. This paper concentrates on a model where residuals associated with liabilities of the binary traits are assumed to be independent. A Bayesian analysis using Gibbs sampling is outlined for the model where this assumption is relaxed.


INTRODUCTION
In a series of problems, it has been demonstrated that using the Gibbs sampler in conjunction with data augmentation makes it possible to obtain sampling-based estimates of analytically intractable features of posterior distributions. This paper concentrates on a model in which residuals associated with liabilities of the binary traits are assumed to be independent. Such an assumption may be reasonable when, for example, one group of animals is challenged with one pathogen and another group of animals with another pathogen; the two binary traits could be dead/alive three weeks after infection. (See e.g. [13] for a similar assumption in a bivariate analysis of two quantitative traits.) In other applications, and for a number of binary traits greater than one, however, the assumption of independence may be too restrictive. Therefore we also outline a Bayesian analysis using Gibbs sampling for the more general model where residuals associated with liabilities of the binary traits are correlated. (The two models differ only if the number of binary traits is greater than one.)
The outline of the paper is as follows: in Section 2, a fully Bayesian analysis of an arbitrary number of Gaussian, right censored Gaussian, ordered categorical and binary traits is presented for the particular case where all animals have observed values for all traits, i.e. no missing values. In Section 3, we extend the fully Bayesian analysis to allow for missing observations of the different traits. Strategies for implementation of the Gibbs sampler are given and/or reviewed in Section 4. These include univariate and joint sampling of location parameters; efficient sampling from a multivariate truncated normal distribution, necessary for sampling the augmented data; and sampling from an inverted Wishart distribution and from a conditional inverted Wishart distribution. Note that the conditional inverted Wishart distribution of the residual covariance matrix in the model assuming that residuals associated with liabilities of the binary traits are independent is different from the conditional inverted Wishart distribution in the model where this assumption has been relaxed (if the number of binary traits is greater than one). The methods presented for obtaining samples from the fully conditional posterior of the residual covariance matrix differ from the method presented in [29]. To illustrate the developed methodology, simulated data are analysed in Section 5, which also outlines a way of choosing suitable starting values for the Gibbs sampler. The paper ends with a conclusion in Section 6.

The sampling model
Assume that $m_1$ Gaussian traits, $m_2$ right censored Gaussian traits, $m_3$ categorical traits with response in multiple ordered categories and $m_4$ binary traits are observed on each animal; $m_i \ge 0$, $i = 1, \ldots, 4$. The total number of traits is $m = m_1 + m_2 + m_3 + m_4$. In general, the data on animal $i$ are $(\mathbf{y}_i, \boldsymbol{\delta}_i)$, $i = 1, \ldots, n$, where $\mathbf{y}_i = (y_{i1}, \ldots, y_{im_1}, y_{i,m_1+1}, \ldots, y_{i,m_1+m_2}, y_{i,m_1+m_2+1}, \ldots, y_{i,m_1+m_2+m_3}, y_{i,m-m_4+1}, \ldots, y_{im})$ and $\boldsymbol{\delta}_i$ is an $m_2$-dimensional vector of censoring indicators of the right censored Gaussian traits. The number of animals with records is $n$, and the data on all animals with records are $(\mathbf{y}, \boldsymbol{\delta})$.

The observed vector of Gaussian traits of animal $i$ is $(y_{i1}, \ldots, y_{im_1})$. For $j \in \{m_1+1, \ldots, m_1+m_2\}$, $y_{ij}$ is the observed value of $Y_{ij} = \min(U_{ij}, C_{ij})$, where $U_{ij}$ is normally distributed and $C_{ij}$ is the point of censoring of the $j$th trait of animal $i$. The censoring indicator $\delta_{ij}$ is one iff $U_{ij}$ is observed ($U_{ij} \le C_{ij}$) and zero otherwise. $\Delta_{0j}$ and $\Delta_{1j}$ will denote the sets of animals with $\delta_{ij}$ equal to zero and one, respectively, $j = m_1+1, \ldots, m_1+m_2$.

The observed vector of categorical traits with response in three or more categories is $(y_{i,m_1+m_2+1}, \ldots, y_{i,m_1+m_2+m_3})$. The outcome $y_{ij}$, $j \in \{m_1+m_2+1, \ldots, m_1+m_2+m_3\}$, is assumed to be determined by a grouping on an underlying Gaussian scale, the liability scale. The underlying Gaussian variable is $U_{ij}$, and the grouping is determined by threshold values; that is, $y_{ij} = k$ if and only if $\tau_{j,k-1} < U_{ij} \le \tau_{jk}$, $k = 1, \ldots, K_j$, where $-\infty = \tau_{j0} < \tau_{j1} < \cdots < \tau_{jK_j} = \infty$ are the thresholds of the $j$th trait. The observed vector of binary traits is $(y_{i,m_1+m_2+m_3+1}, \ldots, y_{im})$. As for the ordered categorical traits, the observed value of a binary trait is assumed to be determined by a grouping on an underlying Gaussian scale, with a single threshold.

Define $U_{ij} = Y_{ij}$ for $j = 1, \ldots, m_1$, that is, for the Gaussian traits, and let $\mathbf{U}_i = (U_{i1}, \ldots, U_{im})'$ be the vector of Gaussian traits observed or associated with the right censored Gaussian traits, ordered categorical traits and binary traits of animal $i$. Define $\mathbf{U} = (\mathbf{U}_i)_{i=1,\ldots,n}$ as the $nm$-dimensional column vector containing the $\mathbf{U}_i$s. It is assumed that:

$$\mathbf{U} \mid \mathbf{a}, \mathbf{b}, \mathbf{R} = \mathbf{r}, \mathbf{R}_{22} = \mathbf{I}_{m_4} \sim N_{nm}\left(\mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a},\; \mathbf{I}_n \otimes \begin{pmatrix} \mathbf{r}_{11} & \mathbf{r}_{12} \\ \mathbf{r}_{21} & \mathbf{I}_{m_4} \end{pmatrix}\right) \qquad (1)$$

where $\mathbf{b}$ is a $p$-dimensional vector of "fixed" effects. The vector $\mathbf{a}_i = (a_{i1}, \ldots, a_{im})'$ represents the additive genetic values of $\mathbf{U}_i$, $i = 1, \ldots, N$; $\mathbf{a} = (\mathbf{a}_i)_{i=1,\ldots,N}$ is the $Nm$-dimensional column vector containing the $\mathbf{a}_i$s. $N$ is the total number of animals in the pedigree, i.e. the dimension of the additive genetic relationship matrix, $\mathbf{A}$, is $N \times N$, and $\begin{pmatrix} \mathbf{r}_{11} & \mathbf{r}_{12} \\ \mathbf{r}_{21} & \mathbf{I}_{m_4} \end{pmatrix}$ is the residual covariance matrix of $\mathbf{U}_i$ in the conditional distribution given $\mathbf{a}$, $\mathbf{b}$, $\mathbf{R} = \mathbf{r}$, $\mathbf{R}_{22} = \mathbf{I}_{m_4}$. The usual condition $R_{kk} = 1$ (e.g. [5]) has been imposed in the conditional probit model of $Y_{ik}$ given $\mathbf{b}$ and $\mathbf{a}$, $k = m-m_4+1, \ldots, m$. Furthermore, it is assumed that liabilities of the binary traits are conditionally independent, given $\mathbf{b}$ and $\mathbf{a}$. Note that in this section we carefully distinguish between the random (matrix) variable, $\mathbf{R}$, and an outcome, $\mathbf{r}$, of the random (matrix) variable (contrary to the way in which e.g. $\mathbf{b}$ and $\mathbf{a}$ are treated).
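To make the liability-threshold representation just described concrete, the following minimal Python sketch maps a Gaussian liability to an ordered category and a binary outcome. The threshold values, the liability draws and the zero threshold for the binary trait are illustrative assumptions, not quantities taken from the example analysed later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical thresholds for one ordered categorical trait with K = 4 categories:
# tau_0 = -inf and tau_K = +inf; tau_1 = 0 and tau_{K-1} = 1 fixed, one free threshold at 0.4.
tau = np.array([-np.inf, 0.0, 0.4, 1.0, np.inf])

def categorize(liability, thresholds):
    """Return the category k (1, ..., K) such that tau_{k-1} < liability <= tau_k."""
    return int(np.searchsorted(thresholds, liability, side="left"))

u_cat = rng.normal(loc=0.5, scale=1.0)   # liability of the ordered categorical trait
u_bin = rng.normal(loc=-0.2, scale=1.0)  # liability of a binary trait (residual variance fixed at 1)

y_cat = categorize(u_cat, tau)           # observed ordered category
y_bin = int(u_bin > 0.0)                 # observed binary trait (threshold assumed at zero)
```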
With two or more binary traits included in the analysis, however, the assumption of independence between residuals associated with liabilities of the binary traits may be too restrictive. Therefore we also consider the model where it is assumed that:

$$\mathbf{U} \mid \mathbf{a}, \mathbf{b}, \mathbf{R} = \mathbf{r}, (R_{kk} = 1)_{k=m-m_4+1,\ldots,m} \sim N_{nm}\left(\mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a},\; \mathbf{I}_n \otimes \begin{pmatrix} \mathbf{r}_{11} & \mathbf{r}_{12} \\ \mathbf{r}_{21} & \tilde{\mathbf{r}}_{22} \end{pmatrix}\right) \qquad (2)$$

with $(\tilde{\mathbf{r}}_{22})_{kl} = (\mathbf{r}_{22})_{kl}$ for $k, l = m-m_4+1, \ldots, m$ with $k \ne l$, and $(\tilde{\mathbf{r}}_{22})_{kk} = 1$ for $k = m-m_4+1, \ldots, m$.
In the following, first, the model associated with (1) is treated; second, the necessary modifications related to the model in (2) are outlined.

Prior distribution
Let the elements of $\mathbf{b}$ be ordered so that the first $p_1$ elements are regression effects and the remaining $p_2 = p - p_1$ elements are "fixed" classification effects.
It is assumed, a priori, that $\mathbf{b} \mid \sigma^2_1, \sigma^2_2 \sim N_p\left(\mathbf{0}, \begin{pmatrix} \mathbf{I}_{p_1}\sigma^2_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{p_2}\sigma^2_2 \end{pmatrix}\right)$, where $\sigma^2_1$ and $\sigma^2_2$ are known (alternatively, it can be assumed that some elements of $\mathbf{b}$ follow a normal distribution and the remaining elements follow an improper uniform distribution). The a priori distribution of the additive genetic values is $\mathbf{a} \mid \mathbf{G} \sim N_{Nm}(\mathbf{0}, \mathbf{A} \otimes \mathbf{G})$, and $\mathbf{G}$ is assumed to follow an inverted Wishart distribution, $\mathbf{G} \sim IW_m(\boldsymbol{\Sigma}_G, f_G)$. Assuming, for the model associated with (1), that $\mathbf{R}$ follows an inverted Wishart distribution, $\mathbf{R} \sim IW_m(\boldsymbol{\Sigma}_R, f_R)$, the prior distribution of $\mathbf{R}$ in the conditional distribution given $\mathbf{R}_{22} = \mathbf{I}_{m_4}$ is the conditional inverted Wishart distribution. All of $\boldsymbol{\Sigma}_G$, $f_G$, $\boldsymbol{\Sigma}_R$ and $f_R$ are assumed known. A priori, it is assumed that the elements of $\boldsymbol{\tau}_j = (\tau_{j2}, \ldots, \tau_{jK_j-2})$ are distributed as order statistics from a uniform distribution on the interval $[\tau_{j1}; \tau_{jK_j-1}] = [0; 1]$, i.e.:

$$p\left(\tau_{j2}, \ldots, \tau_{jK_j-2}\right) \propto \mathbf{1}\left\{0 < \tau_{j2} < \cdots < \tau_{jK_j-2} < 1\right\}.$$

Concerning prior independence, the following assumption is made: (a) a priori, $\mathbf{b}$, $(\mathbf{a}, \mathbf{G})$, $\mathbf{R}$ and $\boldsymbol{\tau}_j$, $j = m_1+m_2+1, \ldots, m_1+m_2+m_3$, are mutually independent, and furthermore the elements of $\mathbf{b}$ are mutually independent.
In the model associated with (2), the prior assumptions were similar, except that, a priori, $\mathbf{R}$ conditional on $(R_{kk} = 1)_{k=m-m_4+1,\ldots,m}$ is assumed to follow a conditional inverse Wishart distribution (which for $m_4 > 1$ is different from the prior given in the model associated with (1)).

Joint posterior distribution
For each animal, the augmented variables are the $U_{ij}$s of right censored ($\delta_{ij} = 0$) Gaussian traits and the liabilities of ordered categorical and binary traits. The following notation will be used: $\mathbf{U}^{RC_0} = \{U_{ij} : i \in \Delta_{0j};\; j = m_1+1, \ldots, m_1+m_2\}$ is the set of $U_{ij}$s of the censored observations from the right censored Gaussian traits. $\mathbf{U}^{CAT}$ and $\mathbf{U}^{BIN}$ will denote the sets of liabilities of ordered categorical and binary traits, respectively. The following will be assumed concerning the censoring mechanism: (b) random censoring conditional on $\boldsymbol{\omega}$: $\mathbf{C}_i = (C_{i,m_1+1}, \ldots, C_{i,m_1+m_2})$, the $m_2$-dimensional random vector of censoring times of animal $i$, is stochastically independent of $\mathbf{U}$, given $\boldsymbol{\omega}$; (c) conditional on $\boldsymbol{\omega}$, censoring is noninformative on $\boldsymbol{\omega}$.
Having augmented with $\mathbf{U}^{RC_0}$, $\mathbf{U}^{CAT}$ and $\mathbf{U}^{BIN}$, it follows that the joint posterior distribution of parameters and augmented data is proportional to the product of the prior distribution of the parameters and the augmented likelihood. By assumption (a), the prior distribution of $\boldsymbol{\omega}$, conditional on $\mathbf{R}_{22} = \mathbf{I}_{m_4}$, is the product of the priors of $\mathbf{b}$, $(\mathbf{a}, \mathbf{G})$, $\mathbf{R} \mid \mathbf{R}_{22} = \mathbf{I}_{m_4}$ and the $\boldsymbol{\tau}_j$s given above. Let $\mathbf{x}_i$ ($m \times p$) and $\mathbf{z}_i$ ($m \times Nm$) be the submatrices of $\mathbf{X}$ and $\mathbf{Z}$ associated with animal $i$. Then, by assumptions (b) and (c), the augmented likelihood is given, up to proportionality, by a product over animals of multivariate normal densities of $\mathbf{U}_i$ with mean $\mathbf{x}_i\mathbf{b} + \mathbf{z}_i\mathbf{a}$, multiplied by indicator functions ensuring that the augmented liabilities and censored values are consistent with the observed data $(\mathbf{y}_i, \boldsymbol{\delta}_i)$. In the model associated with (2), the joint posterior is derived similarly, with obvious modifications.

Marginal posterior distributions, Gibbs sampling and fully conditional posterior distributions
From the joint posterior distribution of $\boldsymbol{\psi}$, the marginal posterior distribution of $\boldsymbol{\varphi}$, a single parameter or a subset of parameters of $\boldsymbol{\psi}$, can be obtained by integrating out all the other parameters, $\boldsymbol{\psi}_{\backslash\boldsymbol{\varphi}}$, including the augmented data. The notation $\boldsymbol{\psi}_{\backslash\boldsymbol{\varphi}}$ denotes $\boldsymbol{\psi}$ excluding $\boldsymbol{\varphi}$. Here, we wish to obtain samples from the joint posterior distribution of $\boldsymbol{\omega} = (\mathbf{b}, \mathbf{a}, \mathbf{G}, \mathbf{R}, \boldsymbol{\tau}_{m_1+m_2+1}, \ldots, \boldsymbol{\tau}_{m_1+m_2+m_3})$ conditional on $\mathbf{R}_{22} = \mathbf{I}_{m_4}$. One possible implementation of the Gibbs sampler is as follows. Given an arbitrary starting value $\boldsymbol{\psi}^{(0)}$, $(\mathbf{b}, \mathbf{a})^{(1)}$ is generated from the fully conditional posterior distribution of $(\mathbf{b}, \mathbf{a})$ given data $(\mathbf{y}, \boldsymbol{\delta})$, $\boldsymbol{\psi}_{\backslash(\mathbf{b},\mathbf{a})}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$. Superscript $(1)$ (and later $(t)$) refers to the sampling round of the implemented Gibbs sampler. Next, $\mathbf{G}^{(1)}$, $\mathbf{R}^{(1)}$, the augmented data and the thresholds are generated in turn from their fully conditional posterior distributions, each conditioning on the data and the most recently sampled values of all other elements of $\boldsymbol{\psi}$, and so on up to $\tau^{(1)}_{m_1+m_2+m_3, K_{m_1+m_2+m_3}-2}$, which is generated from the fully conditional posterior distribution of $\tau_{m_1+m_2+m_3, K_{m_1+m_2+m_3}-2}$ given data $(\mathbf{y}, \boldsymbol{\delta})$, $\boldsymbol{\psi}_{\backslash\tau_{m_1+m_2+m_3, K_{m_1+m_2+m_3}-2}}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$. This completes one cycle of the Gibbs sampler. Geman and Geman [10] showed that, under mild conditions, $\boldsymbol{\psi}^{(t)}$ after $t$ cycles ($t$ large) can be viewed as a sample from the joint posterior distribution of $\boldsymbol{\psi}$ conditional on $\mathbf{R}_{22} = \mathbf{I}_{m_4}$.
The fully conditional posterior distributions that define one possible implementation of the Gibbs sampler are the following. Let $\boldsymbol{\theta} = (\mathbf{b}', \mathbf{a}')'$ and $\mathbf{W} = (\mathbf{X}, \mathbf{Z})$. The fully conditional posterior distribution of $\boldsymbol{\theta}$ given data, $\boldsymbol{\psi}_{\backslash\boldsymbol{\theta}}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$ is multivariate normal with mean

$$\boldsymbol{\mu}_{\boldsymbol{\theta}} = \boldsymbol{\Lambda}_{\boldsymbol{\theta}}\, \mathbf{W}'\left(\mathbf{I}_n \otimes \mathbf{R}^{-1}\right)\mathbf{u} \qquad (3)$$

and covariance matrix

$$\boldsymbol{\Lambda}_{\boldsymbol{\theta}} = \left[\mathbf{W}'\left(\mathbf{I}_n \otimes \mathbf{R}^{-1}\right)\mathbf{W} + \begin{pmatrix} \boldsymbol{\Sigma}_b^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{A}^{-1} \otimes \mathbf{G}^{-1} \end{pmatrix}\right]^{-1}, \qquad (4)$$

where $\boldsymbol{\Sigma}_b$ is the prior covariance matrix of $\mathbf{b}$. The fully conditional posterior distribution of $\mathbf{G}$ given data and $\boldsymbol{\psi}_{\backslash\mathbf{G}}$ is an inverted Wishart distribution with scale matrix and degrees of freedom updated from $\boldsymbol{\Sigma}_G$ and $f_G$ by the additive genetic values and $\mathbf{A}^{-1}$. The fully conditional posterior distribution of $\mathbf{R}$ given data, $\boldsymbol{\psi}_{\backslash\mathbf{R}}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$ is obtained from the inverted Wishart distribution

$$\mathbf{R} \mid (\mathbf{y}, \boldsymbol{\delta}), \boldsymbol{\psi}_{\backslash\mathbf{R}} \sim IW_m\left(\boldsymbol{\Sigma}_R + \sum_{i=1}^{n} \mathbf{e}_i\mathbf{e}_i',\; f_R + n\right), \quad \mathbf{e}_i = \mathbf{u}_i - \mathbf{x}_i\mathbf{b} - \mathbf{z}_i\mathbf{a}, \qquad (5)$$

by conditioning on $\mathbf{R}_{22} = \mathbf{I}_{m_4}$.
The following notation will be used for the augmented data of animal $i$: $\mathbf{u}^{aug}_i$ denotes the vector of augmented $U_{ij}$s (censored right censored Gaussian traits and liabilities of ordered categorical and binary traits), and $\mathbf{u}^{obs}_i$ the vector of observed and uncensored Gaussian traits. The fully conditional posterior distribution of $\mathbf{U}^{aug}_i$ is a truncated multivariate normal distribution, truncated to the region (6) determined by the censoring points, the observed categories and the thresholds. The mean and variance of the corresponding normal distribution before truncation are given by

$$\mathbf{x}_{i(aug)}\mathbf{b} + \mathbf{z}_{i(aug)}\mathbf{a} + \mathbf{R}_{i(aug)(obs)}\mathbf{R}_{i(obs)}^{-1}\left(\mathbf{u}^{obs}_i - \mathbf{x}_{i(obs)}\mathbf{b} - \mathbf{z}_{i(obs)}\mathbf{a}\right) \qquad (7)$$

and

$$\mathbf{R}_{i(aug)} - \mathbf{R}_{i(aug)(obs)}\mathbf{R}_{i(obs)}^{-1}\mathbf{R}_{i(obs)(aug)}, \qquad (8)$$

respectively. $\mathbf{x}_{i(obs)}$ and $\mathbf{x}_{i(aug)}$ are the $n^{obs}_i \times p$ and $n^{aug}_i \times p$ dimensional submatrices of $\mathbf{x}_i$ containing the rows associated with observed and uncensored continuous traits, and those associated with the augmented data of animal $i$, respectively. Similar definitions are given for $\mathbf{z}_{i(obs)}$ and $\mathbf{z}_{i(aug)}$. The dimension of the vector of observed and uncensored Gaussian traits, $\mathbf{u}^{obs}_i$, is $n^{obs}_i$, and $\mathbf{R}_{i(aug)}$ is the part of $\mathbf{R}$ associated with the augmented data of animal $i$. Similar definitions are given for $\mathbf{R}_{i(aug)(obs)}$, $\mathbf{R}_{i(obs)}$ and $\mathbf{R}_{i(obs)(aug)}$.
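As an illustration of the conditional mean and variance (7) and (8) reconstructed above, the following sketch computes both quantities for one animal with the standard conditional multivariate-normal formulas. All arguments are assumed to be given as dense NumPy arrays partitioned as described in the text; this is an illustration, not the sparse implementation one would use in practice.

```python
import numpy as np

def augmented_conditional(b, a, x_obs, z_obs, x_aug, z_aug, u_obs,
                          R_obs, R_aug, R_aug_obs):
    """Mean (7) and covariance (8), before truncation, of the augmented liabilities
    of one animal, given its observed and uncensored Gaussian traits."""
    resid_obs = u_obs - x_obs @ b - z_obs @ a       # residuals of observed traits
    K = R_aug_obs @ np.linalg.inv(R_obs)            # regression on observed residuals
    mean = x_aug @ b + z_aug @ a + K @ resid_obs    # equation (7)
    cov = R_aug - K @ R_aug_obs.T                   # equation (8); R_(obs)(aug) = R_(aug)(obs)'
    return mean, cov
```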
The fully conditional posterior distribution of $\tau_{jk}$, for $k = 2, \ldots, K_j-2$, is uniform on the interval $\left[\max\left(\tau_{j,k-1},\, \max\{u_{ij} : y_{ij} = k\}\right);\; \min\left(\tau_{j,k+1},\, \min\{u_{ij} : y_{ij} = k+1\}\right)\right]$. Detailed derivations of the fully conditional posterior distributions can be found in, e.g., [15].
In the model associated with (2), the fully conditional posterior distribution of the residual covariance matrix is also conditional inverse Wishart distributed; however, the conditioning is on $(R_{kk} = 1)_{k=m-m_4+1,\ldots,m}$.

MODEL INCLUDING MISSING DATA
In this section, allowance is made for missing data. First, the notation is extended to deal with missing data. Let $\mathbf{J}(i) = (J_1(i), \ldots, J_m(i))$ be the vector of response indicator random variables on animal $i$, defined by $J_j(i) = 1$ if trait $j$ is observed on animal $i$ and $J_j(i) = 0$ otherwise. The data on animal $i$ consist of the observed Gaussian traits, the observed right censored Gaussian traits with their censoring indicators, and the observed categorical and binary traits. An animal with a record is now defined as an animal with at least one of the $m$ Gaussian, right censored Gaussian, ordered categorical or binary traits observed. For missing data, the idea of augmenting with residuals [32] is invoked. The vector $\mathbf{U}_i$ is partitioned into $\mathbf{U}_{i(obs)}$, $\mathbf{U}_{i(aug)}$ and $\mathbf{E}_{i(mis)}$, of dimensions $n^{obs}_i$, $n^{aug}_i$ and $n^{mis}_i$, respectively, with $m = n^{obs}_i + n^{aug}_i + n^{mis}_i$. $\mathbf{U}_{i(obs)}$ is associated with observed and uncensored Gaussian traits, $\mathbf{U}_{i(aug)}$ is associated with augmented data of observed but censored right censored Gaussian traits and observed ordered categorical and binary traits, and $\mathbf{E}_{i(mis)}$ is associated with residuals on the Gaussian scale of traits missing on animal $i$. The following will be assumed concerning the missing data pattern: (d) conditional on $\boldsymbol{\omega}$, data are missing at random, in the sense that $\mathbf{J}$ is stochastically independent of $(\mathbf{U}, \mathbf{C})$ conditional on $\boldsymbol{\omega}$; (e) conditional on $\boldsymbol{\omega}$, $\mathbf{J}$ is noninformative of $\boldsymbol{\omega}$.
Under assumptions (a)-(e), and having augmented with $\mathbf{U}_{i(aug)}$ and $\mathbf{E}_{i(mis)}$ for all animals (i.e. with $\mathbf{U}^{RC_0}$, $\mathbf{U}^{CAT}$, $\mathbf{U}^{BIN}$, $\mathbf{E}^{MIS}$), it follows that the joint posterior distribution of parameters and augmented data takes the same form as in the model without missing data, where those rows of $\mathbf{x}_i$ and $\mathbf{z}_i$ associated with missing data are set to zero, and where $u_{ij}$, for $j$ associated with missing data on animal $i$, is a residual, $e_{ij}$. Deriving the fully conditional posterior distributions defining a Gibbs sampler proceeds as in the model with no missing data, with modifications according to the missing data pattern. (This is also true for the model associated with (2).)
Further details related to the derivation of the fully conditional posterior distributions can be found in, e.g., [15].

STRATEGIES FOR IMPLEMENTATION OF THE GIBBS SAMPLER
Strategies for implementation are first outlined for the model associated with (1), for the case without missing data, and where, a priori, $\mathbf{b}$ conditional on $\sigma^2_1$ and $\sigma^2_2$ follows a multivariate normal distribution. The strategy is similar for the model associated with (2), except in obtaining samples from the fully conditional posterior of the residual covariance matrix.

Univariate sampling of location parameters
The fully conditional posterior distribution of $\boldsymbol{\theta}$ given data, $\boldsymbol{\psi}_{\backslash\boldsymbol{\theta}}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$ is $(p + Nm)$-dimensional multivariate normal with mean $\boldsymbol{\mu} = \boldsymbol{\mu}_{\boldsymbol{\theta}}$ and covariance matrix $\boldsymbol{\Lambda} = \boldsymbol{\Lambda}_{\boldsymbol{\theta}}$ given in (3) and (4), respectively. Let $\beta = (1, \ldots, i-1, i+1, \ldots, p+Nm)$ denote the indices other than $i$; then, using properties of the multivariate normal distribution and relationships between a matrix and its inverse, it follows that the fully conditional posterior distribution of each element of $\boldsymbol{\theta}$ is:

$$\theta_i \mid (\mathbf{y}, \boldsymbol{\delta}), \boldsymbol{\psi}_{\backslash\theta_i}, \mathbf{R}_{22} = \mathbf{I}_{m_4} \sim N_1\left(\frac{r_i - \mathbf{C}_{i\beta}\boldsymbol{\theta}_{\beta}}{C_{ii}},\; \frac{1}{C_{ii}}\right),$$

where $r_i$ is the $i$th element of $\mathbf{r} = \mathbf{W}'(\mathbf{I} \otimes \mathbf{R}^{-1})\mathbf{u}$ and $\mathbf{C} = \boldsymbol{\Lambda}^{-1}$ is the coefficient matrix of the mixed model equations given by $\mathbf{C}\boldsymbol{\mu} = \mathbf{r}$. The solution to these equations is $\boldsymbol{\mu} = \boldsymbol{\Lambda}\mathbf{r}$, and $\mathbf{C}_{i\beta}\boldsymbol{\theta}_{\beta} = \mathbf{C}_i\boldsymbol{\theta} - C_{ii}\theta_i$, where $\mathbf{C}_i$ is the $i$th row of the coefficient matrix and $C_{ii}$ is the $i$th diagonal element.
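A minimal sketch of one systematic scan of this single-site update, assuming the coefficient matrix C of the mixed model equations and the right-hand side r have already been formed as dense arrays (in practice C is sparse and would not be stored densely):

```python
import numpy as np

def single_site_gibbs_pass(theta, C, r, rng):
    """One systematic scan updating each location parameter from its univariate
    fully conditional N((r_i - C_{i,beta} theta_beta) / C_ii, 1 / C_ii).
    C is the MME coefficient matrix, r the right-hand side W'(I kron R^{-1}) u."""
    for i in range(theta.size):
        # C_{i,beta} theta_beta = C_i theta - C_ii theta_i (relation used in the text)
        offdiag = C[i] @ theta - C[i, i] * theta[i]
        cond_mean = (r[i] - offdiag) / C[i, i]
        cond_sd = np.sqrt(1.0 / C[i, i])
        theta[i] = rng.normal(cond_mean, cond_sd)
    return theta
```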

Joint sampling of location parameters
Sampling univariately from the fully conditional posterior distribution of each location parameter in turn may give poor mixing properties. García-Cortés and Sorensen [7] described a method to sample from the joint fully conditional posterior distribution of $\boldsymbol{\theta}$ given data, $\boldsymbol{\psi}_{\backslash\boldsymbol{\theta}}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$ that avoids inverting the coefficient matrix $\mathbf{C} = \boldsymbol{\Lambda}_{\boldsymbol{\theta}}^{-1}$ of the mixed model equations. The idea behind this joint sampling scheme is that a linear combination of normally distributed random variables is again normally distributed, and it proceeds as follows. Let $\mathbf{b}^*_1$, $\mathbf{b}^*_2$, $\mathbf{a}^*$ and $\mathbf{e}^*$ be sampled independently from $N_{p_1}(\mathbf{0}, \mathbf{I}_{p_1}\sigma^2_1)$, $N_{p_2}(\mathbf{0}, \mathbf{I}_{p_2}\sigma^2_2)$, $N_{Nm}(\mathbf{0}, \mathbf{A} \otimes \mathbf{G})$ and $N_{nm}(\mathbf{0}, \mathbf{I}_n \otimes \mathbf{R})$ distributions, respectively. Next let $\mathbf{b}^* = (\mathbf{b}^{*\prime}_1, \mathbf{b}^{*\prime}_2)'$ and $\boldsymbol{\theta}^* = (\mathbf{b}^{*\prime}, \mathbf{a}^{*\prime})'$, and define $\mathbf{u}^*$ as $\mathbf{W}\boldsymbol{\theta}^* + \mathbf{e}^*$. It then follows that the linear combination

$$\boldsymbol{\theta}^* + \boldsymbol{\Lambda}_{\boldsymbol{\theta}}\mathbf{W}'\left(\mathbf{I}_n \otimes \mathbf{R}^{-1}\right)(\mathbf{u} - \mathbf{u}^*)$$

is distributed according to the fully conditional posterior distribution of the location parameters, $\boldsymbol{\theta}$, given data and $\boldsymbol{\psi}_{\backslash\boldsymbol{\theta}}$. That is, having sampled $\boldsymbol{\theta}^*$ and $\mathbf{e}^*$, $\tilde{\boldsymbol{\theta}} = \boldsymbol{\Lambda}_{\boldsymbol{\theta}}\mathbf{W}'(\mathbf{I}_n \otimes \mathbf{R}^{-1})(\mathbf{u} - \mathbf{u}^*)$ can be found by solving the set of mixed model equations $\mathbf{C}\tilde{\boldsymbol{\theta}} = \mathbf{W}'(\mathbf{I}_n \otimes \mathbf{R}^{-1})(\mathbf{u} - \mathbf{u}^*)$. Finally, $\boldsymbol{\theta}^*$ is added to $\tilde{\boldsymbol{\theta}}$, and the resulting value, $\boldsymbol{\theta}^* + \tilde{\boldsymbol{\theta}}$, is a sampled vector from the fully conditional posterior distribution of $\boldsymbol{\theta}$ given data, $\boldsymbol{\psi}_{\backslash\boldsymbol{\theta}}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$.
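The following sketch illustrates the joint sampling device under the stated assumption of a proper normal prior on b. Sigma_theta (the prior covariance of theta) and the dense matrix algebra are simplifications for illustration, not the sparse equation solver one would use in practice.

```python
import numpy as np

def joint_sample_location(W, u, Sigma_theta, R_big, rng):
    """One joint draw of theta = (b, a) from its fully conditional posterior,
    following the Garcia-Cortes and Sorensen device.
    Sigma_theta : prior covariance of theta (block diagonal: prior of b and A kron G)
    R_big       : residual covariance of u, i.e. I_n kron R."""
    R_inv_big = np.linalg.inv(R_big)
    C = W.T @ R_inv_big @ W + np.linalg.inv(Sigma_theta)   # MME coefficient matrix
    # 1. Sample theta* and e* from their priors and form pseudo-data u* = W theta* + e*.
    theta_star = rng.multivariate_normal(np.zeros(Sigma_theta.shape[0]), Sigma_theta)
    e_star = rng.multivariate_normal(np.zeros(R_big.shape[0]), R_big)
    u_star = W @ theta_star + e_star
    # 2. Solve C theta_tilde = W' (I_n kron R^{-1}) (u - u*).
    theta_tilde = np.linalg.solve(C, W.T @ (R_inv_big @ (u - u_star)))
    # 3. theta* + theta_tilde is a draw from the fully conditional posterior of theta.
    return theta_star + theta_tilde
```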

Sampling of augmented data
The fully conditional posterior distribution of the augmented Gaussian traits, given data, $\boldsymbol{\psi}_{\backslash(\mathbf{U}^{RC_0}, \mathbf{U}^{CAT}, \mathbf{U}^{BIN})}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$, will be sampled jointly. The dimension of $(\mathbf{U}^{RC_0}, \mathbf{U}^{CAT}, \mathbf{U}^{BIN})$ is $\sum_{i=1}^{n} n^{aug}_i$. Realising that the $\mathbf{U}^{aug}_i$s of different animals are independent conditional on "fixed" and random effects, it follows that joint sampling of the augmented Gaussian traits can be decomposed into $n$ steps. One step is to sample from the fully conditional posterior distribution of $\mathbf{U}^{aug}_i$ given $(\mathbf{y}_i, \boldsymbol{\delta}_i)$, $\boldsymbol{\omega}$ and $\mathbf{R}_{22} = \mathbf{I}_{m_4}$. This is an $n^{aug}_i$-dimensional multivariate truncated Gaussian distribution on the interval given in (6). Before truncation, the mean and variance are given by (7) and (8), respectively.
Let $\boldsymbol{\xi}$ and $\boldsymbol{\Sigma}$ be shorthand notation for the mean and variance of the fully conditional posterior distribution of $\mathbf{U}^{aug}_i$ before truncation. Then first $u^{aug}_{i1}$ is sampled from a $N_1(\xi_1, \Sigma_{11})$-distribution, truncated at the relevant interval; the remaining elements are then sampled successively from their univariate conditional distributions given the previously sampled elements, each truncated at its relevant interval (the method of composition). Different ways can be chosen to sample from a univariate truncated $N_1(\mu, \sigma^2)$-distribution on the interval $I = \,]s_1; s_2]$. One possibility is to sample independently from the untruncated $N_1(\mu, \sigma^2)$-distribution and then only accept sampled values that belong to the interval $I$. Let $Y \sim N_1(\mu, \sigma^2)$; if $P(Y \in I)$ is very small, this procedure is inefficient. The following procedure (e.g. [6]), which avoids rejections, is implemented. First, $x$ is sampled from a uniform $R(0, 1)$-distributed random variable, $X$. Let $F_Y$ denote the distribution function of $Y$; then $z$ given by

$$z = F_Y^{-1}\left(F_Y(s_1) + x\left(F_Y(s_2) - F_Y(s_1)\right)\right)$$

is a realised value from the truncated $N_1(\mu, \sigma^2)$-distribution on $I$. The proof follows from (9) given below, where $Z$ is the random variable from which $z$ is generated and $z$ is a value between $s_1$ and $s_2$:

$$P(Z \le z) = P\left(X \le \frac{F_Y(z) - F_Y(s_1)}{F_Y(s_2) - F_Y(s_1)}\right) = \frac{F_Y(z) - F_Y(s_1)}{F_Y(s_2) - F_Y(s_1)} = P(Y \le z \mid Y \in I). \qquad (9)$$
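A direct transcription of this inverse-CDF procedure, using SciPy's normal distribution functions; the interval endpoints s1 < s2 and the untruncated mean and standard deviation are assumed given:

```python
import numpy as np
from scipy.stats import norm

def rtruncnorm(mu, sigma, s1, s2, rng):
    """Draw from N(mu, sigma^2) truncated to ]s1; s2] by the inverse-CDF method:
    no rejections, even when P(s1 < Y <= s2) is tiny."""
    x = rng.uniform()                                # x ~ U(0, 1)
    F1 = norm.cdf(s1, loc=mu, scale=sigma)
    F2 = norm.cdf(s2, loc=mu, scale=sigma)
    return norm.ppf(F1 + x * (F2 - F1), loc=mu, scale=sigma)
```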

Sampling of covariance matrices
The strategy, for obtaining samples from the fully conditional posterior of the residual covariance matrix in the model associated with (1), is presented in Section 4.4.1. For the model associated with (2), the strategy is slightly different and is presented in Section 4.4.2.

Model associated with (1)
The fully conditional posterior distribution of the residual covariance matrix, $\mathbf{R}$, of $\mathbf{U}_i$ is conditional inverse Wishart distributed. The conditioning is on a block diagonal submatrix, $\mathbf{R}_{22}$, of the inverse Wishart distributed matrix $\mathbf{R} = \begin{pmatrix} \mathbf{R}_{11} & \mathbf{R}_{12} \\ \mathbf{R}_{21} & \mathbf{R}_{22} \end{pmatrix}$ being equal to the identity matrix. Note that if the number of binary traits is equal to zero, the fully conditional posterior of $\mathbf{R}$ is inverse Wishart distributed. In order to obtain samples from the conditional inverse Wishart distribution, the approach described in [16] is implemented. The method relies on well-known relationships between a partitioned matrix and its inverse, and on properties of Wishart distributions. Let $\mathbf{V} = \mathbf{R}^{-1}$, which follows a Wishart distribution with parameters determined by (5), be partitioned conformably with $\mathbf{R}$. A matrix sampled from the conditional inverse Wishart distribution of $\mathbf{R}$ given $\mathbf{R}_{22} = \mathbf{I}_{m_4}$ can then be obtained in the following way. First $\mathbf{v}_{11}$ is sampled from the marginal distribution of $\mathbf{V}_{11}$. Next $\mathbf{t}_2$ is sampled from the conditional distribution of $\mathbf{V}_{11}^{-1}\mathbf{V}_{12}$ given $\mathbf{V}_{11} = \mathbf{v}_{11}$. The matrix

$$\mathbf{r} = \begin{pmatrix} \mathbf{v}_{11}^{-1} + \mathbf{t}_2\mathbf{t}_2' & -\mathbf{t}_2 \\ -\mathbf{t}_2' & \mathbf{I}_{m_4} \end{pmatrix}$$

is then a realised matrix from the conditional inverse Wishart distribution of $\mathbf{R}$ given $\mathbf{R}_{22} = \mathbf{I}_{m_4}$.
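Assuming v11 and t2 have been drawn as just described (with the parameters in (5)), the assembly of the sampled matrix can be sketched as follows; the block formula is the partitioned-inverse identity given above.

```python
import numpy as np

def assemble_conditional_iw(v11, t2, m4):
    """Assemble a draw r from the conditional inverse Wishart distribution of R given
    R_22 = I_{m4}, from v11 (drawn from the marginal of V_11, where V = R^{-1}) and
    t2 (drawn from the conditional of V_11^{-1} V_12 given V_11 = v11)."""
    r11 = np.linalg.inv(v11) + t2 @ t2.T
    r12 = -t2
    return np.block([[r11, r12],
                     [r12.T, np.eye(m4)]])
```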
In order to obtain samples from a Wishart distribution, the algorithm of Odell and Feiveson [21] is implemented. The basic idea of their algorithm can be summarised as follows. Let $\mathbf{V} \sim W_m(\boldsymbol{\Sigma}, f)$ and let $\mathbf{L}\mathbf{L}'$ be a Cholesky factorisation of $\boldsymbol{\Sigma}$, i.e. $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}'$. A realised matrix, $\mathbf{v}$, can be generated from the distribution of $\mathbf{V}$ by sampling $\mathbf{w}$ from a $W_m(\mathbf{I}_m, f)$-distribution; then $\mathbf{v}$ given by $\mathbf{L}\mathbf{w}\mathbf{L}'$ is a realised matrix from the desired Wishart distribution.
Using successively the properties of the Wishart distribution already given, a realised matrix, $\mathbf{w}$, from $\mathbf{W} \sim W_m(\mathbf{I}_m, f)$ can be generated sequentially, starting from a realised matrix from the distribution of $\mathbf{W}_{22} \sim W_2(\mathbf{I}_2, f)$ and expanding one dimension at a time. For $i = 3$ and up to $m$, the dimension of $\mathbf{W}$, we proceed as follows: $\mathbf{t}_i$ is sampled from the conditional distribution of $\mathbf{W}_{(i-1)(i-1)}^{-1}\mathbf{W}_{(1:i-1)i}$ given $\mathbf{W}_{(i-1)(i-1)}$, together with a chi-square distributed variate completing the new diagonal element; here $\mathbf{W}_{(1:i-1)i}$ is used as the notation for the $(i-1) \times 1$-dimensional vector of elements $(W_{ji})_{j=1,\ldots,i-1}$ of $\mathbf{W}$, and $\mathbf{W}_{(i-1)(i-1)}$ is the $(i-1)$-dimensional square submatrix of $\mathbf{W}$ with elements $(W_{jk})_{j,k=1,\ldots,i-1}$.
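The following sketch draws from $W_m(\Sigma, f)$ using a Bartlett-type construction (chi-square variates on the diagonal of a lower-triangular factor, standard normals below it), which is one way of realising the Odell-Feiveson idea summarised above; $f > m - 1$ is assumed.

```python
import numpy as np

def sample_wishart(Sigma, f, rng):
    """Draw from W_m(Sigma, f): build w ~ W_m(I_m, f) from chi-square and standard
    normal variates, then rescale by the Cholesky factor L of Sigma (Sigma = L L')."""
    m = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)
    A = np.zeros((m, m))
    for i in range(m):
        A[i, i] = np.sqrt(rng.chisquare(f - i))   # chi-square with f, f-1, ... degrees of freedom
        A[i, :i] = rng.standard_normal(i)         # standard normal off-diagonal elements
    w = A @ A.T                                   # w ~ W_m(I_m, f)
    return L @ w @ L.T                            # L w L' ~ W_m(Sigma, f)
```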

Model associated with (2)
In the following we outline a method for sampling from the fully conditional posterior distribution of $\mathbf{R}$ in the model associated with (2) for $m_4 \ge 1$. (Note that if the number of binary traits is equal to zero or one, $m_4 = 0$ or $m_4 = 1$, the model associated with (2) is identical to the model described by (1); thus for $m_4 = 1$ we end up with two different methods for obtaining samples from the fully conditional posterior distribution of $\mathbf{R}$.) Now consider the partitioning of $\mathbf{R}$ into $\mathbf{R}_{11}$, $\mathbf{R}_{12}$, $\mathbf{R}_{21}$ and $\mathbf{R}_{22}$ as before. Obtaining samples from the fully conditional posterior of $\mathbf{R}_{22}$ given $(R_{kk} = 1)_{k=m-m_4+1,\ldots,m}$ is not trivial. Therefore, inspired by Chib and Greenberg [4], we suggest the following Metropolis-Hastings algorithm for obtaining samples from the fully conditional posterior distribution of $\mathbf{R}$. Let $q_1(\tilde{\mathbf{r}}_{22} \mid \mathbf{r}_{22})$ denote a density that generates candidate values, $\tilde{\mathbf{r}}_{22}$, i.e. candidate correlation matrices given the current value (correlation matrix), $\mathbf{r}_{22}$ (and $(\mathbf{y}, \boldsymbol{\delta})$, $\boldsymbol{\psi}_{\backslash\mathbf{R}}$) (see e.g. [18] for generating random correlation matrices). As proposal density, $q(\tilde{\mathbf{r}} \mid \mathbf{r})$, for generating candidate values, $\tilde{\mathbf{r}}$ (i.e. candidate covariance matrices given the current covariance matrix, $\mathbf{r}$), we suggest taking

$$q(\tilde{\mathbf{r}} \mid \mathbf{r}) = p\left(\tilde{\mathbf{r}} \mid (\mathbf{y}, \boldsymbol{\delta}), \boldsymbol{\psi}_{\backslash\mathbf{R}}, \mathbf{R}_{22} = \tilde{\mathbf{r}}_{22}, (R_{kk} = 1)_{k=m-m_4+1,\ldots,m}\right) q_1(\tilde{\mathbf{r}}_{22} \mid \mathbf{r}_{22}).$$
This results in the following algorithm: 1. Sample a proposal value, $\tilde{\mathbf{r}}_{22}$, from the density $q_1(\tilde{\mathbf{r}}_{22} \mid \mathbf{r}_{22})$. Next sample $\mathbf{v}_{11}$ and $\mathbf{t}_2$ as described above (and with parameters given in (5)). Then the matrix

$$\tilde{\mathbf{r}} = \begin{pmatrix} \mathbf{v}_{11}^{-1} + \mathbf{t}_2\tilde{\mathbf{r}}_{22}\mathbf{t}_2' & -\mathbf{t}_2\tilde{\mathbf{r}}_{22} \\ -\tilde{\mathbf{r}}_{22}\mathbf{t}_2' & \tilde{\mathbf{r}}_{22} \end{pmatrix}$$

is a realised matrix from $q(\tilde{\mathbf{r}} \mid \mathbf{r})$.
2. Move to $\tilde{\mathbf{r}}$ with probability $\alpha(\mathbf{r}, \tilde{\mathbf{r}})$ given by

$$\alpha(\mathbf{r}, \tilde{\mathbf{r}}) = \min\left\{1,\; \frac{p\left(\tilde{\mathbf{r}} \mid (\mathbf{y}, \boldsymbol{\delta}), \boldsymbol{\psi}_{\backslash\mathbf{R}}, (R_{kk} = 1)_{k=m-m_4+1,\ldots,m}\right) q(\mathbf{r} \mid \tilde{\mathbf{r}})}{p\left(\mathbf{r} \mid (\mathbf{y}, \boldsymbol{\delta}), \boldsymbol{\psi}_{\backslash\mathbf{R}}, (R_{kk} = 1)_{k=m-m_4+1,\ldots,m}\right) q(\tilde{\mathbf{r}} \mid \mathbf{r})}\right\};$$

otherwise, stay at $\mathbf{r}$.
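A generic Metropolis-Hastings acceptance step on the log scale, into which the fully conditional density of R and the proposal density q described above would be plugged; the density evaluations themselves are assumed to be available.

```python
import numpy as np

def mh_accept(log_p_prop, log_p_curr, log_q_curr_given_prop, log_q_prop_given_curr, rng):
    """Accept the proposed covariance matrix with probability
    alpha = min(1, [p(r_prop|.) q(r_curr|r_prop)] / [p(r_curr|.) q(r_prop|r_curr)])."""
    log_alpha = min(0.0, (log_p_prop + log_q_curr_given_prop)
                        - (log_p_curr + log_q_prop_given_curr))
    return np.log(rng.uniform()) < log_alpha
```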

EXAMPLE
In order to illustrate the methodology, a simulated dataset was analysed. The simulated data and results are presented below.

Simulated data
The simulated data consist of records on five thousand animals. First, the complete data, consisting of a Gaussian, a right censored Gaussian, an ordered categorical and a binary trait, are generated for each animal (described in detail below). Next, the missing data pattern is generated independently of the random vector associated with the complete data.
The complete data are simulated as follows. First, records at the normally distributed level of the model are generated. The animals are assumed to be located in one herd and to be offspring of fifty unrelated sires and five thousand unrelated dams (all dams and sires of the animals with records are assumed to be mutually unrelated). The fifty 4-dimensional sire effects, $\mathbf{s}_l$, $l = 1, \ldots, 50$, are generated independently from a $N_4(\mathbf{0}, \mathbf{G}_S)$-distribution. The number of offspring per sire was 100 on average. Residuals were generated independently ($\mathbf{e}_i \sim N_4(\mathbf{0}, \mathbf{R}_S)$, $i = 1, \ldots, 5000$), so that the $i$th 4-dimensional normally distributed "record", $\mathbf{u}_i$, is equal to $\boldsymbol{\mu}_H + \mathbf{s}_{l(i)} + \mathbf{e}_i$, $i = 1, \ldots, 5000$, where $l(i)$ denotes the sire of animal $i$ (these will be called the normally distributed data), and where $\boldsymbol{\mu}_H = (8000, 900, 0.5, -0.2562)$ and $\mathbf{G}_S$ and $\mathbf{R}_S$ are the sire and residual covariance matrices used in the simulation. The complete data (the Gaussian, right censored Gaussian, ordered categorical and binary traits) are then obtained from the normally distributed data by censoring the second trait and by grouping the third and fourth traits according to the thresholds on the liability scale.

Gibbs sampling implementation and starting values
The Gibbs sampler was run as a single chain with joint updating of the location parameters. After discarding the first 40 000 rounds of the Gibbs sampler (burn-in), 10 000 samples of selected model parameters were saved with a sampling interval of 100. The Gibbs sampler was implemented with improper uniform prior distributions on the elements of $\boldsymbol{\mu}_H$ and on the (co)variance matrices $\mathbf{G}_S$ and $\mathbf{R}_S$. It was assumed that the vector of sire effects, conditional on $\mathbf{G}_S$, followed a $N_{4N}(\mathbf{0}, \mathbf{I}_N \otimes \mathbf{G}_S)$-distribution, with $N = 50$. Finally, the two thresholds $\tau_2$ and $\tau_3$ were a priori assumed to be distributed as order statistics from a uniform distribution on the interval $[0, 1]$, as described in Section 2.2.
Starting values for the location parameters were found as the solution to the mixed model equations, with initial values for the (co)variance matrices inserted and with $\mathbf{u}^{(0)}$ being a vector of observed Gaussian traits and starting values for the augmented data. Starting values for the augmented liabilities of censored observations of trait 2 were based on the censoring points and on $\sigma_{uo}$, the standard deviation of the uncensored ($J_2(i) = 1$ and $\delta_{i2} = 1$) observations of trait 2. Starting values for the thresholds of the ordered categorical trait were $\tau^{(0)}_0 = -\infty$, $\tau^{(0)}_1 = 0$, $\tau^{(0)}_4 = 1$ and $\tau^{(0)}_5 = \infty$, with the interior thresholds obtained by equating observed and expected frequencies in a (generally simpler) model where liabilities of the ordered categorical trait are assumed to be independent and identically distributed with mean $\mu$ and variance $\sigma^2$.
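One way of implementing the frequency-matching idea for the starting thresholds (a sketch; the exact recipe used for the analysis may differ): under iid $N(\mu, \sigma^2)$ liabilities, the cumulative category frequencies determine $\mu$, $\sigma$ and the interior thresholds once $\tau_1 = 0$ and $\tau_{K-1} = 1$ are fixed.

```python
import numpy as np
from scipy.stats import norm

def starting_thresholds(counts):
    """Starting values for the thresholds of an ordered categorical trait obtained by
    equating observed and expected category frequencies under iid N(mu, sigma^2)
    liabilities, with tau_1 = 0 and tau_{K-1} = 1 fixed."""
    counts = np.asarray(counts, dtype=float)
    cumfreq = np.cumsum(counts)[:-1] / counts.sum()   # F_1, ..., F_{K-1}
    z = norm.ppf(cumfreq)                             # Phi^{-1}(F_k)
    sigma = 1.0 / (z[-1] - z[0])                      # from Phi((1 - mu) / sigma) = F_{K-1}
    mu = -sigma * z[0]                                # from Phi((0 - mu) / sigma) = F_1
    tau = mu + sigma * z                              # tau_1 = 0 and tau_{K-1} = 1 are recovered
    return mu, sigma, tau
```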

Post Gibbs analysis and results
For each selected parameter, $\psi$, let $\psi^{(1)}, \ldots, \psi^{(n)}$ denote the saved sampled values from the Gibbs sampler. The marginal posterior mean, $\hat{\psi}_{PM}$, and variance, $\hat{\sigma}^2_{PSTD}$, were estimated by $\frac{1}{n}\sum_{i=1}^{n}\psi^{(i)}$ and $\frac{1}{n}\sum_{i=1}^{n}\left(\psi^{(i)} - \hat{\psi}_{PM}\right)^2$, respectively. The method of batching (e.g. [11]) was chosen for estimating the Monte Carlo variance, MCV, and the effective sample size, $N_e$. The saved sampled values were divided into $B$ batches (here $B = 20$) of equal size, $n_b$ (here $n_b = 500$). For each batch $b$, $b = 1, \ldots, B$, the batch mean is given by $\bar{\psi}_b = \frac{1}{n_b}\sum_{i=(b-1)n_b+1}^{bn_b}\psi^{(i)}$; the MCV and $N_e$ are then estimated from the variation among the batch means. Estimated marginal posterior means, variances, Monte Carlo variances and effective sample sizes of selected parameters are given in Table I.
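A sketch of the batching computation under one common convention (the exact estimators of MCV and N_e used in the paper may differ in details such as divisors):

```python
import numpy as np

def batch_means(samples, B=20):
    """Posterior mean/variance estimates, Monte Carlo variance of the posterior-mean
    estimate and effective sample size by the method of batching."""
    samples = np.asarray(samples, dtype=float)
    n_b = samples.size // B                       # batch size (n_b = 500 for B = 20, n = 10 000)
    batches = samples[: B * n_b].reshape(B, n_b)
    post_mean = samples.mean()
    post_var = samples.var()
    mcv = batches.mean(axis=1).var(ddof=1) / B    # Monte Carlo variance of the posterior mean
    n_eff = post_var / mcv                        # effective sample size
    return post_mean, post_var, mcv, n_eff
```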
Let $c^{\psi}_p$ denote the $p$th quantile of the (empirical) marginal posterior distribution of $\psi$. For all of the parameters in Table I, the value used in the simulation lies within the central 95% posterior interval $\left[c^{\psi}_{0.025}; c^{\psi}_{0.975}\right]$. Inferences concerning a subset of the parameters from the present Bayesian analysis were compared with those obtained using restricted maximum likelihood (REML). This comparison is restricted to the covariance matrices associated with "the normally distributed data". The normally distributed data were analysed using the Gibbs sampler and REML [22]. Burn-in, sampling interval and the number of saved samples for the Gibbs sampler implemented for analysing the normally distributed data were 4 000, 10 and 10 000, respectively. Again, improper uniform prior distributions were assumed for elements of $\boldsymbol{\mu}_H$ and for the (co)variance matrices $\mathbf{G}_S$ and $\mathbf{R}_S$. The results from this part of the analysis are shown in Table II. REML estimates are joint mode estimates of the (joint) marginal posterior distribution of the (co)variance matrices. If the (joint) marginal posterior distribution of the (co)variance matrices were symmetric, then joint posterior mode estimates and marginal posterior mean estimates would be equal, except for numerical and/or Monte Carlo error. Based on "the normally distributed data", marginal posterior means and REML estimates of genetic correlations are remarkably close to each other. Marginal posterior mean estimates of intraclass correlations are all slightly higher than the REML estimates. This is because the marginal posterior distributions of intraclass correlations are all skewed to the right, i.e. posterior mode estimates are expected to be lower than posterior mean estimates.
In conclusion, the Gibbs sampler implementation of the Bayesian analysis of the rather complicated data (model) shows satisfactory behaviour.

CONCLUSION
During the last decade, a major change of emphasis in animal breeding research has taken place. Rather than focusing solely on productivity, there is now an interest in understanding the complex biological and statistical interrelationships among traits related to product quality, disease resistance, behaviour and production. Addressing these problems requires the development of probability models which properly describe the underlying structures in the data perceived by the experimenter. These models are highly complex and often cannot be implemented via traditional methods. However, an increase in computer power and the introduction of modern computer-based inference methods are making this implementation possible. In this paper we have developed and implemented a fully Bayesian analysis of Gaussian, right censored Gaussian, categorical and binary traits using the Gibbs sampler and data augmentation. The methodology was applied to analyse a simulated dataset, and the results show that the posterior distributions cover well the values of the parameters used in the simulations.
The computer programme (available upon request), which has been developed for models associated with (1), allows analyses based on models with several random effects, including maternal genetic effects. In the programme, it is possible to choose between univariate or joint sampling of all location parameters. Augmented data are sampled jointly, using the method of composition, from their truncated multivariate normal distribution. Covariance matrices are sampled from inverted or conditional inverted Wishart distributions depending on the absence or presence of binary traits, respectively. In most applications of models including at least two binary traits, it is not reasonable to assume that the residuals of liabilities of the binary traits are independent, i.e. the model associated with (2) is to be preferred. The Gibbs sampler outlined for the model associated with (2) is almost identical to the one associated with (1); the only real difference is the Metropolis-Hastings step invoked for sampling the residual covariance matrix associated with the residuals of liabilities (this step has not yet been implemented in the programme).