Genetic relationships in Spanish dog breeds. II. The analysis of biochemical polymorphism

Summary - The phylogenetic relationships between 10 Spanish dog breeds were studied using the gene frequency values obtained from the electrophoretic analysis of 21 structural gene loci that code for blood-soluble proteins and enzymes. In addition, we studied the genetic differentiation within breeds. In some cases the genetic distances between subpopulations of the same breed were greater than the genetic distances between different breeds. The average between-breed distance has a value of 0.0197 (± 0.0128), with extreme values of D = 0.000 between Gos d’Atura and Podenco Ibérico, and of D = 0.051 for the Mastin Español - Ca de Bestiar pair. The groupings of Spanish dog breeds obtained in our study from morphological and biochemical data were apparently quite similar. The correlation between enzymatic and morphological distances was, however, low (r = 0.07) and non-significant. The estimates of the divergence times among the 4 ancestral trunks suggest that the ancestral trunks separated independently in a relatively short interval of time, between 30 000 and 55 000 years ago.


INTRODUCTION
The genetic relationships in Spanish dog breeds have been studied in a previous paper with data from morphological characters (Jordana et al, 1992). Nevertheless, these characters have long been under strong selection pressure, either natural or artificial, and this selection has greatly influenced the process of breed differentiation.
Assuming that genetic variability (detected through biochemical polymorphism) is maintained in populations by the equilibrium between mutation and genetic drift (Kimura, 1983), and that this polymorphism has not been deliberately selected by man, the analysis of that variability would give a more precise estimation of the relationships among populations.
Past electrophoretic and immunological studies of blood proteins and enzymes aimed at understanding the genetic relationships among dog breeds include Leone and Anthony (1966), Tanabe et al (1974, 1977, 1978), Sugiura et al (1977), Juneja et al (1981) and Kobayashi et al (1987). This paper is a study of the genetic relationships among Spanish dog breeds by the analysis, using electrophoretic techniques, of "neutral" structural genes that code for soluble proteins and enzymes of the blood. An analysis of within-breed genetic differentiation is also done, starting from a total of 24 subpopulations, because significant differences might exist among subpopulations of the same breed, owing to the specific characteristics of some subpopulations (size of flocks, reproductive isolation, etc). This will be useful to interpret and discuss the observed genetic relationships among breeds with more precision.
The resulting enzymatic phylogeny is compared with the one obtained from the analysis of morphological characters (Jordana et al, 1992), to check whether there is a possible evolutionary parallelism between the two types of characters.
The breeds have been subdivided into 24 subpopulations to perform the within-breed analysis of the populations, according to geographical criteria and/or the areas of influence of certain breeders (table I). The 2 subpopulations of the Podenco Canario breed had to be built purely at random to perform the analysis, because there were no data about the origins of the individuals.
A factor analysis of principal components was done using the BMDP-4M program (Frane et al, 1985), to study the relationships among populations with data from the allelic frequencies of the polymorphic loci. These were taken as variables to typify the different populations.
Nei's unbiased distance (a modified version of D for small sample sizes; Nei, 1978) and the Cavalli-Sforza and Edwards' (1967) chord distance have been calculated.
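The two distances named above can be sketched in a few lines. This is a minimal illustration, not the BIOSYS-1 implementation: the function names, the averaging of the chord distance over loci (following the form given by Nei et al, 1983) and the clipping of negative unbiased distances to zero are assumptions, and the allele frequencies in the usage lines are hypothetical.

```python
import math

def nei_unbiased_distance(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased distance.

    freqs_x, freqs_y: one list of allele frequencies per locus,
    with alleles in the same order in both populations.
    n_x, n_y: number of individuals sampled in each population.
    """
    jx = jy = jxy = 0.0
    for x, y in zip(freqs_x, freqs_y):
        sum_x2 = sum(p * p for p in x)
        sum_y2 = sum(q * q for q in y)
        # Small-sample correction of the within-population identities
        jx += (2 * n_x * sum_x2 - 1) / (2 * n_x - 1)
        jy += (2 * n_y * sum_y2 - 1) / (2 * n_y - 1)
        jxy += sum(p * q for p, q in zip(x, y))
    if jxy == 0:
        return math.inf  # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    # Assumption: negative estimates are reported as zero
    return max(0.0, -math.log(identity))

def chord_distance(freqs_x, freqs_y):
    """Cavalli-Sforza and Edwards (1967) chord distance, averaged over loci."""
    total = 0.0
    for x, y in zip(freqs_x, freqs_y):
        cos_theta = sum(math.sqrt(p * q) for p, q in zip(x, y))
        total += (2 / math.pi) * math.sqrt(2 * max(0.0, 1 - cos_theta))
    return total / len(freqs_x)

# Hypothetical frequencies for 2 loci in 2 populations
pop_a = [[0.7, 0.3], [1.0, 0.0]]
pop_b = [[0.5, 0.5], [0.9, 0.1]]
print(nei_unbiased_distance(pop_a, pop_b, n_x=30, n_y=25))
print(chord_distance(pop_a, pop_b))
```

Monomorphic loci must be included in the lists, since both distances are averages over all loci studied.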
These 2 distances were chosen for the respective construction of phenograms and cladograms, owing to their properties. Nei et al (1983), using a "known" phylogeny simulated by computer and assuming a constant rate of molecular evolution, found that: a), the trees generated using UPGMA and Wagner's methods with the Cavalli-Sforza and Edwards' (1967) chord distance produce the most accurate topology of the branches; and b), Nei's (1972, 1978) standard distances gave the best estimation of the branch lengths when the tree was built up through the UPGMA algorithm. Besides that, unlike other distances, these distances show a close linear relationship with the number of amino acid substitutions, which makes them useful to obtain rough estimates of divergence times (Hedges, 1986; Nei, 1987).
A jackknife method (Mueller and Ayala, 1982) was also used to calculate Nei's distances among populations, since it gives a more accurate estimation when the range of distances is below 0.1.
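As a rough illustration of the idea, a generic delete-one-locus jackknife can be written as follows. This is a sketch under stated assumptions and may differ in detail from the procedure of Mueller and Ayala (1982): it uses Nei's (1972) standard distance as the statistic, and the function names are hypothetical.

```python
import math

def nei_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard distance from per-locus allele frequencies."""
    jx = sum(p * p for locus in freqs_x for p in locus)
    jy = sum(q * q for locus in freqs_y for q in locus)
    jxy = sum(p * q for x, y in zip(freqs_x, freqs_y) for p, q in zip(x, y))
    return -math.log(jxy / math.sqrt(jx * jy))

def jackknife_distance(freqs_x, freqs_y, distance=nei_standard_distance):
    """Delete-one-locus jackknife: returns (estimate, standard error)."""
    r = len(freqs_x)
    full = distance(freqs_x, freqs_y)
    pseudovalues = []
    for i in range(r):
        # Recompute the distance with locus i left out
        sub_x = freqs_x[:i] + freqs_x[i + 1:]
        sub_y = freqs_y[:i] + freqs_y[i + 1:]
        pseudovalues.append(r * full - (r - 1) * distance(sub_x, sub_y))
    mean = sum(pseudovalues) / r
    variance = sum((v - mean) ** 2 for v in pseudovalues) / (r - 1)
    return mean, math.sqrt(variance / r)
```

The standard error returned by the jackknife reflects the variation among loci, which is the dominant source of sampling error when distances are small.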
The reliability of the constructed phenograms has been evaluated by computing the standard errors (SE) at every point of bifurcation of the tree branches. The evaluation of the SE is important because every point of ramification suggests an important event of speciation or division of the population (Nei et al, 1985). In the same way, in the phenogram obtained with the values of Nei's distances by using the jackknife method, it is possible to make comparisons among clusters, checking whether the difference between the average distance among clusters and the average intracluster distance is significantly greater than zero. The reliability of the bifurcation points is indirectly checked and, with it, the reliability of the topology of the tree. The values of the genetic distances among populations, the phenograms and cladograms, as well as the goodness-of-fit statistics of those dendrograms have been computed by using the BIOSYS-1 program (Swofford and Selander, 1981).
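The UPGMA clustering used for the phenograms can be sketched as below. This is a minimal pure-Python illustration, not the BIOSYS-1 code; the function name, the return format and the toy distance matrix are assumptions made for the example.

```python
from itertools import combinations

def upgma(labels, matrix):
    """UPGMA on a symmetric distance matrix.

    Returns a list of merges (members_a, members_b, node_height),
    where node_height is half the distance between the merged clusters.
    """
    clusters = {i: (lab,) for i, lab in enumerate(labels)}  # id -> members
    dist = {(i, j): matrix[i][j]
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters
        (a, b), d_ab = min(dist.items(), key=lambda kv: kv[1])
        na, nb = len(clusters[a]), len(clusters[b])
        merges.append((clusters[a], clusters[b], d_ab / 2))
        # Distance to the new cluster: size-weighted mean of old distances
        new_dist = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (na * d_ac + nb * d_bc) / (na + nb)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[next_id] = merged
        next_id += 1
    return merges

# Toy example with hypothetical distances among 3 populations
toy = [[0.00, 0.01, 0.05],
       [0.01, 0.00, 0.05],
       [0.05, 0.05, 0.00]]
for step in upgma(["A", "B", "C"], toy):
    print(step)
```

Because each new distance is weighted by cluster size, it equals the arithmetic mean of all pairwise distances between the members of the two clusters.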

Gene frequencies
A total of 38 electromorphs were identified, with between 1 and 5 per locus. Using the 95% criterion of polymorphism, 10 systems (Gpi, 6-Pgd, Pgm-1, Mdh-s, Mdh-m, G6pd, Pac, Pr, Gc and Pi-3) were found to be monomorphic for all populations. The allele frequencies for each polymorphic locus and breed are shown in table II.
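The 95% criterion can be stated compactly: a locus counts as polymorphic when the frequency of its most common allele does not exceed 0.95. A minimal sketch follows; the function name and the example frequencies are hypothetical and are not taken from table II.

```python
def is_polymorphic(allele_freqs, criterion=0.95):
    """True when the most common allele has a frequency of at most `criterion`."""
    return max(allele_freqs) <= criterion

# Hypothetical loci: one clearly polymorphic, one with only a rare variant
print(is_polymorphic([0.60, 0.40]))
print(is_polymorphic([0.97, 0.03]))
```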

Principal components analysis
In order to infer the possible relationships among populations, either at a breed level or at a subpopulation level, a principal components analysis with 3 factors has been done. The allelic frequencies of 11 polymorphic systems are used, giving a total of 17 independent variables. Table III shows, over the total existing variation and over the total explained variation, the percent values, in decreasing order, of the systems that give more information about breed differentiation. The transferrin (Tf) system accounts for 28.08% of the total explained variance, followed by the Lap, Pi-1, Alb, Sod, Prt-1, α1-B, Prt-2, Pa-1, Pep-D and Mpi systems.
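The count of 17 independent variables is consistent with the usual bookkeeping, in which a locus with k alleles contributes k - 1 independent frequencies because the frequencies at a locus sum to 1. The short check below is a sketch: the figure of 28 alleles at the polymorphic loci is inferred from the totals given in the text, on the assumption that each of the 10 monomorphic systems shows a single electromorph.

```latex
\begin{align*}
\text{independent variables} &= \sum_{l=1}^{L} (k_l - 1) = \sum_{l=1}^{L} k_l - L,\\
\text{all 21 loci:}\quad & 38 - 21 = 17,\\
\text{11 polymorphic loci:}\quad & 28 - 11 = 17 .
\end{align*}
```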
At the breed level (fig 1), the first 3 factors explain 65.60% of the total variance. Three groups are closely related: Podenco Canario (PC) and Perdiguero de Burgos (PB) populations; Gos d'Atura (GA), Galgo Español (GE) and, less closely related, Podenco Ibérico (PI); and finally Mastin del Pirineo (MP) and Sabueso Español (SE). Mastin Español (ME) remains as an isolated population, although it is closer to the group formed by Mastin del Pirineo and Sabueso Español than to any other group. Although the Ca de Bestiar (CB) population differs from the others, it has a certain relationship with the group formed by Podenco Canario and Perdiguero de Burgos. Podenco Ibicenco (PE) appears clearly differentiated from the rest of the breeds.
When the analysis is done at the subpopulation level (fig 2), the total variance explained on the first 3 axes decreases to 49.83%. The diagram is approximately comparable to the one obtained at the breed level. A close relationship among the subpopulations of the Ca de Bestiar, Mastin Español, Gos d'Atura, Perdiguero de Burgos, Podenco Canario and Podenco Ibérico breeds is observed. The remaining breeds show a weaker relationship among their subpopulations, which suggests the existence of a certain degree of within-breed genetic differentiation.

Genetic distances and dendrograms
From the gene frequencies of the analyzed loci, and by applying several indexes of genetic distance, dendrograms of the Spanish dog breeds were obtained by 2 different methodologies: cluster analysis and Wagner's method. For the cluster analysis, the UPGMA algorithm (Sneath and Sokal, 1973) was applied to the distance matrices obtained by using Nei's (1978) index and Cavalli-Sforza and Edwards' (1967) chord distance, respectively. Nei's (1978) [...] Canario, and the one formed by the rest of the breeds, except Ca de Bestiar, which separates from the hypothetical common trunk very early. Within the second [...]
According to Nei et al (1985), when the identity values (I) are higher than 0.9 for most pairs of populations and the average heterozygosity (H) is high (higher than 0.1), as it is in this case, the variances of the distances are overestimated. For this reason the distance between breeds has been calculated by a jackknife method (Mueller and Ayala, 1982) in an attempt to correct this bias. The average between-breed distance obtained by this last method is 0.0259 (± 0.0168). The topology of the tree is identical to that obtained before by using standard distance values.
As stated above, one way to evaluate the stability of the phenograms obtained from Nei's index is to compute the standard errors (SE) at every bifurcation point of the tree branches (Nei et al, 1985). Our results show that the SE of all bifurcation points are considerably greater than the length of the branch. This implies that any relationship among OTUs (operational taxonomic units) would be possible within the tree. The same conclusion is reached by using jackknife values in the intra- and intercluster comparisons. Nevertheless, this is not the only criterion for checking the stability of a classification, because a classification can be considered stable if its topology is not altered when new characters and/or new OTUs are included, or when different algorithms of taxonomic resemblance are used (Sokal et al, 1984). In this way, figure 5 shows the relationships among subpopulations. The topology of this tree is nearly the same as the topology obtained at the breed level, with the exception of 3 subpopulations: MP2, PE2 and SE2. Nei's (1978) average intersubpopulational distance is 0.0206 (± 0.0149), the average distance between subpopulations that belong to the same breed being 0.0068 (± 0.0087). The average within-breed distance ( [...] (1967), the cladogram of figure 6 is obtained. The central criterion of this method is that of "parsimony", with "maximum parsimony" reached when all the OTUs are related with the minimum possible distance. The cladogram is topologically similar to the previous phenograms, which would corroborate the stability of the classification proposed.
When the different breeds are grouped within their hypothetical ancestral trunks (Jordana et al, 1992) by means of a hierarchical analysis taking the breeds as OTUs (Swofford and Selander, 1981), a matrix of distances among ancestral trunks is computed, giving an average intertrunk distance of 0.0228 (± 0.0133). The resultant phenogram (fig 7) shows a well-defined cluster that includes Cf metris-optimae and Cf leineri; Cf inostranzewi and Cf intermedius join afterwards, forming in their turn a new cluster, leaving Ca de Bestiar clearly separated from it (this breed, due to its particular formation (Guasp, 1982), has not been assigned to any specific ancestral trunk).

Genetic differentiation among populations
In this study, the average distance values among subpopulations (0.0206), among breeds (0.0197), and among ancestral trunks (0.0228) do not substantially differ from one another. These values are in the range of distances indicated by Nei (1987) for local breeds. This could suggest that there is not enough genetic differentiation among the so-called ancestral trunks to give them the taxonomic rank of subspecies.
A comparison of tables IV and V shows that in some cases there is more genetic differentiation between subpopulations of the same breed than between different breeds. This is the case for the Mastin del Pirineo, Sabueso Español and Podenco Ibicenco breeds. Similar situations have been described in other domestic species (Vallejo et al, 1979; Ordás and San Primitivo, 1986). Nei and Roychoudhury (1982) also point out that the genetic variation among the 3 major human races is sometimes smaller than the variation among subpopulations of the same race.
Theoretically, the divergence between 2 populations can be the result of one or more causes: mutation, geographical and reproductive isolation, natural and/or artificial selection, and genetic drift. It is therefore difficult to determine precisely which factors caused the observed within-breed differentiation in Mastin del Pirineo, Sabueso Español and Podenco Ibicenco. Nevertheless, genetic drift could be the factor that has contributed the most to the observed within-breed differentiation, owing to the low effective population size in the subpopulations studied. Besides that, in most domestic species the drift process is accelerated because the two sexes are not equally represented among breeding animals, which is especially common in dogs.
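The effect of an unequal sex ratio on drift can be made concrete with Wright's classical formula for the effective population size, Ne = 4 Nm Nf / (Nm + Nf), which is not given in the paper and is brought in here as an illustration. The sketch below applies it to the founder group of 4 males and 2 females reported for Ca de Bestiar (Guasp, 1982); the function names are hypothetical, and the expected loss of heterozygosity per generation, 1/(2Ne), assumes an idealized random-mating population.

```python
def effective_size(n_males, n_females):
    """Wright's effective population size for unequal numbers of breeding males and females."""
    return 4 * n_males * n_females / (n_males + n_females)

def heterozygosity_loss_per_generation(ne):
    """Expected fraction of heterozygosity lost to drift in one generation."""
    return 1 / (2 * ne)

ne = effective_size(4, 2)          # 6 founders, but Ne = 16/3
loss = heterozygosity_loss_per_generation(ne)
print(ne, loss)
```

Six founders in a 4:2 ratio behave like an ideal population of about 5.3 individuals, whereas a 3:3 ratio would give an effective size of 6.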

Congruence between enzymatic and morphological phylogenies
So far, the different breeds have been classified into their respective hypothetical ancestral trunks by using mainly dental and cranial morphology, and historical and behavioral comparative criteria (Studer, 1901; Antonius, 1922; Villemont et al, 1970; Rousselet-Blanc, 1983).
In a first attempt (Jordana et al, 1992) [...] (Guasp, 1982; Sotillo and Serrano, 1985; Delalix, 1986) impute its origin to crossings among Podencos, Mastiffs and Perdigueros [...] has not been assigned to any particular ancestral trunk. The phylogenies resulting from the qualitative and quantitative analysis of morphological data confirm this classification (Jordana et al, 1992).
All the enzymatic phylogenies evaluated are similar, which supports the stability of the classification obtained using electrophoretic data. Nevertheless, these phylogenies show some differences from the phylogenies obtained using morphological data (Jordana et al, 1992).
By excluding the subpopulation PE2 Baleares from the analysis, a great similarity is observed between enzymatic and morphological phylogenies. This subpopulation was shown to differentiate clearly from all the other subpopulations (see fig 5). With this exclusion (fig 8), the relationships between the Greyhounds (Podencos and Galgo) and Gos d'Atura, and between the Mastines and Sabuesos are more obvious, in a way similar to the morphological analysis.
The breeds whose position shows less congruence with the morphological classification are Ca de Bestiar, Podenco Canario and Perdiguero de Burgos, which form a well-defined and separate cluster in the cladogram generated using Wagner's method (fig 6). It is not very probable that these 3 breeds had a common origin, so the explanation for their location in the phylogenetic tree should be sought in their respective population structures. Studies of the levels of genetic variability in these 3 breeds (Jordana et al, 1991) have shown that they have suffered important "bottlenecks" throughout their history. As a consequence, the genetic distance estimates relative to the other breeds could be more influenced by genetic drift, due to a small population size, than by the real divergence time among them.
The distance values found with respect to the other breeds and the topology of the trees strengthen this hypothesis. It is known that when a population is under the effects of a bottleneck, genetic distances increase quickly (Nei and Roychoudhury, 1982; Nei, 1987). This increase of genetic distances distorts the topology of the evolutionary trees. Besides that, the history of these breeds confirms this hypothesis. In the Ca de Bestiar breed we could even assume a founder effect, because this breed had nearly disappeared in the sixties, starting its recovery in the seventies from only 4 males and 2 females (Guasp, 1982). Similar discordances in the interpretation of the evolutionary trees in other species, due to bottlenecks, have been described by Nei and Roychoudhury (1982) in human races, by Chesser (1983) in Cynomys ludovicianus, the black-tailed prairie dog, and by Gyllensten et al (1983) in the European red deer, among others.
Nei and Roychoudhury (1982), in their study of human races, support the hypothesis proposed by King and Wilson (1975) that macromolecular and anatomical characteristics of organisms evolve at independent rates. The faster evolutionary change of the morphological characters is produced by a few gene substitutions, the genes that control these characters being under stronger natural selection in the process of human racial differentiation than the "average of genes". Nevertheless, they are more sceptical about a possible evolutionary parallelism between both types of characters, because they showed that the genetic distances among populations are not always correlated with the morphological differences. Wayne and O'Brien (1986) found a non-significant correlation of r = 0.24 ± 0.11 in comparing genetic and morphological distances in 15 inbred mouse strains, and concluded that structural gene and morphometric variation of mandible traits are uncoupled between mouse strains.
Fitch and Atchley (1987) also concluded that there was no correlation between distances based on single loci and mandible shape, in a study of the divergence in inbred strains of mice. Similarly, Crouau-Roy (1990) found no congruence between biochemical and morphological data in a study of 3 species of troglobitic beetles. Festing and Roderick's (1989) results, however, contrast with those obtained by Wayne and O'Brien (1986). In a study involving 12 inbred strains of mice, Festing and Roderick (1989) found strong and statistically highly significant correlations among all measures of genetic distance, ranging from 0.58 for the comparison of single loci with the logarithm of the Mahalanobis distance based on 24 measurements on 4 bones, to 0.72 for estimates of genetic distance based on single loci and the morphology of the mandible. Wayne and O'Brien (1987), in a study of the enzymatic divergence in 12 genera of the Canidae family, affirm that, in general, qualitative and quantitative morphological studies of the Canidae (Clutton-Brock et al, 1976) support the groupings represented in the consensus tree they obtain from enzymatic data. The groupings of Spanish dog breeds obtained in our study from morphological and biochemical data were apparently quite similar, particularly for the populations that have not been long under bottleneck effects. However, the correlation between morphological and enzymatic distances was low (r = 0.07) and non-significant, even when the populations that suffered strong bottlenecks were excluded from the calculations.
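A correlation between two distance matrices, such as the r = 0.07 reported here, can be computed from the pairwise distances in the upper triangles of the matrices. The paper does not state how significance was assessed; the sketch below uses a permutation (Mantel-type) test as one common choice, which is an assumption, and all names and matrices in it are hypothetical.

```python
import math
import random
from itertools import combinations

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)

def pairwise(matrix, order):
    """Distances for all pairs of populations, taken in the given order."""
    return [matrix[i][j] for i, j in combinations(order, 2)]

def matrix_correlation(m1, m2, permutations=999, seed=0):
    """Returns (r, p): correlation of pairwise distances and a permutation p-value."""
    idx = list(range(len(m1)))
    base = pairwise(m1, idx)
    observed = pearson(base, pairwise(m2, idx))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = idx[:]
        rng.shuffle(perm)  # relabel the populations of the second matrix
        if abs(pearson(base, pairwise(m2, perm))) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

Permuting population labels, rather than individual distances, keeps the dependence among distances that share a population, which is why an ordinary significance test for r would not be appropriate here.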
From this study, it can be concluded that the large morphological variability among dog breeds, where the process of differentiation has been strongly accelerated by a great pressure of selection on some characters, has no correspondence with differences at the protein and enzymatic levels, where the genetic differences among breeds are very small. This is in accordance with [...], who affirms that the domestic dog (Canis familiaris) is a group which is morphologically diverse but genetically very homogeneous.

Estimated times of evolutionary divergence
According to the neutral theory (Kimura, 1983), it can be assumed that there is a correlation between evolutionary time and genetic divergence measured by an index of distance such as that of Nei. Also, assuming that through electrophoretic techniques it is possible to detect a third of the amino-acid substitutions in the proteins, the following formula allows us to obtain approximately the time of divergence between 2 populations (Nei, 1987): [...]
There are 2 important factors that can distort this estimation. The first is migration between populations, which produces an underestimation of the times of divergence. Migration can be neglected with regard to the dog breeds, because the populations (breeds) were closed shortly after their formation. The second is the occurrence of bottlenecks, which have a great influence upon the values of the distance, with a subsequent overestimation of divergence times. Taking these considerations into account, only (t) values among different ancestral trunks have been estimated. We assume that the errors in the calculation of the distances for the populations affected by bottlenecks are diluted when these breeds are included in their ancestral trunk. The Ca de Bestiar breed, however, has not been used in the calculation of the times of divergence because, on the one hand, it has not been assigned to any particular ancestral trunk, and, on the other hand, it has suffered an extreme founder effect.
The divergence between Cf metris-optimae and Cf leineri would have taken place approximately 30 000 years ago. These 2 trunks would have separated from the common cluster that they formed with Cf intermedius 49 000 years ago, while Cf inostranzewi would have separated 55 000 years ago from the cluster that it forms with the other 3 ancestral trunks.
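The formula relating distance to divergence time is not legible in this text. As a hedged sketch, the relation usually attributed to Nei for electrophoretic data is given below, with the substitution rate α of about 10^-7 per locus per year taken as an assumption. The distances in the last line are back-calculated from the reported times under that assumption and are not values read from the paper's tables.

```latex
\begin{align*}
D &= 2\alpha t \quad\Longrightarrow\quad t = \frac{D}{2\alpha},\\
\alpha &\approx 10^{-7}\ \text{per locus per year} \quad\Longrightarrow\quad t \approx 5 \times 10^{6}\, D\ \text{years},\\
t &= 30\,000 \Rightarrow D \approx 0.0060, \qquad
t = 49\,000 \Rightarrow D \approx 0.0098, \qquad
t = 55\,000 \Rightarrow D \approx 0.0110 .
\end{align*}
```

Under this relation, an error in D translates proportionally into an error in t, which is why the large standard errors of the distances make the divergence times indicative only.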
Nei and Roychoudhury (1982) assign a similar divergence time, approximately 41 000 years ago, to the separation of the Caucasoid and Mongoloid human races from their common trunk. Negroids would have separated from the trunk they share with Caucasoids and Mongoloids approximately 110 000 years ago.
Nevertheless, it must not be overlooked that these divergence times are only indicative, because the associated errors of the distances are fairly large and the estimates depend upon several assumptions. In our case, the divergence times would be overestimated due to the bias implied in the choice of the loci analyzed, because most of the known enzymatic polymorphic loci have been included deliberately. As a consequence, the true divergence times should be lower in magnitude than those presented in this paper.