History and structure of the closed pedigreed population of Icelandic Sheepdogs
© Oliehoek et al. 2009
Received: 29 December 2008
Accepted: 6 August 2009
Published: 6 August 2009
Dog breeds lose genetic diversity because of high selection pressure. Breeding policies aim to minimize kinship and therefore maintain genetic diversity. However, policies such as mean kinship and optimal contributions might be impractical. Cluster analysis of kinship can elucidate the population structure, since this method divides the population into clusters of related individuals. Kinship-based analyses have been carried out on the entire population of the Icelandic Sheepdog, a sheep-herding breed.
Analyses showed that, despite increasing population size and deliberately transferring dogs, considerable genetic diversity has been lost. When cluster analysis was based on kinships calculated seven generations backwards, as performed in previous studies, results differed markedly from those based on calculations going back to the founder population, and thus invalidate recommendations based on previous research. When calculated back to the founder population, kinship-based clustering reveals the distribution of genetic diversity, similarly to strategies using mean kinship.
Although the base population consisted of 36 Icelandic Sheepdog founders, the current diversity is equivalent to that of only 2.2 equally contributing founders with no loss of founder alleles in descendants. The maximum attainable diversity is 4.7, which is unlikely to be achieved in a non-supervised breeding population like the Icelandic Sheepdog. Cluster analysis of kinship coefficients can provide a supporting tool to assess the distribution of available genetic diversity for captive population management.
Closed populations with high levels of genetic drift suffer from reduction of genetic diversity. Genetic diversity is essential to maintain the adaptive potential of populations, and confers higher resistance to pathogens. In the end, reduction of genetic diversity causes higher levels of inbreeding, which can cause inbreeding depression as well as high incidences of particular heritable (often recessive) diseases. Managing genetic diversity within populations is necessary to avoid high incidences of deleterious alleles and to preserve adaptive potential.
In managed populations, such as domestic animals, genetic diversity can be maximised by selection according to optimal contributions, giving each reproductive animal a specific contribution for the next generations [1, 2]. However, for many populations, this optimal approach cannot be applied as a breeding strategy, because there is no single authority that can decide which animals to select for breeding. These populations can still increase their genetic diversity with sub-optimal solutions, which require an overview of the genetic diversity within these populations. Hence, individual breeders need insight into the population structure and into how genetic diversity can be maintained.
Ubbink et al. [3–5] have used cluster analysis of kinship coefficients to elucidate the relational structure of purebred dog populations, and to demonstrate correlation with a genetic disease present in these populations. Instead of 'looking at a large pile of pedigrees' or a table with mean kinships, they used hierarchical cluster analysis to visualise the hitherto unknown structure of pedigreed populations as separate, highly related clusters ('family groups') that have a certain level of kinship (relationship) among each other.
A dog breed is an example of an 'unsupervised' closed population in which mating is only allowed between registered dogs of the same breed. Purebred dogs are subject to strong selection to meet the breed standards. Dog breed populations can go through a permanent reduction of genetic diversity due to three factors: (1) only a small fraction of all purebred males and females actually reproduce; (2) there is an unequal number of litters among reproductive males; and (3) dog breeds are often fragmented. This permanent reduction of genetic diversity (bottleneck) has resulted in a high incidence of specific genetic diseases in different breeds, and in some breeds most of the animals are affected or carriers. It is now well recognised that genetic diseases are a major threat for purebred dog populations.
Icelandic Sheepdogs are bred in several European countries by many individual breeders. It is well known that the current population of Icelandic Sheepdogs descends almost entirely from only a few founders that were selected from remote areas in Iceland between 1955 and 1965.
In the work presented here, we investigate the amount of genetic diversity lost and the possibilities to maintain or increase genetic diversity within the Icelandic Sheepdog population considered as a typical closed dog population. Furthermore, cluster analysis is evaluated as a tool and for its potential to identify genetic diversity.
We received pedigree data via ISIC of the population of Icelandic Sheepdogs in the following countries: the Netherlands (725 records), Sweden (1367), Iceland (1654), Germany (153), Norway (774), Denmark (2241) and Finland (113). Pedigree data contained unique ID, father, mother, gender, date of birth, country of birth, and occasionally date of death. Only Iceland had data since 1955. In other countries, breeding started in 1975 or later; most of the data went up to 2002 and some only up to 1998. Except for a few dogs in France, these countries cover the entire Icelandic Sheepdog population. Animals without recorded parents were classified as either (1) 'original founders': animals without any relationship with other founders, documented as such by the kennel clubs, or (2) 'related animals with unknown parents': animals that descend from the 'original founders' or their progeny, but have unknown parentage. Furthermore, some individuals were registered in more than one country. The pedigree data were assembled into a single database table, and animals that were recorded twice were removed based on information on the country of birth. The problem of 'related animals with unknown parents' was solved by combining all datasets with additional information on parentage from ISIC. After this process, only the original founders had unknown parents. The number of equivalent complete generations traced for each animal was computed as the sum of the proportion of ancestors known per generation. Until 1998, pedigrees were complete for all countries. A general life expectancy was estimated separately for males and for females from the interval between date of birth of parents and progeny. If the date of death was not recorded, it was estimated from life expectancy. All animals born between 1991 and 1998 were considered as the 'current population'.
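The equivalent complete generations described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: the toy pedigree and the function name are hypothetical, and the recursion assumes the usual reading that each known parent adds one half of (1 + the parent's own value), which is the same as summing the proportion of known ancestors per generation.

```python
from functools import lru_cache

# Hypothetical toy pedigree: animal -> (sire, dam); None marks an unknown parent.
PEDIGREE = {
    "A": (None, None),   # founder
    "B": (None, None),   # founder
    "C": ("A", "B"),
    "D": ("A", None),    # one parent unknown
    "E": ("C", "D"),
}


@lru_cache(maxsize=None)
def equivalent_complete_generations(animal):
    """Sum over generations of the proportion of ancestors that are known.

    Each known parent contributes 1/2 for itself plus 1/2 of its own value,
    because a parent's ancestors sit one generation further back.
    """
    total = 0.0
    for parent in PEDIGREE[animal]:
        if parent is not None:
            total += 0.5 * (1.0 + equivalent_complete_generations(parent))
    return total
```

For animal "E" in this toy pedigree, both parents are known (proportion 1) and three of four grandparents are known (proportion 0.75), so the function returns 1.75.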
Mean kinship of individual i was calculated as $mk_i = \frac{1}{N}\sum_{j=1}^{N} f_{ij}$, where $N$ is the number of candidates and $f_{ij}$ is the kinship between individual i and individual j. The mean kinship of an animal is a measure of the relationship of that individual with a population; animals with a low mean kinship are more valuable for genetic diversity. Mean kinship depends on the population, which means that the mean kinship of an animal might change over time when a population changes. In conservation genetics, mean kinship is an important tool to maintain genetic diversity.
The following population diversity measures were used:
Average inbreeding ($\bar{F}$) is the average of the inbreeding coefficients of all candidates. $\bar{F}$ indicates the risk of inbreeding depression in the current population.
In this work, genetic diversity ($N_{mk}$) is defined as the number of equally contributing founders with no random loss of founder alleles in descendants that would be expected to produce the same average mean kinship (and therefore genetic variation) as in the population under study. $N_{mk}$ is expressed on the scale of founder genome equivalents [15, 16] and is calculated by $N_{mk} = 1/(2\,\overline{mk})$, where $\overline{mk}$ is the average mean kinship. A lower average mean kinship means a higher genetic diversity and thus a higher capacity to adapt as a population.
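As a small worked illustration of the two definitions above, the sketch below computes mean kinship per animal and the resulting founder genome equivalents from a kinship matrix. The function names are hypothetical, and the code assumes that the kinship of an animal with itself is included in its mean kinship.

```python
def mean_kinships(f):
    """Mean kinship of each animal: the row average of the kinship matrix.

    `f` is a square list of lists with f[i][j] the kinship between animals
    i and j. Self-kinship f[i][i] is included (an assumption of this sketch).
    """
    n = len(f)
    return [sum(row) / n for row in f]


def founder_genome_equivalents(f):
    """Genetic diversity N_mk = 1 / (2 * average mean kinship)."""
    mk = mean_kinships(f)
    average_mk = sum(mk) / len(mk)
    return 1.0 / (2.0 * average_mk)


# Two unrelated, non-inbred founders: self-kinship 0.5, kinship 0 between them.
TWO_FOUNDERS = [[0.5, 0.0],
                [0.0, 0.5]]

# Two full sibs born from unrelated founders: kinship 0.25 between them.
TWO_FULL_SIBS = [[0.5, 0.25],
                 [0.25, 0.5]]
```

With these inputs, two unrelated founders give a diversity of exactly 2, as the definition requires, while two full sibs give 1/0.75, or about 1.33, founder genome equivalents.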
In this work, allelic diversity ($N_{AD}$) is defined as half the number of distinct alleles that are still present in the population under study if all founder alleles were unique. The number of unique founder alleles that survive each year was determined by gene dropping, which was repeated 10,000 times. $N_{AD}$ is also expressed in founder genome equivalents and can therefore be compared with $N_{mk}$ and $N_{OC}$ (see below). For example, if the frequencies of all alleles were equal, $N_{AD}$ would be equal to $N_{mk}$. $N_{AD}$ monitors the loss of genetic diversity due to extinction of unique (founder) alleles.
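The gene-drop procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the function name, the pedigree layout (a list of `(animal, sire, dam)` tuples with parents listed before offspring) and the fixed random seed are all hypothetical choices, and every non-founder is assumed to have both parents known.

```python
import random


def gene_drop_allelic_diversity(pedigree, current, replicates=1000, seed=0):
    """Estimate N_AD by dropping unique founder alleles through a pedigree.

    pedigree: list of (animal, sire, dam), parents listed before offspring;
              founders have sire = dam = None.
    current:  animals that make up the population under study.
    Returns half the number of distinct founder alleles still present,
    averaged over the replicates.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        genotype = {}
        next_allele = 0
        for animal, sire, dam in pedigree:
            if sire is None and dam is None:
                # Each founder carries two alleles that are unique to it.
                genotype[animal] = (next_allele, next_allele + 1)
                next_allele += 2
            else:
                # Mendelian sampling: one random allele from each parent.
                genotype[animal] = (rng.choice(genotype[sire]),
                                    rng.choice(genotype[dam]))
        surviving = {allele for animal in current for allele in genotype[animal]}
        total += len(surviving) / 2.0
    return total / replicates


# Hypothetical toy pedigree: two founders and two full-sib offspring.
TOY_PEDIGREE = [
    ("A", None, None),
    ("B", None, None),
    ("C", "A", "B"),
    ("D", "A", "B"),
]
```

If the population under study consists of the two founders, all four alleles survive and the estimate is 2. If it consists of a single offspring, exactly two alleles survive and the estimate is 1. With both offspring, the estimate lies between these two values.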
where $\mathbf{1}$ is a column vector of ones. $c_{OC}$ contains contributions of parents to next generations that would minimise average mean kinship in next generations. However, $c_{OC}$ calculated from Equation 4 can contain negative contributions, which is impossible in practice. When negative contributions were obtained, the most negative contribution was set to zero and vector $c_{OC}$ was recalculated until all contributions were non-negative. $N_{OC}$ is the highest possible $N_{mk}$ and measures the diversity that could be obtained in next generations. $N_{OC}$ will always be equal to or higher than $N_{mk}$ and equal to or lower than $N_{AD}$. $N_{OC}$ is relevant in the case of closed populations, since the population can never reach a diversity higher than $N_{OC}$. Therefore, it monitors the unrestorable loss of genetic diversity.
For each year a 'current population' was defined as all the animals expected to be alive and the following population parameters were determined: the current population size; the number of progeny born during that year; the number of founder introductions; and the following diversity measures: $\bar{F}$, $\overline{mk}$, $N_{mk}$, $N_{OC}$, $N_{AD}$ (as described above).
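The iterative procedure described above can be sketched as follows. Equation 4 is not reproduced in this text, so the sketch assumes the standard closed form for contributions that minimise average kinship, $c = F^{-1}\mathbf{1} / (\mathbf{1}'F^{-1}\mathbf{1})$, with $F$ the kinship matrix among candidates, and takes potential diversity as $1/(2\,c'Fc)$. The symbol $F$, the function names and the example matrices are assumptions made for illustration only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def optimal_contributions(f):
    """Contributions that minimise average kinship, forced to be non-negative.

    Assumed closed form: c = F^-1 1 / (1' F^-1 1). While any contribution is
    negative, the most negative candidate is dropped (its contribution is set
    to zero) and the remaining contributions are recalculated.
    A singular kinship matrix (e.g. duplicated animals) is not handled.
    """
    active = list(range(len(f)))
    while True:
        sub = [[f[i][j] for j in active] for i in active]
        x = solve(sub, [1.0] * len(active))
        total = sum(x)
        c_active = [value / total for value in x]
        worst = min(range(len(active)), key=lambda k: c_active[k])
        if c_active[worst] >= 0.0:
            break
        del active[worst]
    c = [0.0] * len(f)
    for index, value in zip(active, c_active):
        c[index] = value
    return c


def potential_diversity(f, c):
    """Potential diversity: 1 / (2 c'Fc), in founder genome equivalents."""
    n = len(f)
    ckc = sum(c[i] * f[i][j] * c[j] for i in range(n) for j in range(n))
    return 1.0 / (2.0 * ckc)


# Two unrelated founders and their offspring (kinship 0.25 with each parent).
FOUNDERS_AND_OFFSPRING = [[0.5, 0.0, 0.25],
                          [0.0, 0.5, 0.25],
                          [0.25, 0.25, 0.5]]

# Hypothetical matrix chosen so that the third animal first gets a negative
# contribution and is then set to zero by the iteration.
NEGATIVE_CASE = [[0.5, 0.0, 0.3],
                 [0.0, 0.5, 0.3],
                 [0.3, 0.3, 0.6]]
```

In both examples the two unrelated animals each receive a contribution of 0.5 and the third animal receives zero, giving a potential diversity of 2 founder genome equivalents. In the second example this is reached only after the negative contribution has been removed.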
Cluster analysis was performed twice on the current population. (1) The first analysis was based on kinship calculated using the tabular method starting with the founders, after which UPGMA was applied for clustering all animals. To determine the most appropriate number of clusters, $R^2$, the cubic clustering criterion and the pseudo-F statistic were all examined (SAS Institute, release 9.1, Cary, NC, USA). These clusters are displayed in a dendrogram, which is referred to as the all-gen-tree. (2) The second cluster analysis was performed as described by Ubbink et al. Kinships between all animals were calculated by the path method up to seven generations backwards (instead of the tabular method, which includes all generations). Note that if the path method included all the generations, results would be equal to those of the tabular method. Then, all the animals were clustered using UPGMA. Subsequently, all the clusters having an average mean kinship greater than or equal to 0.0625 were defined as the final clusters and displayed in a dendrogram. This kinship value of 0.0625 that delimits clusters corresponds with kinship between second degree cousins and was used by Ubbink et al. This dendrogram is referred to as the 7-gen-tree.
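The two building blocks of the all-gen-tree, kinship by the tabular method and average-linkage (UPGMA) clustering, can be sketched as below. This is an illustrative sketch, not the SAS procedure used in the study: the function names and the toy pedigree are hypothetical, kinship is used directly as the similarity measure, and the merging is simply stopped once no pair of clusters has an average kinship of at least 0.0625, which is one possible reading of the cut-off described above.

```python
def kinship_matrix(pedigree):
    """Kinship between all pairs of animals by the tabular method.

    pedigree: list of (animal, sire, dam), parents listed before offspring;
              unknown parents are None. Returns a dict keyed by (a, b).
    """
    f = {}
    seen = []
    for animal, sire, dam in pedigree:
        for other in seen:
            # Kinship with an older animal is the mean of that animal's
            # kinship with the two parents (an unknown parent counts as zero).
            k = 0.0
            if sire is not None:
                k += 0.5 * f[(other, sire)]
            if dam is not None:
                k += 0.5 * f[(other, dam)]
            f[(animal, other)] = f[(other, animal)] = k
        between_parents = 0.0
        if sire is not None and dam is not None:
            between_parents = f[(sire, dam)]
        f[(animal, animal)] = 0.5 * (1.0 + between_parents)
        seen.append(animal)
    return f


def average_linkage_clusters(animals, f, threshold=0.0625):
    """Merge the two clusters with the highest average kinship (UPGMA),
    stopping when that average falls below `threshold`."""
    clusters = [[a] for a in animals]

    def average_kinship(c1, c2):
        return sum(f[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max(
            ((average_kinship(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy pedigree: four founders, two full sibs and one unrelated dog.
TOY_PEDIGREE = [
    ("f1", None, None), ("f2", None, None),
    ("f3", None, None), ("f4", None, None),
    ("sib1", "f1", "f2"), ("sib2", "f1", "f2"),
    ("other", "f3", "f4"),
]
KINSHIP = kinship_matrix(TOY_PEDIGREE)
CLUSTERS = average_linkage_clusters(["sib1", "sib2", "other"], KINSHIP)
```

In this toy example the two full sibs have a kinship of 0.25 and are merged into one cluster, while the unrelated dog remains in a cluster of its own. A naive implementation like this one recomputes all pairwise averages at every step, so it is only suitable for small examples.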
Of the 4680 dogs in the data, 36 did not have any parents registered and were recognised as founders by the breeding organisations. All other dogs in the pedigree file descended from these 36 founders. Most founders lived in Iceland and were registered there, except for four animals that lived in Germany.
The current population contained 2554 dogs and represented 512 unique parent combinations. For dogs in the current population, the most 'distant' founders appeared in their pedigree 10 to 20 generations back (nine to 19 ancestors between the current animal and the founder). The number of equivalent complete generations traced was 9.1.
All the animals of the current population can only carry alleles from the 36 founders. In the Icelandic Sheepdog, just three of the 36 founders contributed more than 80% of the alleles of the current population (results not shown). In other words, in about 80% of cases, the pedigree of every animal in the current population will end with one of these three over-represented founders.
Figure 2 has eight points of interest. (1) When 20 founders were selected, this resulted in equal $N_{mk}$, $N_{OC}$ and $N_{AD}$ (all equal to 20). (2) $N_{mk}$ has decreased since 1955, despite 10 founder introductions up to 1973 and six more after 1979. Each newly introduced founder can potentially increase genetic diversity but clearly, in this case, founder introductions have not increased $N_{mk}$. (3) However, each founder introduction increases $N_{OC}$ and $N_{AD}$ by one. (4) Between 1960 and 1964, $N_{OC}$ and $N_{AD}$ decreased from 24 to less than 10. This remarkable drop is explained by the fact that most of the 20 founders that were introduced in 1955 only produced one offspring and then died during this period. (5) $N_{mk}$ decreased strongly from 6.9 in 1967 to 3.2 in 1970. This is contemporaneous with the start of the first population size growth. $N_{OC}$ and $N_{AD}$ did not decrease as much during that period. Therefore, the decrease of $N_{mk}$ is caused by unequal allele frequencies and not by extinction or mixing of unique alleles with over-represented alleles. The strong decrease of $N_{mk}$ is due to a disproportionate contribution of a small number of individuals to the future generation. (6) Unequal representation of founder animals in offspring is also responsible for the decrease of $N_{mk}$ during the first years. (7) The distance between $N_{OC}$ and $N_{AD}$ has increased ever since 1963 and reached 5.2 in 1997, which means that it became increasingly difficult to equalise allele frequencies. In other words, 5.2 founder genome equivalents were lost because of unique alleles mixing with over-represented alleles within individuals. Optimal Contribution Selection cannot restore this loss. (8) The difference between $N_{mk}$ and $N_{OC}$ shows that this population has the potential to increase genetic diversity.
Ubbink et al. have shown that, in their population, the inclusion of five, six or seven generations yielded virtually identical and reproducible results. Hence, Ubbink et al. have suggested that it is sufficient to calculate kinship seven generations backwards. Based on the substantial difference between the 7-gen-tree and the all-gen-tree in our study, we conclude that this assumption does not hold for the present population. This difference can be explained by the presence of common ancestors that are undetected at five, six or seven generations. An example of such undetected ancestors is given by the strong influence of the three predominant founders. At least 80% of the alleles of the current population descend from these three founders. While these founders dominate the pedigree many generations back, they remain undetected at five, six or seven generations. These three founders, possibly together with other frequently used ancestors, cause the difference between the 7-gen-tree and the all-gen-tree. The cluster analysis based on all generations is therefore a better representation of real kinship.
Diversity measures within each cluster of dendrogram 4
Although genetic diversity ($N_{mk}$) of the current population of the Icelandic Sheepdog was only 2.2, the potential diversity ($N_{OC}$) was 4.7. In other words, $N_{mk}$ could be increased from 2.2 to 4.7. However, this value can be achieved within a few generations only if specific animals are used for breeding according to their specific optimal contribution (as in vector $c_{OC}$) as calculated for each of the 2554 animals. Table 1 shows for each cluster in the all-gen-tree: a) the size of each cluster as a percentage of the current population and b) the optimal contributions per individual summed per cluster. Table 1 shows that animals within the small clusters E to H would have to contribute 12% up to 23% per cluster, while their cluster sizes are smaller than 1% of the total population size. The optimal contribution per animal ranged from zero to 8% (of a total of 100%). In the ideal situation, 2410 animals of the 2554 would not contribute, while 50 animals would contribute 80% to future generations. This optimal breeding scheme would require complete control over the population. A scheme based on optimal contributions will most probably not be applied in multi-breeder ('unsupervised') populations like dog breeds, because many breeders would not be allowed to breed at all.
The existence of a single large Scandinavian cluster is not only due to the founder effect. Many sheepdog imports from Iceland were carried out to increase diversity ("new blood") within each country. Breeders often think that dogs within one country are more related to each other and belong to the same cluster, and they are often unaware that dogs from other countries might also belong to the same cluster. Since importing a dog is a large investment, breeders always selected the 'best dogs' from Iceland. Without knowing it, the Scandinavian mainland countries imported highly related dogs time and again. This close relationship was not obvious on the standard pedigree forms given out by studbooks, because they indicate only three or at most five generations. This lack of knowledge about true kinship among animals explains the occurrence of one large, highly related cluster. Undetected relatedness is also the cause of the significant difference between cluster analysis based on seven or on all generations (Figures 1 and 2). For several generations, related animals appear unrelated because pedigrees only go back three to five generations. Founders and other ancestors from previous generations might contribute significantly to kinship but are not detected at this level.
Mean kinship per animal was calculated for the current population. Figure 8 shows the all-gen-tree dendrogram (as in Figures 6 and 7) with mean kinships per animal displayed in each cluster. Note that these mean kinships differ from those in Table 1, where mean kinship was calculated within each cluster. The mean kinship of the animals in a cluster decreases as the distance of that cluster to cluster A increases. This means that a conservation strategy based on selecting animals from distant clusters would give results similar to those of a strategy based on selecting animals with a low mean kinship. While selection by optimal contributions is not possible within a multi-breeder population, cluster analysis could help in increasing genetic diversity. Cluster analysis can give individual breeders insight into the population structure, which helps to persuade them to select dogs from distant clusters.
In the populations of other breeds studied by Ubbink et al. [3, 4], specific genetic diseases could be linked with some specific clusters and breeders were advised not to use any dogs from a cluster associated with the disease. Table 1 and Figure 8 show that populations might lose more diversity than breeders would expect when such a decision is based on a cluster analysis performed only with seven generations. This emphasizes the importance of including all generations in kinship calculation, or at least as many generations as possible.
Lacy has recommended maintaining $N_{mk}$ = 20 to guarantee adequate genetic variability. $N_{mk}$ of the Icelandic Sheepdog was only 2.2. Leroy et al. have found higher values ($N_{mk}$ = 5.2 to 25) for nine French dog breeds. However, these results are difficult to compare, since no correction for 'related animals with unknown parents' was implemented: such animals were treated as founders. Głażewska has reported a founder genome equivalent of 1.3 in the Polish Hound, which is comparable with an $N_{mk}$ of 1.3, and concludes that the Polish Hound has a dramatically low level of genetic variability. Overall, it is surprising that, at the time of our study, the Icelandic Sheepdog did not show any genetic disease considering its level of inbreeding. Fortunately, the population size is still increasing, which usually lowers genetic drift.
The overall picture of the Icelandic Sheepdog breed is as follows. The Icelandic Sheepdog breed was built from founders located in remote areas of Iceland between 1955 and 1970. A good part of the diversity was already lost during the first years of the development of the breed. Figure 2 shows that about 16 of the original 26 founder genomes were lost by 1966. In a recent study of a subset of 133 dogs born in Iceland, the average inbreeding coefficient was 0.21, which is in agreement with the average inbreeding found in clusters A, B and C (Table 1). Preferential breeding of a few (and often related) animals led to further reduction of genetic diversity. Thus, the potential diversity of Icelandic Sheepdogs, which was mainly present in animals from Iceland, was not disseminated and in fact decreased even within Iceland. In 1998, $N_{OC}$ was only 4.7 and genetic diversity was less than half of that, at $N_{mk}$ = 2.2. In other words, the current population had a genetic diversity equal to that of 2.2 equally contributing founders with no random loss of founder alleles in descendants. An increase of genetic diversity to $N_{mk}$ = 4.7 is not possible within a few generations in a multi-breeder population like the Icelandic Sheepdog.
Breeding with animals having a low mean kinship is an important conservation method. Cluster analysis is consonant with mean kinship: distant clusters contain animals with a low mean kinship, and potential diversity within clusters is hardly higher than genetic diversity (Table 1), while within the current population as a whole, potential diversity is almost twice the current diversity. Cluster analysis of kinship coefficients based on all generations reveals the population structure and provides better insight into where to find genetic diversity. The all-gen-tree of Figure 9 shows that the genetically important animals are mainly in Iceland, Holland and Germany. Therefore, cluster analysis is especially suitable for exchanging information on genetic diversity in small, closed, pedigreed multi-breeder populations.
Although conservation of genetic diversity by means of optimal contribution selection is unlikely to happen within a multi-breeder population, preservation of potential diversity may be the second best option, when few animals are involved. In the Icelandic Sheepdog, optimal contributions show that the number of individuals with the highest potential genetic diversity equals about 50. It remains to be seen whether it is possible to convince some breeders to use those animals for breeding or for cryo-conservation of semen and oocytes.
This research underlines that dog breeds suffer from genetic drift continuously. Often dog breeding is only authorized with animals meeting specific criteria. These selection criteria, like show-qualifications and health status reports, often strongly limit the number of animals used in breeding. Moreover, certain specific animals are genetically important (see also Table 1), but in practice, these animals are often not used at all because they do not meet the previously mentioned selection criteria. Therefore, selection criteria might unintentionally accelerate loss of genetic and/or potential diversity, which is harmful for populations as a whole.
We thank ISIC for facilitating data connection between Icelandic Sheepdogs among all countries. Furthermore, we would like to thank Geert Ubbink for calculating the cluster analysis for seven generations and for additional advice on this research.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.