As stated in the introduction, in genomic breeding value estimation we are faced with the problem of estimating many effects from a limited number of observations, and, additionally, many effects show collinearities due to the LD between the SNPs. The BLUP model overcomes these problems by treating the predictors as random variables and estimating them simultaneously. In the nonparametric kernel regressions (ELM and ULM), the numerous effects are estimable by smoothing the phenotypes against one predictor at a time, assuming that the effects of the remaining are removed from the phenotypes. Of course, the true effects of the remaining predictors are unknown and have to be estimated themselves, resulting in the iterative backfitting algorithm [5]. Nuisance factors can be included in the algorithm and can be estimated parametrically using least squares. The model is then semiparametric and the backfitting algorithm iterates between the parametric (*i.e*. estimating the effects of the nuisance factors by least squares) and the nonparametric part (*i.e*. estimating the SNP function values by the Nadaraya-Watson regression), without changing the general structure of the algorithm [5].

Using kernel regression, the choice of the appropriate degree of smoothing is important, which depends on the sample size. Naturally, if the sample size grows to infinity, smoothing is almost not required [7] and hence *λ* should be close to 1. However, sample size is never infinite, and, therefore, *λ* has to be chosen carefully, taking the sample size into account. Indeed, in ELM the optimal *λ* for a marker density of 1 cM, a heritability of 0.5 and applying the allelic model is 0.74 (Figure 1a). If the size of the data set would only be 500, the optimal *λ* would be 0.65 (not shown elsewhere). The applied bootstrap strategy takes the sample size into account, because the estimation set is of equal size as the full data set. In ELM the *λ* determined by bootstrapping was very close to the optimal *λ*. This can be seen by comparing the results reported in Table 2 for the ELM with the maximum achievable accuracies shown in Figures 1 and 2. Alternatively, leave-one-out cross validation is suggested [13, 7]. Using this method, for a given *λ*, the functions are fitted using all but one observation and then the prediction error of this observation is calculated given the fitted functions. This is repeated for all observations. The *λ*, which produces the lowest average prediction error, is chosen to be the optimal *λ*. However, this strategy would require running *n* times the analysis, which would computationally be too demanding in the present data sets. The bootstrap as applied in this study is related to this cross-validation strategy, see [13] for a detailed discussion.

When nuisance factors are included in the model and the number of data points in some classes is very low, it might happen that in some bootstrap samples these effects are not estimable or estimated poorly. One obvious solution is to use only those bootstrap samples where the number of data points in each class is above a defined threshold. Since it is assumed that the nuisance effects and the SNP effects are independent, this would not affect the results regarding the choice of the appropriate *λ*.

From Figures 1 and 2 it can be seen that the regression coefficient was on average highest when the degree of smoothing was at maximum and decreased monotonically with a decrease of the degree of smoothing (higher *λ*), as expected. The crossing point of the regression plots with one (*i.e*. the unbiased estimation point) shown in the bottom of these figures coincided with the maximum accuracy (top of the figures). The plot of the accuracy against *λ* did not show a pronounced maximum. Hence, ELM was not very sensitive with regard to the choice of *λ*. The optimal *λ* depended on the marker density. With increasing density, more smoothing (*i.e*. a lower *λ*) was required. This is because the QTL effects are represented by all SNPs that are in LD with it. With an increasing number of SNP being in LD with the QTL, each SNP captures a smaller part of the QTL effect, and hence, requires more smoothing. Naturally, the number of SNP in LD with the QTL is higher in high marker density situations. Additionally, with increasing number of SNP, more SNP show by chance spurious effects, and hence, more smoothing is required to minimise the impact of these spurious effects. In this study the markers were equally distributed across the chromosomes. In practise it might happen that this is not the case and some QTL are in LD with many markers (requires more smoothing) whereas others only with few markers (requires less smoothing). It can be assumed that ULM might cope with unequal marker densities better than ELM and BLUP, because of the group-wise specific *λ* estimation.

The results from the allelic BLUP and the allelic ELM are very similar (Tables 1 and 2). This might be intuitively surprising, because of the different assumptions underlying these models. However, we compared both models formally and found close similarities between them, leading to the similar results. For details see Appendix 2. BLUP needs estimates of variance components whereas ELM needs a *λ*. For additive genetic variances reliable estimates of variance components are usual available, *e.g*. from REML analysis. However, this is in general not the case for nonadditive genetic variance components like dominance or epistasis. As reviewed by Thaller *et al*. [18], dominance QTL effects are not negligible. The nonparametric regression models allow the inclusion of dominance effects without having knowledge of the dominance variance component. A simulation study could show the benefit of taking dominance into account. However, for a realistic simulation knowledge of the distribution of QTL dominance effects is needed. This is largely unknown up to now. More research is needed in this field.

Meuwissen *et al*. [1] stated that the main disadvantage of BLUP is the assumption that every predictor is associated with the same genetic variance leading to a too strong regression of large QTL, which limits the accuracies of the *EBVs*. The same holds true for ELM, where the degree of smoothing is too strong for predictors linked to large QTL. ULM overcomes the problem of too strong smoothing of predictors with large QTL by building groups of *m* predictors showing similar variance of their function values and determining different *λ* for each group. Hence it is assumed that predictors that show a large variance are linked to large QTL. Indeed, in ULM the amount of smoothing is substantially reduced for many predictors (Tables 3 and 4), resulting in the higher accuracies of the *EBVs* estimated by ULM (Tables 1 and 2). The standard deviations in Tables 3 and 4 are high for *λ* > 0.7. This might be due to the difficulty in finding the optimal *λ* and additionally due to the unequal distribution of the simulated QTL effects. As described above, these followed gamma distribution with a high density for small and a low density for large effects [15]. Hence, some replicates might show several big QTL resulting in more predictors with a large *λ* whereas other replicates might show only small or medium sized QTL and the number of predictors with a *λ* close to one is small in these replicates as well.

In ULM a critical question is how large the group size (*m*) should be. If *m* is too small (*e.g. m* = 1 or 2) then only those predictors which are linked to very large QTL would receive a *λ* above that determined by ELM, because only these might be able to decrease the *aveRSS* during the grid search of *λ*. In contrast, if *m* is too large (*e.g. m* = 100 or 200), then many predictors containing only small QTL would receive a too large *λ*, because they are in a group with predictors with large QTL. Both situations would result in less accurate estimates. It seems that the group size chosen in this study (*m* in between 5 and 20, depending on the marker density) is an appropriate choice. The algorithm defining the group-wise *λ* was stopped when all predictors have passed it one time (see end of section 2.2). Alternatively the algorithm could have been repeated several times with updated *λ* and stopped when the *λ* did not change anymore, which would be, however, computationally very demanding.

It may be possible to estimate *λ* by the use of a prior distribution in ULM. One possibility for such a procedure would be to sample *λ* from a mixture of two distributions, one for predictors in LD with a QTL and the second component of the mixture for predictors not associated with a QTL. The latter distribution would put significantly more, if not all, probability mass at *λ* equal to 0.5 (smoothing is at maximum), whereas the first one would support less smoothing. However, as the models were implemented in this study, they do not use any prior information, in contrast to BayesB of Meuwissen *et al*. [1]. A comparison of the results presented in Table 2 with those of Solberg *et al*. [2], who simulated the same genetic configuration but applied BayesB, suggests that the accuracy of ULM is lower compared to the accuracies of BayesB in the allelic case.