From: Predicting haplotype carriers from SNP genotypes in Bos taurus through linear discriminant analysis
Step | Action |
---|---|
1 | foreach proportion P and SNP-chip C (7K and 54K) do |
2 | for n = 1, …, 100 do |
3 | randomly split the data into 10 subsets of roughly equal size, S = {1, …, 10} [10-fold cross-validation] |
4 | for k = 1, …, 10 do |
5 | use the subsets s ∈ S with s ≠ k to train the model and subset k for validation; |
6 | in the training set: |
7 | - delete monomorphic and collinear SNPs; |
7 | - select the best combinations of SNPs using BSS until p⊂P SNPs are left; |
7 | - use BSS-selected SNPs to classify haplotype carriers with LDA; |
7 | - save SNP discriminant coefficients; |
7 | - compute the average training error rate; |
8 | in the validation set: |
9 | - use BSS-selected SNPs and their discriminant coefficients to classify haplotype carriers; |
9 | - compute the average validation error rate; |
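
The inner loops of the procedure above (steps 2 to 9) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. All function names (`make_folds`, `informative_snps`, `backward_select`, `fit_centroids`, `error_rate`, `repeated_cv`) are hypothetical. BSS is assumed here to be a backward stepwise elimination driven by training error. A nearest-centroid rule stands in for LDA so that the example needs only the standard library. The collinearity filter only catches exactly duplicated SNP columns. The parameter `n_keep` plays the role of the number of SNPs left after selection, and the outer loop over proportions P and SNP chips C is left to the caller.

```python
import random

def make_folds(n_samples, k=10, rng=random):
    """Step 3: randomly split sample indices into k subsets of roughly equal size."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def informative_snps(X, rows):
    """Step 7a: drop monomorphic SNPs and exact duplicate columns (simplified collinearity)."""
    keep, seen = [], set()
    for j in range(len(X[0])):
        col = tuple(X[i][j] for i in rows)
        if len(set(col)) < 2 or col in seen:
            continue
        seen.add(col)
        keep.append(j)
    return keep

def fit_centroids(X, y, rows, snps):
    """Stand-in for LDA (assumption): per-class mean genotype over the selected SNPs."""
    model = {}
    for c in sorted(set(y[i] for i in rows)):
        members = [i for i in rows if y[i] == c]
        model[c] = [sum(X[i][j] for i in members) / len(members) for j in snps]
    return model

def predict(model, X, i, snps):
    """Assign sample i to the class with the nearest centroid."""
    def dist(c):
        return sum((X[i][j] - m) ** 2 for j, m in zip(snps, model[c]))
    return min(sorted(model), key=dist)

def error_rate(model, X, y, rows, snps):
    """Fraction of samples in `rows` that are misclassified."""
    return sum(predict(model, X, i, snps) != y[i] for i in rows) / len(rows)

def backward_select(X, y, rows, snps, n_keep):
    """Assumed BSS: repeatedly drop the SNP whose removal gives the lowest training error."""
    snps = list(snps)
    while len(snps) > n_keep:
        def err_without(j):
            rest = [s for s in snps if s != j]
            return error_rate(fit_centroids(X, y, rows, rest), X, y, rows, rest)
        snps.remove(min(snps, key=err_without))
    return snps

def repeated_cv(X, y, n_keep, n_reps=100, k=10, seed=0):
    """Steps 2 to 9: repeated k-fold CV with SNP filtering and selection inside each fold."""
    rng = random.Random(seed)
    train_err, valid_err = [], []
    for _ in range(n_reps):
        folds = make_folds(len(X), k, rng)
        for f in range(k):
            valid = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            # Selection uses the training set only, so validation stays untouched.
            snps = backward_select(X, y, train, informative_snps(X, train), n_keep)
            model = fit_centroids(X, y, train, snps)
            train_err.append(error_rate(model, X, y, train, snps))
            valid_err.append(error_rate(model, X, y, valid, snps))
    return sum(train_err) / len(train_err), sum(valid_err) / len(valid_err)
```

Calling `repeated_cv(X, y, n_keep=5)` on a genotype matrix `X` (a list of rows coded 0/1/2) and carrier labels `y` returns the average training and validation error rates. The design point the sketch is meant to show is that filtering and SNP selection happen inside each fold, on the training subsets only, which keeps the validation error an honest estimate.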