Data_Sheet_2_Genomic Prediction of Complex Phenotypes Using Genic Similarity Based Relatedness Matrix.pdf

In the last years, a series of methods for genomic prediction (GP) have been established, and the advantages of GP over pedigree best linear unbiased prediction (BLUP) have been reported. However, the majority of previously proposed GP models are purely based on mathematical considerations while seldom take the abundant biological knowledge into account. Prediction ability of those models largely depends on the consistency between the statistical assumptions and the underlying genetic architectures of traits of interest. In this study, gene annotation information was incorporated into GP models by constructing haplotypes with SNPs mapped to genic regions. Haplotype allele similarity between pairs of individuals was measured through different approaches at single gene level and then converted into whole genome level, which was then treated as a special kernel and used in kernel based GP models. Results shown that the gene annotation guided methods gave higher or at least comparable predictive ability in some traits, especially in the Arabidopsis dataset and the rice breeding population. Compared to SNP models and haplotype models without gene annotation, the gene annotation based models improved the predictive ability by 0.56~26.67% in the Arabidopsis and 1.62~16.53% in the rice breeding population, respectively. However, incorporating gene annotation slightly improved the predictive ability for several traits but did not show any extra gain for the rest traits in a chicken population. In conclusion, integrating gene annotation into GP models could be beneficial for some traits, species, and populations compared to SNP models and haplotype models without gene annotation. However, more studies are yet to be conducted to implicitly investigate the characteristics of these gene annotation guided models.