Uncertainty about the linkage phase of multiple single nucleotide polymorphisms (SNPs) in heterozygous diploids makes it difficult to identify the specific DNA sequence variants that encode a complex trait. A statistical technique implemented with the EM algorithm has been developed to infer the effects of SNP haplotypes from genotypic data by assuming that one haplotype (called the risk haplotype) performs differently from the rest (called the non-risk haplotypes). This assumption simplifies the definition and estimation of the genotypic values of diplotypes for a complex trait, but it reduces the power to detect the risk haplotype when the non-risk haplotypes are substantially diverse. In this article, we incorporate general quantitative genetic theory to specify how different haplotypes differ in their genetic control of a complex trait. A model selection procedure is used to determine the optimal number and combination of risk haplotypes, thus providing a precise and powerful test of genetic determination in association studies. Our method is derived from maximum likelihood theory and has been shown through simulation studies to be powerful for characterizing the genetic architecture of complex quantitative traits.
Keywords: Complex trait, diplotype, haplotype, quantitative genetics, quantitative trait nucleotides
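
To make the phase-uncertainty problem concrete, the following is a minimal sketch of the classical EM estimate of haplotype frequencies for two biallelic SNPs. It is not the method developed in this article. It assumes Hardy-Weinberg equilibrium, codes each genotype as the number of "1" alleles at each SNP, and uses hypothetical names (`HAPS`, `FIXED`, `em_haplotype_freqs`) chosen only for illustration. Only the double heterozygote is phase-ambiguous, so the E-step splits those individuals between the diplotypes 11/00 and 10/01.

```python
# Illustrative sketch only: EM for two-SNP haplotype frequencies under
# Hardy-Weinberg equilibrium. All names here are hypothetical.

HAPS = ("11", "10", "01", "00")

# Haplotype copies carried by each phase-unambiguous genotype.
# Genotype key = (count of allele 1 at SNP 1, count of allele 1 at SNP 2).
FIXED = {
    (2, 2): {"11": 2},
    (2, 1): {"11": 1, "10": 1},
    (2, 0): {"10": 2},
    (1, 2): {"11": 1, "01": 1},
    (1, 0): {"10": 1, "00": 1},
    (0, 2): {"01": 2},
    (0, 1): {"01": 1, "00": 1},
    (0, 0): {"00": 2},
}


def em_haplotype_freqs(counts, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from genotype counts.

    counts maps a genotype key (g1, g2) to the number of individuals.
    Returns a dict mapping each haplotype in HAPS to its frequency.
    """
    n = sum(counts.values())
    p = {h: 0.25 for h in HAPS}  # uniform starting values
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 11/00
        # rather than 10/01, given the current frequencies.
        a = p["11"] * p["00"]
        b = p["10"] * p["01"]
        phi = a / (a + b) if (a + b) > 0 else 0.5

        expected = dict.fromkeys(HAPS, 0.0)
        for genotype, c in counts.items():
            if genotype == (1, 1):
                expected["11"] += c * phi
                expected["00"] += c * phi
                expected["10"] += c * (1.0 - phi)
                expected["01"] += c * (1.0 - phi)
            else:
                for h, k in FIXED[genotype].items():
                    expected[h] += c * k

        # M-step: expected haplotype counts over 2n chromosomes.
        new_p = {h: expected[h] / (2.0 * n) for h in HAPS}
        delta = max(abs(new_p[h] - p[h]) for h in HAPS)
        p = new_p
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only.
    sample = {(2, 2): 5, (1, 1): 10, (0, 0): 5}
    print(em_haplotype_freqs(sample))
```

In a haplotype-effect analysis, frequencies of this kind would typically enter a mixture likelihood alongside the genotypic values of the diplotypes. That coupling is omitted here to keep the sketch short.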