Open Access Methodology article

Haplotype allelic classes for detecting ongoing positive selection

Julie Hussin1,2, Philippe Nadeau1,2, Jean-François Lefebvre2 and Damian Labuda1,2,3*

Author Affiliations

1 Bioinformatics Program, Department of Biochemistry, Université de Montréal, Montréal, Québec, Canada

2 Research Center, Hôpital Sainte-Justine, Montréal, Québec, Canada

3 Department of Pediatrics, Université de Montréal, Montréal, Québec, Canada H3T 1C8

BMC Bioinformatics 2010, 11:65  doi:10.1186/1471-2105-11-65

Published: 28 January 2010

Abstract

Background

Natural selection eliminates detrimental and favors advantageous phenotypes. This process leaves characteristic signatures in underlying genomic segments that can be recognized through deviations in allelic or haplotypic frequency spectra. To provide an identifiable signature of recent positive selection that can be detected by comparison with the background distribution, we introduced a new way of looking at genomic polymorphisms: haplotype allelic classes.

Results

The model combines segregating sites and haplotypic information in order to reveal useful data characteristics. We developed a summary statistic, Svd, to compare the distribution of the haplotypes carrying the selected allele with the distribution of the remaining ones. Coalescence simulations were used to study the distributions under standard population models assuming neutrality, demographic scenarios and selection models. To test in practice how well haplotype allelic classes and the derived statistic capture deviations from neutrality due to positive selection, we analyzed in detail the haplotypic variation at the lactase persistence locus in the three HapMap Phase II populations.
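
As a rough illustration of the comparison described above, the following Python sketch splits a toy set of haplotypes by the allele carried at a focal site and compares the two resulting distributions. It rests on assumptions that the abstract does not state: the allelic class of a haplotype is taken to be the number of sites at which it differs from a reference haplotype, and the spread of each group is summarized by a plain variance difference. The function names, the toy data and the variance gap are hypothetical and are not the Svd statistic itself.

```python
from statistics import pvariance


def hac(haplotype, reference):
    """Assumed definition: number of sites where the haplotype differs from the reference."""
    return sum(1 for allele, ref in zip(haplotype, reference) if allele != ref)


def split_hac_by_focal_allele(haplotypes, reference, focal_site, focal_allele):
    """Return allelic-class values for carriers of the focal allele and for all other haplotypes."""
    carriers, others = [], []
    for h in haplotypes:
        group = carriers if h[focal_site] == focal_allele else others
        group.append(hac(h, reference))
    return carriers, others


# Toy data (hypothetical): six haplotypes over five biallelic sites, coded 0/1.
haplotypes = ["00100", "00100", "00101", "11010", "01011", "10000"]
reference = "00000"

carriers, others = split_hac_by_focal_allele(
    haplotypes, reference, focal_site=2, focal_allele="1"
)

# Illustrative contrast only: a narrower distribution among carriers would be
# consistent with a recent sweep, but this is not the published statistic.
variance_gap = pvariance(others) - pvariance(carriers)
```

In this toy example the carriers of the focal allele cluster in fewer allelic classes than the remaining haplotypes, which is the kind of contrast a statistic comparing the two distributions is meant to capture.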

Conclusions

We showed that the Svd statistic is less sensitive than other tests to confounding factors such as demography or recombination. Our approach succeeds in identifying candidate loci, such as the lactase-persistence locus, as targets of strong positive selection and provides a new tool complementary to other tests to study natural selection in genomic data.