Detecting negative selection on recurrent mutations using gene genealogy
1 Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA
2 Present address: Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
3 Present address: Institute of Genomic Microbiology, Heinrich-Heine University Düsseldorf, Universitätsstr. 1, Düsseldorf 40225, Germany
BMC Genetics 2013, 14:37 doi:10.1186/1471-2156-14-37Published: 7 May 2013
Whether or not a mutant allele in a population is under selection is an important issue in population genetics, and various neutrality tests have been invented so far to detect selection. However, detection of negative selection has been notoriously difficult, partly because negatively selected alleles are usually rare in the population and have little impact on either population dynamics or the shape of the gene genealogy. Recently, through studies of genetic disorders and genome-wide analyses, many structural variations were shown to occur recurrently in the population. Such “recurrent mutations” might be revealed as deleterious by exploiting the signal of negative selection in the gene genealogy enhanced by their recurrence.
Motivated by the above idea, we devised two new test statistics. One is the total number of mutants at a recurrently mutating locus among sampled sequences, which is tested conditionally on the number of forward mutations mapped on the sequence genealogy. The other is the size of the most common class of identical-by-descent mutants in the sample, again tested conditionally on the number of forward mutations mapped on the sequence genealogy. To examine the performance of these two tests, we simulated recurrently mutated loci each flanked by sites with neutral single nucleotide polymorphisms (SNPs), with no recombination. Using neutral recurrent mutations as null models, we attempted to detect deleterious recurrent mutations. Our analyses demonstrated high powers of our new tests under constant population size, as well as their moderate power to detect selection in expanding populations. We also devised a new maximum parsimony algorithm that, given the states of the sampled sequences at a recurrently mutating locus and an incompletely resolved genealogy, enumerates mutation histories with a minimum number of mutations while partially resolving genealogical relationships when necessary.
With their considerably high powers to detect negative selection, our new neutrality tests may open new venues for dealing with the population genetics of recurrent mutations as well as help identifying some types of genetic disorders that may have escaped identification by currently existing methods.