Selective constraint, background selection, and mutation accumulation variability within and between human populations
1 Sainte Justine Research Centre, Department of Pediatrics, University of Montreal, 3175 Chemin de la Cote-Sainte-Catherine, Montreal H3T 1C5, Canada
2 Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, 1700 4th Street, San Francisco, San Francisco, CA 94158, USA
BMC Genomics 2013, 14:495 doi:10.1186/1471-2164-14-495Published: 23 July 2013
Regions of the genome that are under evolutionary constraint across multiple species have previously been used to identify functional sequences in the human genome. Furthermore, it is known that there is an inverse relationship between evolutionary constraint and the allele frequency of a mutation segregating in human populations, implying a direct relationship between interspecies divergence and fitness in humans. Here we utilise this relationship to test differences in the accumulation of putatively deleterious mutations both between populations and on the individual level.
Using whole genome and exome sequencing data from Phase 1 of the 1000 Genome Project for 1,092 individuals from 14 worldwide populations we show that minor allele frequency (MAF) varies as a function of constraint around both coding regions and non-coding sites genome-wide, implying that negative, rather than positive, selection primarily drives the distribution of alleles among individuals via background selection. We find a strong relationship between effective population size and the depth of depression in MAF around the most conserved genes, suggesting that populations with smaller effective size are carrying more deleterious mutations, which also translates into higher genetic load when considering the number of putatively deleterious alleles segregating within each population. Finally, given the extreme richness of the data, we are now able to classify individual genomes by the accumulation of mutations at functional sites using high coverage 1000 Genomes data. Using this approach we detect differences between ‘healthy’ individuals within populations for the distributions of putatively deleterious rare alleles they are carrying.
These findings demonstrate the extent of background selection in the human genome and highlight the role of population history in shaping patterns of diversity between human individuals. Furthermore, we provide a framework for the utility of personal genomic data for the study of genetic fitness and diseases.