Open Access Highly Accessed Research article

A worldwide correlation of lactase persistence phenotype and genotypes

Yuval Itan1,2*, Bryony L Jones1, Catherine JE Ingram1, Dallas M Swallow1 and Mark G Thomas1,2,3

Author Affiliations

1 Research Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK

2 CoMPLEX (Centre for Mathematics & Physics in the Life Sciences and Experimental Biology), University College London, London WC1E 6BT, UK

3 AHRC Centre for the Evolution of Cultural Diversity, Institute of Archaeology, University College London, 31-34 Gordon Square, London WC1H 0PY, UK

BMC Evolutionary Biology 2010, 10:36  doi:10.1186/1471-2148-10-36

Published: 9 February 2010

Additional files

Additional file 1:

A table of lactase persistence phenotype frequencies. Columns show location (continent, country, longitude and latitude), population group, number of individuals tested, frequency of lactase persistent individuals, LP test method, and the primary source reference. The Americas were excluded from the table because of a paucity of data. Other reasons for excluding data were: recent immigrant populations, children (under 12 years old), or biased selection of individuals (such as individuals reported as lactase non-persistent, or related individuals). Where only the country name was available, the location was set to the capital city or the estimated central point of the country.

Format: XLS Size: 82KB

Additional file 2:

A table of lactase persistence-associated allele frequencies. Columns show location (continent, country, longitude and latitude), population group, number of individuals tested, frequency of the -13,910*T, -13,907*G, -13,915*G and -14,010*C LP-associated alleles, the sum of all LP-associated alleles, predicted lactase persistence frequency, and the primary literature or own-data source. Data were taken from SNP typing tests (where only -13,910*T is shown) or from resequencing. The Americas were excluded from the table because of a paucity of data. The predicted lactase persistence frequency was calculated by assuming Hardy-Weinberg equilibrium and dominance, using the sum of all available LP-associated allele frequencies at a specific location. Where only the country name was available, the location was set to the capital city or the estimated central point of the country. Note that the collection location for the Indian and North Indian genotype data was Singapore. As an exception, we placed these data at the location of the ancestral population because of the lack of genetic data from India.

Format: XLS Size: 89KB
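
The predicted lactase persistence frequency described for Additional file 2 can be illustrated with a short sketch. It assumes the usual reading of "Hardy-Weinberg equilibrium and dominance": if p is the summed frequency of the LP-associated alleles, every carrier of at least one such allele is lactase persistent, so the predicted frequency is p^2 + 2p(1 - p) = 1 - (1 - p)^2. The function name and the input values are hypothetical and are not taken from the table.

```python
def predicted_lp_frequency(allele_frequencies):
    """Predict the LP phenotype frequency from LP-associated allele frequencies.

    Assumes Hardy-Weinberg equilibrium and full dominance: anyone carrying
    at least one LP-associated allele is counted as lactase persistent.
    """
    p = sum(allele_frequencies)  # combined frequency of all LP-associated alleles
    if not 0.0 <= p <= 1.0:
        raise ValueError("summed allele frequency must lie between 0 and 1")
    q = 1.0 - p  # frequency of the non-persistence allele(s)
    # Only q*q of the population carries no LP-associated allele.
    return 1.0 - q * q


# Illustrative values only (not from Additional file 2):
# four allele frequencies in the order -13,910*T, -13,907*G, -13,915*G, -14,010*C.
example = predicted_lp_frequency([0.10, 0.05, 0.00, 0.05])
print(f"Predicted LP frequency: {example:.3f}")
```

Where only -13,910*T was typed, the same function would be called with a single-element list.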

Additional file 3:

A map of the density of sample sites for phenotypic data.

Format: PDF Size: 646KB

Additional file 4:

A map of the density of sample sites where -13,910*T allele data are available.

Format: PDF Size: 667KB

Additional file 5:

A map of the density of sample sites where data for all 4 LP-associated alleles are available.

Format: PDF Size: 641KB

Additional file 6:

Africa and Middle East LP genotype-phenotype correlation, obtained by calculating the quantitative difference between the observed phenotype frequency and the predicted phenotype frequency, using only locations where fully sequenced data for all 4 LP-associated alleles were available. Positive and negative values represent cases where the LP-correlated genotype under- and over-predicts the LP phenotype, respectively. Dots represent LP phenotype collection locations; crosses represent data collection locations for all 4 currently known LP-correlated alleles. The colour key shows the predicted LP phenotype frequencies (Figure 4) subtracted from the observed LP phenotype frequencies (Figure 1). The Asia-Pacific data were not analysed because 4-allele data in these regions are very sparse; fully sequenced data for western and northern Europe are also sparse.

Format: PDF Size: 3.9MB
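
The sign convention used in these difference maps (observed minus predicted) can be sketched for a single location as follows. This is only the pointwise subtraction; the maps themselves are surfaces over many locations, and how those surfaces were built is not described here. The function name and the input values are hypothetical.

```python
def prediction_gap(observed, predicted):
    """Return (observed - predicted) and a label following the map convention.

    Positive: the genotype under-predicts the LP phenotype.
    Negative: the genotype over-predicts the LP phenotype.
    """
    diff = observed - predicted
    if diff > 0:
        label = "under-predicted"
    elif diff < 0:
        label = "over-predicted"
    else:
        label = "matched"
    return diff, label


# Illustrative values only: observed LP frequency 0.80, genotype-predicted 0.50.
gap, label = prediction_gap(0.80, 0.50)
print(f"{gap:+.2f} ({label})")
```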

Additional file 7:

Old World LP genotype-phenotype correlation, obtained by calculating the quantitative difference between the observed phenotype frequency and the phenotype frequency predicted from -13,910*T allele data only. Positive and negative values represent cases where the LP-correlated genotype under- and over-predicts the LP phenotype, respectively. Dots represent LP phenotype collection locations, crosses represent data collection locations for the -13,910*T allele obtained from fully sequenced data, and diamonds represent locations where only -13,910 C>T data were collected. The colour key shows the LP phenotype frequencies predicted from -13,910*T allele data only, subtracted from the observed LP phenotype frequencies.

Format: PDF Size: 1.4MB

Additional file 8:

The difference between the maps of Additional files 6 and 7, demonstrating the additional knowledge gained from the 3 LP-associated alleles other than -13,910*T. The Asia-Pacific data were not analysed because 4-allele data in these regions are very sparse.

Format: PDF Size: 3MB