Open Access Highly Accessed Research article

Divorcing the Late Upper Palaeolithic demographic histories of mtDNA haplogroups M1 and U6 in Africa

Erwan Pennarun1*, Toomas Kivisild1,2, Ene Metspalu1, Mait Metspalu1, Tuuli Reisberg1, Jean-Paul Moisan3, Doron M Behar1,3, Sacha C Jones4 and Richard Villems1,5

Author Affiliations

1 Estonian Biocentre and Department of Evolutionary Biology, University of Tartu, Tartu, Estonia

2 Division of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom

3 Molecular Medicine Laboratory, Rambam Health Care Campus, Haifa, Israel

4 McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, United Kingdom

5 Estonian Academy of Sciences, Tallinn, Estonia


BMC Evolutionary Biology 2012, 12:234  doi:10.1186/1471-2148-12-234

Published: 3 December 2012

Additional files

Additional file 1:

– Coalescent age estimates for M1 and U6 and their most frequent sub-clades. Soares(a): These estimates include some incomplete sequences and are given for indication only; see the left panel for estimates based only on complete sequences.

Format: XLSX; Size: 16KB

Open Data

Additional file 2:

– Phylogenetic tree based on 105 M1 full sequences. All positions are scored against the RSRS [58] and are transitions, unless followed by a capital letter that marks the resulting transversion. Indels are scored with i or d, heteroplasmies follow the IUB code, reversals to the ancestral state are marked by an exclamation mark (!), and double back mutations by two exclamation marks (!!). The positions are colour coded according to their status: purple – non-coding; blue – non-synonymous; and black – synonymous. Variations in the C tracts (e.g., 16182C, 16193C, 309+2C) were mostly ignored unless stated on the tree. The box containing the sample ID is colour coded according to the publication from which the sequence was retrieved (see the main text for the full references), and below it the geographic origin is colour coded (see Additional file 7 for the specifics). Sequences available only for the coding region, or for which some parts are missing, are flagged with a yellow mark under the geographic origin. The order of the root mutation(s) for M1a1g, M1a1h, M1a7 and M1b2c was deduced from additional partial sequencing (see Additional file 3).
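The scoring convention described above can be illustrated with a small classifier. This is only a sketch under stated assumptions: the exact label syntax used in the spreadsheets is not given here, so the grammar below (a position, an optional letter suffix, then optional exclamation marks) and the example labels are hypothetical, and C-tract labels such as 309+2C are deliberately not handled.

```python
import re
from typing import NamedTuple

# Assumed (hypothetical) label grammar: position, optional letters, optional "!" marks.
LABEL = re.compile(r"^(\d+)([A-Za-z]*)(!*)$")
BASES = {"A", "C", "G", "T"}
# IUB ambiguity codes, used here to recognise heteroplasmies.
IUB_AMBIGUITY = set("RYKMSWBDHVN")


class Mutation(NamedTuple):
    position: int
    kind: str            # transition, transversion, insertion, deletion, heteroplasmy
    back_mutations: int  # 0 = none, 1 = "!", 2 = "!!"


def classify(label: str) -> Mutation:
    """Classify one mutation label under the assumed grammar."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unsupported label: {label!r}")
    position, suffix, bangs = match.groups()
    if suffix == "":
        kind = "transition"       # bare position: transition by convention
    elif suffix[0] == "i":
        kind = "insertion"        # lowercase i marks an insertion
    elif suffix[0] == "d":
        kind = "deletion"         # lowercase d marks a deletion
    elif suffix in BASES:
        kind = "transversion"     # capital base letter: resulting transversion
    elif suffix in IUB_AMBIGUITY:
        kind = "heteroplasmy"     # IUB ambiguity code
    else:
        raise ValueError(f"unsupported suffix in label: {label!r}")
    return Mutation(int(position), kind, len(bangs))


if __name__ == "__main__":
    # Hypothetical labels, chosen only to exercise each branch.
    for example in ["16129", "16318T", "195!", "16311!!", "523d", "16093Y"]:
        print(example, classify(example))
```

Separating lowercase indel markers from uppercase base and ambiguity codes keeps "d" (deletion) distinct from the IUB code "D"; a real parser would need to be checked against the actual labels in the additional files.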

Format: XLSX; Size: 41KB

Additional file 3:

– Network based on 236 M1 samples. All positions are scored against the RSRS [58] and are transitions, unless followed by a capital letter that marks the resulting transversion. Indels are scored with i or d, heteroplasmies follow the IUB code, reversals to the ancestral state are marked by an exclamation mark (!), and double back mutations by two exclamation marks (!!). The positions are colour coded according to their status: purple – non-coding; blue – non-synonymous; and black – synonymous. Variations in the C tracts (e.g., 16182C, 16193C, 309+2C) were ignored. The box containing the sample ID is colour coded according to the publication from which the sequence was retrieved (see the main text for the full references), and below it the geographic origin is colour coded (see Additional file 7 for the specifics).

Format: XLSX; Size: 49KB

Additional file 4:

– Phylogenetic tree based on 139 U6 full sequences. All positions are scored against the RSRS [58] and are transitions, unless followed by a capital letter that marks the resulting transversion. Indels are scored with i or d, heteroplasmies follow the IUB code, and reversals to the ancestral state are marked by an exclamation mark (!). The positions are colour coded according to their status: purple – non-coding; blue – non-synonymous; and black – synonymous. Variations in the C tracts (e.g., 16182C, 16193C, 309+2C) were mostly ignored unless stated on the tree. The box containing the sample ID is colour coded according to the publication from which the sequence was retrieved (see the main text for the full references), and below it the geographic origin is colour coded (see Additional file 7 for the specifics). Sequences available only for the coding region are flagged with a yellow mark under the geographic origin. The potential reticulation created by position 150 between sub-clades U6a3a and U6a3c was resolved on the basis of the more frequent occurrence of 150 in various haplogroup backgrounds (see [72]). We refine here the phylogeny of the Canary-specific branch formerly known as U6b1 [27,29]. A string of two shared mutations precedes the split of the branch into the so-called Canary-specific branch and one apparently specific to Northwest Africa. We therefore propose to rename U6b1a as U6b1a1 to comply with the revised phylogeny. The mutation order of some clades (U6a1a1b, U6a1a2, U6a2b, U6a2b1, U6a3d, U6a3d1, U6a3d1a, U6a6a, U6b1b1) was deduced from additional partial typing (see Additional file 5).

Format: XLSX; Size: 64KB

Additional file 5:

– Network based on 230 U6 samples. All positions are scored against the RSRS [58] and are transitions, unless followed by a capital letter that marks the resulting transversion. Indels are scored with i or d, heteroplasmies follow the IUB code, and reversals to the ancestral state are marked by an exclamation mark (!). The positions are colour coded according to their status: purple – non-coding; blue – non-synonymous; and black – synonymous. Variations in the C tracts (e.g., 16182C, 16193C, 309+2C) were ignored. The box containing the sample ID is colour coded according to the publication from which the sequence was retrieved (see the main text for the full references), and below it the geographic origin is colour coded (see Additional file 7 for the specifics). The reticulation created by position 150 in the U6a3 clades is left unresolved.

Format: XLSX; Size: 63KB

Additional file 6:

– BSP for U6 based on North African and European sequences separately. For the North African and European sequences, only a few independent runs were done to ascertain that convergence was reached. The 10 convergence runs for all sets of sequences are shown for comparison.

Format: PDF; Size: 665KB

This file can be viewed with: Adobe Acrobat Reader


Additional file 7:

– List of the 466 samples. GeoBroad abbreviations are as follows: WA: West Africa; EA: East Africa; EUR: Europe; NE: Near/Middle East; NWA: North-West Africa; NEA: North-East Africa. See the main text for the full references. Of the 4 samples originally provided by Family Tree DNA, 3 have an identical sequence that matches an entry in GenBank; as they cannot be differentiated, they have not been separately deposited into GenBank. For the last sample, there are two entries in GenBank with an identical sequence, and thus that sample has likewise not been deposited into GenBank.

Format: XLSX; Size: 38KB

Additional file 8:

– Genotyping information for 153 M1 samples.

Format: XLSX; Size: 69KB

Additional file 9:

– Genotyping information for 121 U6 samples.

Format: XLSX; Size: 90KB

Additional file 10:

– 10 independent BSP runs for M1 with 20 groups. All runs were performed using the same parameters.

Format: PDF; Size: 894KB

Additional file 11:

– 10 independent BSP runs for U6 with 20 groups. All runs were performed using the same parameters.

Format: PDF; Size: 893KB

Additional file 12:

– BSP for M1 with the number of groups varying from 5 to 50, in increments of 5.

Format: PDF; Size: 445KB

Additional file 13:

– BSP for U6 with the number of groups varying from 5 to 50, in increments of 5.

Format: PDF; Size: 459KB

Additional file 14:

– BSP for M1 with the corrected versus the uncorrected rate. The uncorrected analysis uses a rate of 1.695 × 10⁻⁸ [34]; the corrected rate was derived from the rho values deduced from the time, using the calculator from [34].

Format: PDF; Size: 350KB

Additional file 15:

– BSP for U6 with the corrected versus the uncorrected rate. The uncorrected analysis uses a rate of 1.695 × 10⁻⁸ [34]; the corrected rate was derived from the rho values deduced from the time, using the calculator from [34].

Format: PDF; Size: 347KB