Research article

Molecular diversity, population structure, and linkage disequilibrium in a worldwide collection of tobacco (Nicotiana tabacum L.) germplasm

Agostino Fricano1,3, Nicolas Bakaher2, Marcello Del Corvo1, Pietro Piffanelli1, Paolo Donini2, Alessandra Stella1, Nikolai V Ivanov2* and Carlo Pozzi2,4

Author Affiliations

1 Parco Tecnologico Padano, via Einstein, Loc. Codazza, 26900 Lodi, Italy

2 Philip Morris International R&D, Philip Morris Products SA, Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland

3 Bayer CropScience, Technologiepark 38, 9052 Zwijnaarde, Belgium

4 Fondazione Edmund Mach, 38010 San Michele all'Adige, TN, Italy

BMC Genetics 2012, 13:18  doi:10.1186/1471-2156-13-18

Published: 21 March 2012



The goals of our study were to assess the phylogeny and population structure of tobacco accessions representing a wide range of genetic diversity; to identify a subset of accessions as a core collection capturing most of the existing genetic diversity; and to estimate, in the tobacco core collection, the extent of linkage disequilibrium (LD) in seven genomic regions using simple sequence repeat (SSR) markers. To this end, a collection of accessions was genotyped with SSR markers, molecular diversity was evaluated, and LD was analyzed across seven regions of the genome.
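The core-collection goal stated above can be illustrated with a small sketch. The abstract does not say how the subset was chosen, so the greedy allele-coverage heuristic below is an assumption used only for illustration, not the authors' method; the function name `greedy_core` and the toy accessions and allele sizes are hypothetical.

```python
def greedy_core(genotypes):
    """Pick accessions until every observed (locus, allele) pair is covered.

    genotypes: dict mapping accession name -> set of (locus, allele) tuples.
    Illustrative greedy heuristic only; not the procedure used in the study.
    """
    # All alleles observed anywhere in the collection.
    remaining = set().union(*genotypes.values())
    core = []
    while remaining:
        # Take the accession that adds the most still-uncovered alleles.
        # Sorting the names first makes tie-breaking deterministic.
        best = max(sorted(genotypes),
                   key=lambda acc: len(genotypes[acc] & remaining))
        core.append(best)
        remaining -= genotypes[best]
    return core


# Hypothetical toy data: two SSR loci, allele sizes in base pairs.
toy = {
    "acc1": {("ssr1", 150), ("ssr2", 200)},
    "acc2": {("ssr1", 150), ("ssr2", 204)},
    "acc3": {("ssr1", 154), ("ssr2", 200)},
    "acc4": {("ssr1", 150), ("ssr2", 200)},
}
core = greedy_core(toy)
```

In this toy example `acc4` carries no allele that the other accessions lack, so it is left out of the core set. A greedy pass of this kind does not guarantee the smallest possible subset.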


A collection of 312 tobacco accessions was genotyped with 49 SSR markers. Principal Coordinate Analysis (PCoA) and Bayesian cluster analysis revealed that the tobacco population is structured by commercial class; six main clades were identified, corresponding to the "Oriental", "Flue-Cured", "Burley", "Dark", "Primitive", and "Other" classes. Pairwise kinship was calculated between accessions, and an overall low level of co-ancestry was observed. A set of 89 genotypes was identified that captured all of the genetic diversity detected at the 49 loci. LD was evaluated on these genotypes using 422 SSR markers mapping to seven linkage groups. LD was estimated as the squared correlation of allele frequencies (r²). The pattern of intrachromosomal LD revealed that LD in tobacco extends up to distances as great as 75 cM with r² > 0.05, or up to 1 cM with r² > 0.2. The pattern of LD was clearly dependent on population structure.
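As a worked illustration of the r² statistic mentioned above, the sketch below computes the squared allele correlation for a pair of loci. It makes simplifying assumptions that the abstract does not state: each locus is reduced to a presence/absence (0/1) coding of one allele, and each accession contributes a single haplotype, as for a fully homozygous line. SSR loci are usually multiallelic, and how the study handled that is not described here. The function name `r_squared` is hypothetical.

```python
import math


def r_squared(locus_a, locus_b):
    """Squared allele correlation between two loci coded 0/1 per accession.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), where D = pAB - pA * pB.
    Assumes one haplotype per accession (illustrative simplification).
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("loci must be non-empty and of equal length")
    n = len(locus_a)
    p_a = sum(locus_a) / n                                  # allele frequency at locus A
    p_b = sum(locus_b) / n                                  # allele frequency at locus B
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # joint frequency
    d = p_ab - p_a * p_b                                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan                                     # undefined for a monomorphic locus
    return d * d / denom


# Hypothetical toy haplotypes for four accessions.
complete_ld = r_squared([1, 1, 0, 0], [1, 1, 0, 0])
no_ld = r_squared([1, 1, 0, 0], [1, 0, 1, 0])
```

Two loci whose alleles always co-occur give r² = 1, and loci whose alleles are statistically independent give r² = 0. The thresholds of 0.05 and 0.2 quoted above lie between these extremes.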


The global tobacco population is highly structured, and clustering groups together accessions of the same market class. LD in tobacco extends up to 75 cM and is strongly dependent on population structure.