Email updates

Keep up to date with the latest news and content from BMC Genomics and BioMed Central.

Open Access Research article

Oligonucleotide array discovery of polymorphisms in cultivated tomato (Solanum lycopersicum L.) reveals patterns of SNP variation associated with breeding

Sung-Chur Sim1, Matthew D Robbins1, Charles Chilcott2, Tong Zhu2 and David M Francis1*

Author Affiliations

1 Department of Horticulture and Crop Science, The Ohio State University, Ohio Agricultural Research and Development Center, 1680 Madison Ave, Wooster, OH 44691, USA

2 Syngenta Biotechnology, Inc, 3054 East Cornwallis Road, Research Triangle Park, NC 27709, USA

For all author emails, please log on.

BMC Genomics 2009, 10:466  doi:10.1186/1471-2164-10-466

Published: 9 October 2009

Abstract

Background

Cultivated tomato (Solanum lycopersicum L.) has narrow genetic diversity that makes it difficult to identify polymorphisms between elite germplasm. We explored array-based single feature polymorphism (SFP) discovery as a high-throughput approach for marker development in cultivated tomato.

Results

Three varieties, FL7600 (fresh-market), OH9242 (processing), and PI114490 (cherry) were used as a source of genomic DNA for hybridization to oligonucleotide arrays. Identification of SFPs was based on outlier detection using regression analysis of normalized hybridization data within a probe set for each gene. A subset of 189 putative SFPs was sequenced for validation. The rate of validation depended on the desired level of significance (α) used to define the confidence interval (CI), and ranged from 76% for polymorphisms identified at α ≤ 10-6 to 60% for those identified at α ≤ 10-2. Validation percentage reached a plateau between α ≤ 10-4 and α ≤ 10-7, but failure to identify known SFPs (Type II error) increased dramatically at α ≤ 10-6. Trough sequence validation, we identified 279 SNPs and 27 InDels in 111 loci. Sixty loci contained ≥ 2 SNPs per locus. We used a subset of validated SNPs for genetic diversity analysis of 92 tomato varieties and accessions. Pairwise estimation of θ (Fst) suggested significant differentiation between collections of fresh-market, processing, vintage, Latin American (landrace), and S. pimpinellifolium accessions. The fresh-market and processing groups displayed high genetic diversity relative to vintage and landrace groups. Furthermore, the patterns of SNP variation indicated that domestication and early breeding practices have led to progressive genetic bottlenecks while modern breeding practices have reintroduced genetic variation into the crop from wild species. Finally, we examined the ratio of non-synonymous (Ka) to synonymous substitutions (Ks) for 20 loci with multiple SNPs (≥ 4 per locus). Six of 20 loci showed ratios of Ka/Ks ≥ 0.9.

Conclusion

Array-based SFP discovery was an efficient method to identify a large number of molecular markers for genetics and breeding in elite tomato germplasm. Patterns of sequence variation across five major tomato groups provided insight into to the effect of human selection on genetic variation.