Research article

Comparative BAC end sequence analysis of tomato and potato reveals overrepresentation of specific gene families in potato

Erwin Datema1,2, Lukas A Mueller3, Robert Buels3, James J Giovannoni4, Richard GF Visser5, Willem J Stiekema2,6 and Roeland CHJ van Ham1,2*

Author Affiliations

1 Applied Bioinformatics, Plant Research International, PO Box 16, 6700 AA, Wageningen, The Netherlands

2 Laboratory of Bioinformatics, Wageningen University, Transitorium, Dreijenlaan 3, 6703 HA Wageningen, The Netherlands

3 Department of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA

4 United States Department of Agriculture and Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, New York 14853, USA

5 Laboratory of Plant Breeding, Wageningen University, P.O. Box 386, 6700 AJ Wageningen, The Netherlands

6 Centre for BioSystems Genomics (CBSG), PO Box 98, 6700 AB Wageningen, The Netherlands


BMC Plant Biology 2008, 8:34  doi:10.1186/1471-2229-8-34

Published: 11 April 2008


Background

Tomato (Solanum lycopersicum) and potato (S. tuberosum) are two economically important crop species, the genomes of which are currently being sequenced. This study presents a first genome-wide analysis of these two species, based on two large collections of BAC end sequences representing approximately 19% of the tomato genome and 10% of the potato genome.

Results

The tomato genome has a higher repeat content than the potato genome, primarily due to a higher number of retrotransposon insertions in the tomato genome. On the other hand, simple sequence repeats are more abundant in potato than in tomato. The two genomes also differ in the frequency distribution of SSR motifs. Based on EST and protein alignments, potato appears to contain up to 6,400 more putative coding regions than tomato. Major gene families such as cytochrome P450 mono-oxygenases and serine-threonine protein kinases are significantly overrepresented in potato, compared to tomato. Moreover, the P450 superfamily appears to have expanded spectacularly in both species compared to Arabidopsis thaliana, suggesting an expanded network of secondary metabolic pathways in the Solanaceae. Both tomato and potato appear to have a low level of microsynteny with A. thaliana. A higher degree of synteny was observed with Populus trichocarpa, specifically in the region between 15.2 and 19.4 Mb on P. trichocarpa chromosome 10.
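To make the comparison of SSR motif frequency distributions concrete, the following is a minimal sketch of how perfect simple sequence repeats could be tallied by motif in a set of sequences. It is not the pipeline used in this study: the function names (`canonical_motif`, `ssr_motif_counts`) and the thresholds (motifs of 2 to 6 bp, at least 4 uninterrupted tandem copies, homopolymer runs ignored) are assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Assumed thresholds, not taken from the paper: a motif of 2-6 bp
# repeated at least 4 times in tandem without interruption.
# The lazy quantifier makes the regex try the shortest motif first.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return one representative for all rotations and reverse
    complements of a motif, so that e.g. CT, TC, AG and GA are
    counted as the same repeat class."""
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    candidates = []
    for m in (motif, revcomp):
        candidates.extend(m[i:] + m[:i] for i in range(len(m)))
    return min(candidates)


def ssr_motif_counts(sequences):
    """Count perfect SSRs per canonical motif across sequences."""
    counts = Counter()
    for seq in sequences:
        for match in SSR_PATTERN.finditer(seq.upper()):
            motif = match.group(1)
            # Homopolymer runs show up as "AA", "CC", ...; skip them.
            if len(set(motif)) == 1:
                continue
            counts[canonical_motif(motif)] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences, invented purely for demonstration.
    toy = ["GGATATATATCC", "TTCTCTCTCTAA", "AAAAAAAAAA", "ACGTTGCA"]
    for motif, n in sorted(ssr_motif_counts(toy).items()):
        print(motif, n)
```

When comparing two sequence collections of different size, such raw counts would need to be normalized, for example per megabase of sequence surveyed, before the motif distributions are compared.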

Conclusion

The findings in this paper present a first glimpse into the evolution of Solanaceous genomes, both within the family and relative to other plant species. When the complete genome sequences of these species become available, whole-genome comparisons and protein- or repeat-family specific studies may shed more light on the observations made here.