Research article

A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

Roland Schafleitner1*, Luz R Tincopa1, Omar Palomino2, Genoveva Rossel3, Ronald F Robles3, Rocio Alagon3, Carlos Rivera1, Cynthia Quispe1, Luis Rojas2, Jaime A Pacheco1, Julio Solis4, Diogenes Cerna1, Ji Young Kim1, Jack Hou1 and Reinhard Simon2

Author Affiliations

1 Germplasm Enhancement and Crop Improvement Division, International Potato Center, La Molina, Lima, Peru

2 Research Informatics Unit, International Potato Center, La Molina, Lima, Peru

3 Genetic Resources Conservation and Characterization Division, International Potato Center, La Molina, Lima, Peru

4 Department of Horticulture, Louisiana State University Agricultural Center, Baton Rouge, LA 70803, USA


BMC Genomics 2010, 11:604  doi:10.1186/1471-2164-11-604

Published: 26 October 2010



Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We used 454 pyrosequencing to augment the available gene sequence information and thereby enhance functional genomics and marker design for this plant species.


Two quarter 454 pyrosequencing runs on two normalized cDNA collections from stems and leaves of the drought-stressed sweetpotato clone Tanzania yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments (contigs) and 34,733 unassembled sequences (singletons). Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons, resulting in 24,657 putatively unique genes. A further 27,119 sequences had no match to protein sequences in the UniRef100 database. On the basis of this gene index, we identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions.
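The counts reported above can be cross-checked with simple arithmetic. The short Python sketch below is not part of the original study; the variable names are illustrative, and it assumes that the 27,119 unmatched sequences are the assembled sequences (contigs plus singletons) that remained without a UniRef100 annotation, an interpretation the numbers appear to support.

```python
# Counts taken from the abstract; variable names are illustrative only.
contigs = 31_685               # sets of overlapping DNA segments
singletons = 34_733            # unassembled sequences
annotated_contigs = 23_957     # contigs with a blastx match in UniRef100
annotated_singletons = 15_342  # singletons with a blastx match in UniRef100
reported_unmatched = 27_119    # sequences reported as having no match

# Assumption: the unmatched sequences are the remainder of the assembly.
total_sequences = contigs + singletons
annotated = annotated_contigs + annotated_singletons
unmatched = total_sequences - annotated

# Microsatellite marker testing
ssr_tested = 223
ssr_amplified = 195
amplification_rate = ssr_amplified / ssr_tested

print(f"total sequences in gene index: {total_sequences:,}")
print(f"annotated sequences:           {annotated:,}")
print(f"unmatched (computed/reported): {unmatched:,} / {reported_unmatched:,}")
print(f"marker amplification rate:     {amplification_rate:.1%}")
```

Under this reading, the gene index holds 66,418 sequences, of which 39,299 were annotated, leaving the reported 27,119 without a match; about 87% of the tested microsatellite markers amplified.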


The sweetpotato gene index is a useful source of functionally annotated sweetpotato gene sequences and contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available online.