Open Access Research article

Whitefly (Bemisia tabaci) genome project: analysis of sequenced clones from egg, instar, and adult (viruliferous and non-viruliferous) cDNA libraries

Dena Leshkowitz1, Shirley Gazit2, Eli Reuveni2,3, Murad Ghanim4, Henryk Czosnek2*, Cindy McKenzie5, Robert L Shatters5 and Judith K Brown6

Author Affiliations

1 The Hebrew University Bioinformatics Unit, The Hebrew University of Jerusalem, Rehovot 76100, Israel

2 The Robert H Smith Institute for Plant Science and Genetics in Agriculture, The Hebrew University of Jerusalem, Rehovot 76100, Israel

3 Mouse Biology Programme, EMBL, Monterotondo, Roma 00016, Italy

4 Institute of Plant Protection, Department of Entomology, Volcani Center, Bet Dagan 50250, Israel

5 USDA-ARS U.S. Horticultural Research Laboratory, Fort Pierce, FL 34945, USA

6 Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA

BMC Genomics 2006, 7:79 doi:10.1186/1471-2164-7-79

Published: 11 April 2006

Abstract

Background

The past three decades have witnessed a dramatic increase in interest in the whitefly Bemisia tabaci, owing to its nature as a taxonomically cryptic species, the damage it causes to a large number of herbaceous plants through its specialized feeding in the phloem, and its ability to serve as a vector of plant viruses. Among the most important plant viruses transmitted by B. tabaci are those in the genus Begomovirus (family Geminiviridae). Surprisingly, little is known about the genome of this whitefly. The haploid genome size for male B. tabaci has been estimated by flow cytometry to be approximately one billion bp, about five times the size of the genome of the fruitfly Drosophila melanogaster. The genes involved in whitefly development, in host range plasticity, and in begomovirus vector specificity and competency are unknown.

Results

To address this general shortage of genomic sequence information, we have constructed three cDNA libraries from non-viruliferous whiteflies (eggs, immature instars, and adults) and two from adult insects that fed on tomato plants infected by two geminiviruses: Tomato yellow leaf curl virus (TYLCV) and Tomato mottle virus (ToMoV). In total, the sequence of 18,976 clones was determined. After quality control and removal of 5,542 clones of mitochondrial origin, 9,110 sequences remained, which included 3,843 singletons and 1,017 contigs. Comparisons with public databases indicated that the libraries contained genes involved in cellular and developmental processes. In addition, approximately 1,000 bases aligned with the genome of the B. tabaci endosymbiotic bacterium Candidatus Portiera aleyrodidarum, originating primarily from the egg and instar libraries. Apart from the mitochondrial sequences, the longest and most abundant sequence encodes vitellogenin, which originated from the whitefly adult libraries, indicating that much of the gene expression in this insect is directed toward the production of eggs.
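The clone counts quoted above imply a few figures that the abstract does not state. The following is a minimal sketch of that arithmetic, not part of the authors' pipeline. It rests on two assumptions: that the quality-control and mitochondrial removals do not overlap, and that each of the 9,110 retained sequences is either a singleton or a member of exactly one contig. The function name and dictionary keys are hypothetical.

```python
# Counts taken directly from the Results paragraph.
TOTAL_CLONES = 18_976    # clones sequenced
MITOCHONDRIAL = 5_542    # clones of mitochondrial origin, removed
REMAINING = 9_110        # sequences left after all filtering
SINGLETONS = 3_843       # sequences that did not assemble with any other
CONTIGS = 1_017          # assembled groups of overlapping sequences


def summarize_est_counts():
    """Derive implied quantities from the reported counts.

    Assumption (not stated in the source): quality-control removals and
    mitochondrial removals are disjoint, and every retained sequence is
    either a singleton or belongs to exactly one contig.
    """
    removed_total = TOTAL_CLONES - REMAINING
    removed_by_qc = removed_total - MITOCHONDRIAL
    unique_sequences = SINGLETONS + CONTIGS
    reads_in_contigs = REMAINING - SINGLETONS
    return {
        "removed_total": removed_total,
        "removed_by_qc": removed_by_qc,
        "unique_sequences": unique_sequences,
        "reads_in_contigs": reads_in_contigs,
        "mean_reads_per_contig": reads_in_contigs / CONTIGS,
        "mitochondrial_fraction": MITOCHONDRIAL / TOTAL_CLONES,
    }


if __name__ == "__main__":
    for key, value in summarize_est_counts().items():
        print(f"{key}: {value}")
```

Under these assumptions, the singletons and contigs together would represent 4,860 distinct assembled sequences, with the contigs averaging a little over five member sequences each.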

Conclusion

This is the first functional genomics project involving a hemipteran (Homopteran) insect from the subtropics/tropics. The B. tabaci sequence database now provides an important tool to initiate identification of whitefly genes involved in development, behaviour, and B. tabaci-mediated begomovirus transmission.