Research article

Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species

Chengcang Wu1,4, Dina Proestou2, Dorothy Carter2, Erica Nicholson2, Filippe Santos1, Shaying Zhao3, Hong-Bin Zhang1* and Marian R Goldsmith2*

Author Affiliations

1 Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA

2 Department of Biological Sciences, University of Rhode Island, Kingston, RI 02881-0816, USA

3 The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD 20850, USA

4 Current address: Lucigen Corporation, 2120 West Greenview Dr, Middleton, WI 53562, USA


BMC Genomics 2009, 10:283  doi:10.1186/1471-2164-10-283

Published: 26 June 2009



Manduca sexta, Heliothis virescens, and Heliconius erato are three widely used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but none have been available to date.


We report the construction and characterization of six large-insert BAC libraries for the three species, together with a sequence sampling analysis of their genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries per species, and the average clone insert size of the libraries ranges from 152 to 175 kb. We estimated that the genome coverage of each library ranged from 6× to 9×, with the two combined libraries of each species being equivalent to 13.0–16.3× haploid genome coverage. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6–8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements, most of which are low-complexity repeats, and contain more retro-elements than DNA transposons. Among the three species, the H. erato genome is somewhat richer in repeat elements and simple repeats than those of M. sexta and H. virescens. BLAST analysis of the BAC end sequences suggested that the three genomes vary widely in their evolution: the genome of H. virescens is the most conserved, as that of a typical lepidopteran, whereas the genomes of both H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences.
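As a rough illustration of how a genome coverage figure of this kind is commonly derived, the sketch below divides the total cloned DNA (number of clones × mean insert size) by the haploid genome size, and also applies the standard Clarke-Carbon estimate of the probability that a given single-copy sequence is present in a library. This is not necessarily the calculation used in the study. The clone count and genome size are hypothetical placeholders, and only the mean insert size is chosen to fall within the 152–175 kb range reported above.

```python
def library_coverage(n_clones, mean_insert_kb, genome_size_mb):
    """Haploid genome equivalents: total cloned DNA divided by genome size."""
    return n_clones * mean_insert_kb / (genome_size_mb * 1000.0)


def prob_sequence_present(n_clones, mean_insert_kb, genome_size_mb):
    """Clarke-Carbon estimate P = 1 - (1 - I/G)^N for a single-copy sequence."""
    fraction_per_clone = mean_insert_kb / (genome_size_mb * 1000.0)
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


# Hypothetical inputs for illustration only (not values from the study).
n_clones = 20000         # assumed number of clones in one library
mean_insert_kb = 160.0   # assumed mean insert size, within the reported range
genome_size_mb = 400.0   # assumed haploid genome size

coverage = library_coverage(n_clones, mean_insert_kb, genome_size_mb)
p_present = prob_sequence_present(n_clones, mean_insert_kb, genome_size_mb)

print(f"Estimated coverage: {coverage:.1f}x haploid genome equivalents")
print(f"P(single-copy sequence present): {p_present:.4f}")
```

Under these assumptions, the number of positive clones expected when screening with a single-copy probe is approximately equal to the coverage, which is why probe screening can serve as an independent check on the estimate.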


The high-quality, large-insert BAC libraries of these insects, together with the identified BACs containing genes of interest, provide valuable information, resources and tools for comprehensive study of the insect genomes and for addressing many fundamental questions in Lepidoptera. The sampled genomic sequences provide the first insight into the constitution and evolution of these insect genomes.