End-sequencing and characterization of silkworm (Bombyx mori) bacterial artificial chromosome libraries
1 National Institute of Agrobiological Sciences, 1-2 Owashi, Tsukuba, Ibaraki 305-8634, Japan
2 Mitsubishi Space Software Co. Ltd., 1-6-1 Takezono, Tsukuba, Ibaraki 305-0032, Japan
BMC Genomics 2007, 8:314 doi:10.1186/1471-2164-8-314
Published: 7 September 2007
We performed large-scale bacterial artificial chromosome (BAC) end-sequencing of two BAC libraries (an EcoRI- and a BamHI-digested library) and conducted an in silico analysis to characterize the obtained sequence data, to make them a useful resource for genomic research on the silkworm (Bombyx mori).
More than 94,000 BAC end sequences (BESs), comprising more than 55 Mbp and covering about 10.4% of the silkworm genome, were obtained. Repeat-sequence analysis against known repeat sequences indicated that long interspersed nuclear elements (LINEs) were abundant in BamHI BESs, whereas DNA-type elements were abundant in EcoRI BESs. The analysis also suggested that this abundance of LINEs might be due to a GC bias of the restriction sites, and that the GC content of silkworm LINEs was higher than that of mammalian LINEs. In a BLAST-based sequence analysis of the BESs against two available whole-genome shotgun sequence data sets, more than 70% of the BESs had a BLAST hit with an identity of ≥ 99%. About 14% of EcoRI BESs and about 8% of BamHI BESs were from paired-end clones with unique sequences at both ends. Cluster analysis of the BESs clarified the proportion of BESs containing protein-coding regions.
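The GC bias of the restriction sites can be illustrated with a minimal sketch. It assumes the standard recognition sequences for the two enzymes (GAATTC for EcoRI, GGATCC for BamHI); the helper name `gc_fraction` is hypothetical and is not taken from the paper's analysis pipeline.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


# Standard recognition sites of the two enzymes (assumed here, not quoted from the paper).
ECORI_SITE = "GAATTC"
BAMHI_SITE = "GGATCC"

ecori_gc = gc_fraction(ECORI_SITE)  # 2 of 6 bases are G or C
bamhi_gc = gc_fraction(BAMHI_SITE)  # 4 of 6 bases are G or C

print(f"EcoRI site GC fraction: {ecori_gc:.2f}")
print(f"BamHI site GC fraction: {bamhi_gc:.2f}")
```

Because the BamHI site is GC-rich, BamHI-derived clone ends would be expected to fall more often in GC-rich regions, which is consistent with the suggestion that GC-rich LINEs are over-represented in BamHI BESs.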
As a result of this characterization, the identified BESs will be a valuable resource for genomic research on Bombyx mori, for example, as a base for construction of a BAC-based physical map. The use of multiple complementary BAC libraries constructed with different restriction enzymes also makes the BESs a more valuable genomic resource. The GenBank accession numbers of the obtained end sequences are DE283657–DE378560.
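As a quick consistency check, the size of the accession range can be compared with the reported number of BESs. The sketch below assumes that the range is contiguous and that each accession corresponds to one end sequence; the function name `accession_range_size` is hypothetical.

```python
import re


def accession_range_size(first: str, last: str) -> int:
    """Count accessions in an inclusive range such as DE283657 to DE378560."""
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first = pattern.fullmatch(first)
    m_last = pattern.fullmatch(last)
    if m_first is None or m_last is None:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same prefix")
    # Inclusive range, hence the +1.
    return int(m_last.group(2)) - int(m_first.group(2)) + 1


n_accessions = accession_range_size("DE283657", "DE378560")
print(n_accessions)  # 94904
```

Under these assumptions the range holds 94,904 accessions, which agrees with the figure of more than 94,000 BESs.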