
Fosmid library end sequencing reveals a rarely known genome structure of marine shrimp Penaeus monodon

Shiao-Wei Huang1, You-Yu Lin1, En-Min You1, Tze-Tze Liu2, Hung-Yu Shu2, Keh-Ming Wu3, Shih-Feng Tsai3, Chu-Fang Lo1, Guang-Hsiung Kou1, Gwo-Chin Ma4, Ming Chen1,4,5, Dongying Wu6,7, Takashi Aoki8, Ikuo Hirono8 and Hon-Tsen Yu1*

Author Affiliations

1 Institute of Zoology and Department of Life Science, National Taiwan University, Taipei 10617, Taiwan, ROC

2 Genome Research Center, National Yang-Ming University, Taipei 112, Taiwan, ROC

3 Division of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli County 350, Taiwan, ROC

4 Center for Medical Genetics, and Genetics Laboratory, Department of Medical Research, Changhua Christian Hospital, Changhua 500, Taiwan, ROC

5 Department of Obstetrics and Gynecology, College of Medicine, National Taiwan University, Taipei 106, Taiwan

6 Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA

7 Department of Evolution and Ecology, University of California Davis Genome Center, University of California, Davis, California 95616, USA

8 Laboratory of Genome Science, Tokyo University of Marine Science and Technology, Konan 4-5-7 Minato-ku, Tokyo 108-8477, Japan


BMC Genomics 2011, 12:242  doi:10.1186/1471-2164-12-242

Published: 17 May 2011



The black tiger shrimp (Penaeus monodon) is one of the most important aquaculture species in the world and represents the crustacean lineage, which possesses the greatest species diversity among marine invertebrates. Yet little is known about its genomic structure. To understand the organization and evolution of the P. monodon genome, a fosmid library consisting of 288,000 colonies was constructed, equivalent to 5.3-fold coverage of the 2.17 Gb genome. Approximately 11.1 Mb of fosmid end sequences (FESs) from 20,926 non-redundant reads, representing 0.45% of the P. monodon genome, were obtained for repetitive and protein-coding sequence analyses.
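The library figures quoted above are linked by simple arithmetic: fold coverage is the number of clones times the mean insert size, divided by the genome size. The short Python sketch below rearranges that relation to show the mean insert size implied by the stated numbers, and the mean read length implied by the FES totals. The function and variable names are illustrative only, and the insert size is back-calculated here rather than reported in this abstract.

```python
def fold_coverage(n_clones: int, mean_insert_bp: float, genome_bp: float) -> float:
    """Library coverage = total cloned bases / genome size."""
    return n_clones * mean_insert_bp / genome_bp


def implied_mean_insert_bp(n_clones: int, coverage: float, genome_bp: float) -> float:
    """Rearrangement of the coverage relation to recover the mean insert size."""
    return coverage * genome_bp / n_clones


# Values stated in the abstract
GENOME_BP = 2.17e9      # 2.17 Gb genome
N_CLONES = 288_000      # fosmid colonies
COVERAGE = 5.3          # reported fold coverage
FES_TOTAL_BP = 11.1e6   # approximately 11.1 Mb of fosmid end sequence
FES_READS = 20_926      # non-redundant reads

# Derived values (not stated in the abstract; computed for illustration)
insert_bp = implied_mean_insert_bp(N_CLONES, COVERAGE, GENOME_BP)
mean_read_bp = FES_TOTAL_BP / FES_READS

print(f"implied mean insert size: {insert_bp / 1e3:.1f} kb")
print(f"implied mean FES read length: {mean_read_bp:.0f} bp")
```

The implied insert size of roughly 40 kb is in the range typical of fosmid vectors, which suggests the reported colony count and coverage are mutually consistent.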


We found that microsatellite sequences were highly abundant in the P. monodon genome, comprising 8.3% of the total length. Both the density and the average length of microsatellites were markedly higher than those of other taxa. AT-rich microsatellite motifs, especially poly (AT) and poly (AAT), were the most abundant. A high abundance of microsatellite sequences was also found in the transcribed regions. Furthermore, via self-BlastN analysis we identified 103 novel repetitive element families, which were categorized into four groups: 33 WSSV-like repeats, 14 retrotransposons, 5 gene-like repeats, and 51 unannotated repeats. Overall, repeats of various types make up 51.18% of the P. monodon genome by length. Approximately 7.4% of the FESs contained protein-coding sequences, and homologues of the Inhibitor of Apoptosis Protein (IAP) gene and the Innexin 3 gene appear to be present in high abundance in the P. monodon genome.
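To make the notion of microsatellite content by length concrete, the following Python sketch scans a sequence for tandem repeats of 1 to 6 bp motifs and reports the fraction of bases they cover. It is not the method used in this study, which the abstract does not describe: the regular-expression approach, the minimum-repeat thresholds in `MIN_REPEATS`, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True when the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_microsatellites(seq: str) -> list[tuple[int, int, str]]:
    """Return sorted (start, end, motif) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'ATAT' runs already found as 'AT'
                hits.append((m.start(), m.end(), motif))
    return sorted(hits)


def microsatellite_fraction(seq: str) -> float:
    """Fraction of the sequence length covered by detected microsatellites."""
    covered = sum(end - start for start, end, _ in find_microsatellites(seq))
    return covered / len(seq) if seq else 0.0


# Toy sequence containing a poly (AT) run and a poly (AAT) run
toy = "GGC" + "AT" * 8 + "CCGTA" + "AAT" * 5 + "G"
hits = find_microsatellites(toy)
fraction = microsatellite_fraction(toy)
print(hits)
print(f"{fraction:.3f}")
```

Applied to a set of end sequences, the same fraction computed over the pooled length would correspond to a content-by-length figure of the kind quoted above, although the value obtained depends strongly on the chosen thresholds and on whether imperfect repeats are allowed.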


The abundance of various repeat types in the P. monodon genome illustrates its highly repetitive nature. In particular, long and dense microsatellite sequences, together with abundant WSSV-like sequences, distinguish the genome organization of penaeid shrimp from that of other taxa. These results substantially improve current knowledge not only of shrimp but also of marine crustaceans with large genomes.