Analysis of BAC-end sequences in rainbow trout: Content characterization and assessment of synteny between trout and other fish genomes
1 INRA, UMR 1313 GABI, Génétique Animale et Biologie Intégrative, 78350 Jouy-en-Josas, France
2 INRA, UMR 444 ENVT Génétique Cellulaire, 31326 Castanet-Tolosan, France
3 INRA, Sigenae, 31326 Castanet-Tolosan, France
4 National Center for Cool and Cold Water Aquaculture, ARS-USDA, 11861 Leetown Road, Kearneysville, WV 25430, USA
5 CEA Genoscope, 2 rue Gaston Crémieux, 91057 Evry Cedex, France
BMC Genomics 2011, 12:314 doi:10.1186/1471-2164-12-314
Published: 14 June 2011
Rainbow trout (Oncorhynchus mykiss) are cultivated worldwide for aquaculture production and are widely used as a model species to gain knowledge of many aspects of fish biology. The common ancestor of the salmonids experienced a whole genome duplication event, making extant salmonids such as the rainbow trout an excellent model for studying the evolution of tetraploidization and re-diploidization in vertebrates. However, the lack of a reference genome sequence hampers research progress for both academic and applied purposes. To enrich the genomic tools already available for this species and to provide further insight into the complexity of its genome, we sequenced a large number of rainbow trout BAC-end sequences (BES) and characterized their contents.
A total of 176,485 high-quality BES were generated, representing approximately 4% of the trout genome. BES analyses identified 6,848 simple sequence repeats (SSRs), of which 3,854 had high-quality flanking sequences suitable for PCR primer design. The first rainbow trout repeat elements database (INRA RT rep1.0), containing 735 putative repeat elements, was developed and identified approximately 59.5% of the BES database, measured in base pairs, as repetitive sequence. Approximately 55% of the BES reads (97,846) had more than 100 base pairs of contiguous non-repetitive sequence. The fractions of the 97,846 non-repetitive trout BES reads that had significant BLASTN hits against the zebrafish, medaka and stickleback genome databases were 15%, 16.2% and 17.9%, respectively, while the fractions that had significant BLASTX hits against the zebrafish, medaka and stickleback protein databases were 10.7%, 9.5% and 9.5%, respectively. Comparative genomics using paired BAC-ends revealed several regions of conserved synteny across all the fish species analyzed in this study.
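To illustrate the kind of SSR screen summarized above, the following is a minimal Python sketch. It is not the pipeline used in this study: the abstract does not name the SSR detection tool or its thresholds, so the motif length range (2 to 6 bp), the minimum of five tandem copies, the 50 bp flank requirement and the function name `find_ssrs` are all assumptions chosen for illustration. It reports perfect repeats only and flags whether enough flanking sequence remains on both sides for primer design.

```python
import re

# Assumed thresholds (not from the paper): motifs of 2-6 bp repeated
# at least 5 times in tandem (the motif itself plus 4 or more copies).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(seq, min_flank=50):
    """Return (start, end, motif, copies, has_flanks) for each perfect SSR.

    has_flanks is True when at least min_flank bases lie on both sides
    of the repeat, a rough stand-in for "usable for primer design".
    """
    seq = seq.upper()
    hits = []
    for match in SSR_PATTERN.finditer(seq):
        motif = match.group(1)
        # Skip mononucleotide runs that the pattern reports as e.g. "AA".
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        left_ok = match.start() >= min_flank
        right_ok = len(seq) - match.end() >= min_flank
        hits.append((match.start(), match.end(), motif, copies,
                     left_ok and right_ok))
    return hits


if __name__ == "__main__":
    flank = "GATTCAGCTA" * 6  # 60 bp of non-SSR sequence
    read = flank + "AC" * 8 + flank
    for hit in find_ssrs(read):
        print(hit)
```

A production screen would normally also handle imperfect and compound repeats and take base quality scores into account, which this sketch ignores.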
The characterization of BES provided insights into the rainbow trout genome. The discovery of specific repeat elements will facilitate analyses of sequence content (e.g. for SNP discovery and for transcriptome characterization) and future genome sequence assemblies. The numerous microsatellites will facilitate integration of the linkage and physical maps and serve as a valuable resource for fine mapping of QTL and positional cloning of genes affecting aquaculture production traits. Furthermore, comparative genomics through BES can be used to identify positional candidate genes from QTL mapping studies, to aid in future assembly of a reference genome sequence, and to elucidate sequence content and complexity in the rainbow trout genome.