Research article

De novo transcriptome sequencing in a songbird, the dark-eyed junco (Junco hyemalis): genomic tools for an ecological model system

Mark P Peterson1*, Danielle J Whittaker1,2, Shruthi Ambreth3, Suhas Sureshchandra3, Aaron Buechlein3, Ram Podicheti3, Jeong-Hyeon Choi3,4, Zhao Lai3,5, Keithanne Mockatis3, John Colbourne3, Haixu Tang3 and Ellen D Ketterson1

Author Affiliations

1 Dept. of Biology, Center for Integrated Study of Animal Behavior, Indiana University, Bloomington, IN, USA

2 BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI, USA

3 Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN, USA

4 Cancer Center, Department of Biostatistics, Georgia Health Sciences University, Augusta, GA, USA

5 Greehey Children’s Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA

BMC Genomics 2012, 13:305  doi:10.1186/1471-2164-13-305

Published: 9 July 2012

Abstract

Background
Though genomic-level data are becoming widely available, many of the metazoan species sequenced are laboratory systems whose natural history is not well documented. In contrast, the wide array of species with very well-characterized natural history have, until recently, lacked genomics tools. It is now possible to address significant evolutionary genomics questions by applying high-throughput sequencing to discover the majority of genes for ecologically tractable species, and by subsequently developing microarray platforms from which to investigate gene regulatory networks that function in natural systems. We used GS-FLX Titanium Sequencing (Roche/454-Sequencing) of two normalized libraries of pooled RNA samples to characterize a transcriptome of the dark-eyed junco (Junco hyemalis), a North American sparrow that is a classically studied species in the fields of photoperiodism, speciation, and hormone-mediated behavior.


Results

From a broad pool of RNA sampled from tissues throughout the body of a male and a female junco, we sequenced a total of 434 million nucleotides from 1.17 million reads. These reads were assembled de novo into 31,379 putative transcripts representing 22,765 gene sets, covering 35.8 million nucleotides with 12-fold average depth of coverage. Roughly half of the putative genes were annotated using sequence similarity, and expression was confirmed for the majority with a preliminary microarray analysis. Of 716 core bilaterian genes, 646 (90%) were recovered within our characterized gene set. Gene Ontology, orthoDB orthology groups, and KEGG Pathway annotation provide further functional information about the sequences, and 25,781 potential SNPs were identified.
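As a quick consistency check on the figures quoted above, the following minimal sketch recomputes the derived quantities from the reported totals. The function name `summary_stats` and its parameter names are hypothetical, introduced only for illustration. The depth estimate assumes that every sequenced nucleotide contributes to the assembly, which is a simplification and not necessarily how the reported 12-fold figure was calculated.

```python
def summary_stats(total_nt=434e6, reads=1.17e6, assembled_nt=35.8e6,
                  transcripts=31_379, gene_sets=22_765,
                  core_total=716, core_found=646):
    """Recompute derived figures from the totals quoted in the text.

    All defaults are the values reported in the abstract; the names
    are illustrative and do not come from the original analysis.
    """
    return {
        # Average number of nucleotides per read.
        "mean_read_length_nt": total_nt / reads,
        # Rough depth: assumes all sequenced nucleotides were assembled.
        "approx_depth": total_nt / assembled_nt,
        # Average number of putative transcripts per gene set.
        "transcripts_per_gene_set": transcripts / gene_sets,
        # Fraction of the core bilaterian genes that were recovered.
        "core_gene_recovery": core_found / core_total,
    }


if __name__ == "__main__":
    for name, value in summary_stats().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the approximate depth rounds to 12 and the core gene recovery rounds to 90%, in line with the values stated in the text.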


Conclusions

The extensive sequence information returned by this effort adds to the growing store of genomic data on diverse species. The extent of coverage and annotation achieved, together with the confirmation of expression, shows that transcriptome sequencing provides useful information for ecological model systems that have historically lacked genomic tools. The junco-specific microarray developed here is allowing investigations of gene expression responses to environmental and hormonal manipulations, extending the historic work on natural history and hormone-mediated phenotypes in this system.

Keywords: Transcriptome; Aves; pyrosequencing; microarray; Junco; 454 titanium cDNA sequencing; single nucleotide polymorphism.