Research article

Characterisation of the transcriptome of a wild great tit Parus major population by next generation sequencing

Anna W Santure1*, Jake Gratten1, Jim A Mossman1, Ben C Sheldon2 and Jon Slate1

Author Affiliations

1 Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK

2 Edward Grey Institute, Department of Zoology, University of Oxford, Oxford, OX1 3PS, UK


BMC Genomics 2011, 12:283 doi:10.1186/1471-2164-12-283

Published: 2 June 2011



The recent development of next generation sequencing technologies has made it possible to generate very large amounts of sequence data in species with little or no genome information. Combined with the large phenotypic databases available for wild and non-model species, these data will provide an unprecedented opportunity to "genomicise" ecological model organisms and establish the genetic basis of quantitative traits in natural populations.


This paper describes the sequencing, de novo assembly and analysis of the transcriptome of eight tissues of ten wild great tits. Approximately 4.6 million sequences and 1.4 billion bases of DNA were generated and assembled into 95,979 contigs, one third of which aligned with known Taeniopygia guttata (zebra finch) and Gallus gallus (chicken) transcripts. The majority (78%) of the remaining contigs aligned within or very close to regions of the zebra finch genome containing known genes, suggesting that they represented precursor mRNA rather than untranscribed genomic DNA. More than 35,000 single nucleotide polymorphisms and 10,000 microsatellite repeats were identified. Eleven percent of contigs were expressed in every tissue, while twenty-one percent of contigs were expressed in only one tissue. The function of contigs with strong evidence for tissue-specific expression, and of contigs expressed in every tissue, was inferred from the gene ontology (GO) terms associated with these contigs; heart and pancreas had the highest number of highly tissue-specific GO terms (21.4% and 28.5% respectively).
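The proportions of contigs expressed in every tissue and in only one tissue can be illustrated with a minimal sketch. It is not the pipeline used in the study. It assumes that each contig has already been assigned the set of tissues in which at least one read was observed. The function name `expression_breadth`, the contig identifiers and the tissue labels are all hypothetical.

```python
from collections import Counter


def expression_breadth(contig_tissues, n_tissues):
    """Return the fraction of contigs seen in all tissues and in exactly one.

    contig_tissues maps a contig identifier to the set of tissues in which
    it was detected (assumed input format, not taken from the paper).
    """
    total = len(contig_tissues)
    if total == 0:
        return {"all_tissues": 0.0, "single_tissue": 0.0}
    # Count how many contigs were detected in 1, 2, ..., n_tissues tissues.
    breadth = Counter(len(tissues) for tissues in contig_tissues.values())
    return {
        "all_tissues": breadth[n_tissues] / total,
        "single_tissue": breadth[1] / total,
    }


# Toy data with three placeholder tissues; labels are illustrative only.
toy = {
    "contig_a": {"t1", "t2", "t3"},
    "contig_b": {"t1"},
    "contig_c": {"t2", "t3"},
    "contig_d": {"t3"},
}
summary = expression_breadth(toy, n_tissues=3)
print(summary)
```

With real data, `n_tissues` would be eight, and a read-count threshold per tissue would probably be needed before a contig is called "expressed"; that threshold is left out of this sketch.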


In summary, the transcriptomic data generated in this study will contribute towards efforts to assemble and annotate the great tit genome, as well as providing the markers required to perform gene mapping studies in wild populations.