Research article

Transcriptome-scale homoeolog-specific transcript assemblies of bread wheat

Andreas W Schreiber1,3*, Matthew J Hayden2, Kerrie L Forrest2, Stephan L Kong2, Peter Langridge1 and Ute Baumann1

Author affiliations

1 Australian Centre for Plant Functional Genomics, University of Adelaide, PMB 1 Glen Osmond, SA 5064, Australia

2 Department of Primary Industries Victoria, Victorian AgriBiosciences Centre, La Trobe Research and Development Park, Bundoora, VIC 3083, Australia

3 ACRF South Australian Cancer Genome Facility, SA Pathology, Frome Road, Adelaide, SA 5000, Australia


Citation and License

BMC Genomics 2012, 13:492  doi:10.1186/1471-2164-13-492

Published: 19 September 2012



Bread wheat is one of the world’s most important food crops, and considerable efforts have been made to develop genomic resources for this species. These include an ongoing project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related ‘homoeologous’ copies of each chromosome. This multi-national effort avoids the complications that polyploidy entails for correct genome assembly by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternative approach: a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome.


After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome.
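The two-stage idea described above (a fast partitioning step followed by independent, parallel assembly of each cluster) can be illustrated with a toy sketch. This is not the authors' pipeline: the k-mer-sharing clustering rule, the function names, the parameters `k` and `min_shared`, and the placeholder "assembler" that merely collapses exact duplicates are all assumptions made for illustration, and a thread pool stands in for the compute cloud.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def partition_into_clusters(seqs, k=8, min_shared=3):
    """Stage 1 (hypothetical rule): greedy, low-stringency clustering.

    A sequence joins the first cluster with which it shares at least
    `min_shared` k-mers, so near-identical homoeologs land together.
    """
    clusters = []
    for seq in seqs:
        seq_kmers = kmers(seq, k)
        for cluster in clusters:
            if len(seq_kmers & cluster["kmers"]) >= min_shared:
                cluster["members"].append(seq)
                cluster["kmers"] |= seq_kmers
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"kmers": set(seq_kmers), "members": [seq]})
    return [cluster["members"] for cluster in clusters]


def assemble_cluster(members):
    """Stage 2 placeholder for a high-stringency assembler.

    Only exact duplicates are collapsed, so sequences that differ by a
    single base (as homoeologs may) remain separate.
    """
    return sorted(set(members))


def two_stage_assembly(seqs, workers=4):
    """Partition first, then process every cluster independently."""
    clusters = partition_into_clusters(seqs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(assemble_cluster, clusters))
```

Because each cluster is processed independently, the second stage parallelises trivially; in a real setting each call to `assemble_cluster` would be replaced by a run of an actual assembler on a separate worker.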


This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms.

Keywords: Wheat transcriptome; Wheat genes; Sequence assembly; Cloud computing