Integration of mate pair sequences to improve shotgun assemblies of flow-sorted chromosome arms of hexaploid wheat
1 Department of Plant and Environmental Sciences, University of Life Sciences, Ås, Norway
2 The Genome Analysis Centre (TGAC), Norwich Research Park, Norwich NR4 7UH, UK
3 Department of Molecular Biology and Genetics, Aarhus University, Forsøgsvej 1, 4200, Slagelse, Denmark
4 Centre of the Region Haná, Institute of Experimental Botany, 77200, Olomouc, Czech Republic
5 Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Ås N-1432, Norway
6 Department of Genetics and Biotechnology, Faculty of Agricultural Sciences, Aarhus University, Tjele 8830, Denmark
BMC Genomics 2013, 14:222. doi:10.1186/1471-2164-14-222. Published: 4 April 2013
The assembly of the bread wheat genome sequence is challenging due to allohexaploidy and extreme repeat content (>80%). Isolation of single chromosome arms by flow sorting can be used to overcome the polyploidy problem, but the repeat content causes extreme assembly fragmentation even at the single-chromosome level. Long jump paired sequencing data (mate pairs) can help reduce assembly fragmentation by joining multiple contigs into single scaffolds. The aim of this work was to assess how mate pair data generated from multiple displacement amplified DNA of flow-sorted chromosomes affect the fragmentation of shotgun assemblies of wheat chromosomes.
Three mate pair (MP) libraries (2 Kb, 3 Kb, and 5 Kb) were sequenced to a total coverage of 89x and 64x for the short and long arm of chromosome 7B, respectively. Scaffolding using SSPACE improved the 7B assembly contiguity and decreased gene space fragmentation, but the degree of improvement was greatly affected by the scaffolding stringency applied. At the lowest stringency the assembly N50 increased ~7-fold, while at the highest stringency N50 increased only ~1.5-fold. Furthermore, a strong positive correlation between estimated scaffold reliability and scaffold assembly stringency was observed. A 7BS scaffold assembly with reduced MP coverage showed that assembly contiguity was affected only to a small degree down to ~50% of the original coverage.
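For readers unfamiliar with the contiguity metric reported above, the following is a minimal sketch of how an N50 and its fold change after scaffolding can be computed. It uses the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length). The function name and the toy lengths are hypothetical and are not data from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length
    return 0  # empty input


# Hypothetical toy lengths (kb), not values from the 7B assemblies.
contigs = [2, 2, 3, 3, 4, 6]   # before scaffolding
scaffolds = [4, 6, 10]         # same sequence joined into fewer pieces

fold_change = n50(scaffolds) / n50(contigs)
print(n50(contigs), n50(scaffolds), fold_change)
```

In this toy case the total length is unchanged and only the number of pieces drops, which is why N50 rises. Real scaffolds also contain gap sequence, so whether gaps are counted in the lengths should be stated when N50 values are compared.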
The effect of MP data integration into paired-end shotgun assemblies of wheat chromosomes was moderate, possibly due to poor contig assembly contiguity, the extreme repeat content of wheat, and the use of amplified chromosomal DNA for MP library construction.