Scientists have sequenced and assembled the genome of the loblolly pine (Pinus taeda) – at 22.18 billion base pairs (Gbp), it is the longest genome yet sequenced.
In papers published in the open access journal Genome Biology and in the March edition of GENETICS, a team of scientists from the Loblolly Pine Genome Project have described the sequence and the new methods used to generate it.
The genomes of conifers like the loblolly pine are notoriously complex and difficult to sequence. At 22.18 Gbp, the loblolly pine sequence is more than seven times the length of the human genome’s 3 Gbp, and exceeds the 20 Gbp of the previous record holder, the Norway spruce. The sequence is also of much higher quality than previous conifer genome assemblies, owing to the innovative method the researchers used.
The researchers used a ‘whole genome shotgun’ approach, sequencing lots of short parts of the genome, and then fitting them back together. Longer genomes present even more of a challenge with this method, so the team developed a way of pre-processing the fragments before they were put back together, which allowed them to cope with a genome of this size.
By piecing fragments together into slightly larger chunks and discarding the redundant sequence, the researchers were able to assemble the fragments into the entire genome. The method is comparable to assembling smaller parts of a jigsaw and discarding duplicate pieces before putting the whole thing together. They were also able to assemble the conifer's mitochondrial DNA separately.
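For readers who want a concrete picture of the jigsaw analogy, the idea can be sketched in a few lines of Python. This is a toy illustration only, not the project's actual pipeline: the function names (overlap, drop_redundant, greedy_assemble), the minimum overlap of three bases and the example reads are all assumptions made for the sketch, and a real assembler must also handle sequencing errors, repeats and billions of reads.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if no such overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def drop_redundant(reads: list[str]) -> list[str]:
    """Discard exact duplicates and reads wholly contained in a longer read."""
    unique = sorted(set(reads), key=lambda r: (-len(r), r))
    kept: list[str] = []
    for read in unique:
        if not any(read in longer for longer in kept):
            kept.append(read)
    return kept


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of chunks with the largest overlap.

    Hypothetical toy procedure: stops when no two chunks overlap by at
    least min_len bases and returns whatever chunks remain.
    """
    chunks = drop_redundant(reads)
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(chunks):
            for j, b in enumerate(chunks):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return chunks
        merged = chunks[best_i] + chunks[best_j][best_n:]
        rest = [c for k, c in enumerate(chunks) if k not in (best_i, best_j)]
        chunks = drop_redundant(rest + [merged])


# Made-up short reads, including one exact duplicate and one contained read.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC", "GCGTACGT", "GGCG"]
contigs = greedy_assemble(reads)
```

In this sketch the duplicate and the contained read are thrown away first, which is the "discarding duplicate pieces" step, and the remaining reads are then joined wherever their ends match.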
The team found that around 82% of the sequence comprises repetitive elements. They also identified previously unknown genes involved in regulating important traits such as disease resistance, stress response and wood formation.
Professor David Neale from the University of California, who led the Loblolly Pine Genome Project, says: “This particular genome presented a real challenge, and we worked really hard to produce something of quality in response to that challenge. We released a first draft of the genome in June 2012, which people have already been using, and now we're proud that in publishing the finalised sequence, we were faithful to the meaning of open access.”
T: +44 (0)20 3192 2429
M: +44 (0)7867 410 262
Notes to Editor
1. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies
DB Neale, JL Wegrzyn, KA Stevens, A Zimin, D Puiu, M Crepeau, C Cardeno, M Koriabine, AE Holtz-Morris, JD Liechty, PJ Martínez-García, HA Vasquez-Gross, BY Lin, JJ Zieve, WM Dougherty, S Fuentes-Soriano, L Wu, D Gilbert, G Marçais, M Roberts, C Holt, M Yandell, JM Davis, K Smith, JFD Dean, WW Lorenz, RW Whetten, R Sederoff, N Wheeler, PE McGuire, D Main, CA Loopstra, K Mockaitis, PJ de Jong, JA Yorke, SL Salzberg, and CH Langley
Genome Biology 2014, 15:R59
Article available at journal website
Please name the journal in any story you write. If you are writing for the web, please link to the article. All articles are available free of charge, according to BioMed Central's open access policy.
This paper will be published alongside the following articles in GENETICS:
Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation
JL Wegrzyn, JD Liechty, KA Stevens, L Wu, CA Loopstra, H Vasquez-Gross, WM Dougherty, BY Lin, JJ Zieve, PJ Martínez-García, C Holt, M Yandell, A Zimin, JA Yorke, M Crepeau, D Puiu, SL Salzberg, PJ de Jong, K Mockaitis, D Main, CH Langley, and DB Neale
Genetics March 2014 196: 891-909; doi:10.1534/genetics.113.159996
Article available at journal website
Sequencing and assembling the 22-Gb loblolly pine genome
A Zimin, KA Stevens, M Crepeau, A Holtz-Morris, M Koriabine, G Marçais, D Puiu, M Roberts, JL Wegrzyn, PJ de Jong, DB Neale, SL Salzberg, JA Yorke, and CH Langley
Genetics March 2014 196: 875-890; doi:10.1534/genetics.113.159715
Article available at journal website
2. Genome Biology publishes research articles, new methods and software tools, in addition to reviews and opinions, from the full spectrum of biology, including molecular, cellular, organism or population biology studied from a genomic perspective, as well as sequence analysis, bioinformatics, proteomics, comparative biology and evolution. @GenomeBiology
3. BioMed Central (http://www.biomedcentral.com/) is an STM (Science, Technology and Medicine) publisher which has pioneered the open access publishing model. All peer-reviewed research articles published by BioMed Central are made immediately and freely accessible online, and are licensed to allow redistribution and reuse. BioMed Central is part of Springer Science+Business Media, a leading global publisher in the STM sector. @BioMedCentral