Reannotation and extended community resources for the genome of the non-seed plant Physcomitrella patens provide insights into the evolution of plant gene structures and functions
1 Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestrasse 1, 79104, Freiburg, Germany
2 Institute of Biology II, Faculty of Biology, University of Freiburg, Schaenzlestrasse 1, 79104, Freiburg, Germany
3 BIOSS - Centre for Biological Signalling Studies, Freiburg, Germany
4 FRIAS - Freiburg Institute for Advanced Studies, Freiburg, Germany
5 Advanced Science Research Center, Kanazawa University, Kanazawa, Japan
6 National Institute for Basic Biology, Okazaki, Japan
7 Faculty of Biology, University of Marburg, Karl-von-Frisch-Str. 8, 35043, Marburg, Germany
8 Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052, Ghent, Belgium
9 Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052, Ghent, Belgium
10 School of Life Science, The Graduate University for Advanced Studies, Okazaki 444-8585, Japan
BMC Genomics 2013, 14:498 doi:10.1186/1471-2164-14-498
Published: 23 July 2013
As a model species, the moss Physcomitrella patens provides an important reference for early-diverging lineages of plants, and the release of its genome in 2008 opened the door to genome-wide studies. The usability of a reference genome depends greatly on the quality of its annotation and the availability of centralized community resources. Therefore, in light of accumulating evidence for missing genes, fragmentary gene structures, false annotations and a low rate of functional annotation in the original release, we decided to improve the moss genome annotation.
Here, we report the complete moss genome re-annotation (designated V1.6), which incorporates the increased transcript availability from a multitude of developmental stages and tissue types. We demonstrate the utility of the improved P. patens genome annotation for comparative genomics and present new extensions to the cosmoss.org resource as a central repository for this plant “flagship” genome. The structural annotation of 32,275 protein-coding genes results in 8387 additional loci, including 1456 loci with known protein domains or homologs in Plantae. This is the first release to include information on transcript isoforms, suggesting alternative splicing events for at least 10.8% of the loci. Furthermore, this release now also provides information on non-protein-coding loci. Functional annotations were improved in quality and coverage, resulting in 58% annotated loci (previously 41%), which also comprise 7200 additional loci with GO annotations. Access to, and manual curation of, the functional and structural genome annotation are provided via the http://www.cosmoss.org model organism database.
Comparative analysis of gene structure evolution along the green plant lineage provides novel insights, such as a comparatively high number of loci with 5’-UTR introns in the moss. Comparative analysis of functional annotations reveals expansions of moss house-keeping and metabolic genes, as well as further, possibly adaptive, lineage-specific expansions and gains, including at least 13% orphan genes.