Open Access Research article

High-throughput sequencing of Astrammina rara: Sampling the giant genome of a giant foraminiferan protist

Andrea Habura1,4*, Yubo Hou2, Andrew A Reilly3 and Samuel S Bowser2

Author affiliations

1 Division of Infectious Disease, Wadsworth Center, New York State Department of Health, PO Box 509, Albany, NY 12201, USA

2 Division of Translational Medicine, Wadsworth Center, New York State Department of Health, PO Box 509, Albany, NY 12201, USA

3 Division of Laboratory Operations, Wadsworth Center, New York State Department of Health, PO Box 509, Albany, NY 12201, USA

4 Department of Biomedical Sciences, School of Public Health, The University at Albany, Empire State Plaza, Albany, NY 12201, USA

Citation and License

BMC Genomics 2011, 12:169  doi:10.1186/1471-2164-12-169

Published: 31 March 2011



Foraminiferan protists, which are significant players in most marine ecosystems, are also genetic innovators, harboring unique modifications to proteins that make up the basic eukaryotic cell machinery. Despite their ecological and evolutionary importance, foraminiferan genomes are poorly understood, owing to the extreme sequence divergence of many genes and the difficulty of obtaining pure samples: exogenous DNA from ingested food or from ecto- or endosymbionts often vastly exceeds the amount of "native" DNA, and foraminiferans cannot be cultured axenically. Few foraminiferal genes have been sequenced from genomic material, although partial sequences of coding regions have been determined by EST studies and mass spectroscopy. The lack of genomic data has impeded evolutionary and cell-biology studies and has also hindered our ability to test ecological hypotheses using genetic tools.


454 sequence analysis was performed on a library derived from whole genome amplification of microdissected nuclei of the Antarctic foraminiferan Astrammina rara. Xenogenomic sequence, which was shown not to be of eukaryotic origin, represented only 12% of the sample. The first foraminiferal examples of important classes of genes, such as tRNA genes, are reported, and we present evidence that sequences of mitochondrial origin have been translocated to the nucleus. The recovery of a 3' UTR and downstream sequence from an actin gene suggests that foraminiferal mRNA processing may have some unusual features. Finally, the presence of a co-purified bacterial genome in the library permitted the first calculation of the size of a foraminiferal genome by molecular methods. Statistical analysis of sequence from different genomic sources indicates that low-complexity tracts of the genome may be endoreplicated in some stages of the foraminiferal life cycle.


These data provide the first window into genomic organization and genetic control in these organisms, and also complement and expand upon information about foraminiferal genes based on EST projects. The genomic data obtained are informative for environmental and cell-biological studies, and will also be useful for efforts to understand relationships between foraminiferans and other protists.