Analysis and comparison of the pan-genomic properties of sixteen well-characterized bacterial genera
1 Department of Computer Science, University of Saskatchewan, 176 Thorvaldson Building, 110 Science Place, Saskatoon, Saskatchewan, S7N 5C9, Canada
2 Department of Pathology and Laboratory Medicine, University of Saskatchewan, 2841 Royal University Hospital, 103 Hospital Drive, Saskatoon, Saskatchewan, S7N 0W8, Canada
3 Saskatchewan Research Council, 125-15 Innovation Boulevard, Saskatoon, Saskatchewan, S7N 2X8, Canada
BMC Microbiology 2010, 10:258 doi:10.1186/1471-2180-10-258
Published: 13 October 2010
The increasing availability of whole genome sequences allows the gene or protein content of different organisms to be compared, leading to burgeoning interest in the relatively new subfield of pan-genomics. However, while several studies have analyzed protein content relationships in specific groups of bacteria, there has yet to be a study that provides a general characterization of protein content relationships in a broad range of bacteria.
A variation on the reciprocal BLAST hit approach was used to infer relationships among proteins in several groups of bacteria. Data on protein conservation and uniqueness in different bacterial genera are reported in terms of "core proteomes", "unique proteomes", and "singlets". We also analyzed the relationship between protein content similarity and the percent identity of the 16S rRNA gene in pairs of bacterial isolates from the same genus, and found that the strength of this relationship varied substantially by genus, perhaps reflecting different rates of genome evolution and/or horizontal gene transfer. Finally, core proteomes and unique proteomes were used to study the proteomic cohesiveness of several bacterial species. Some species showed little cohesiveness in their protein content, and some had fewer proteins unique to the species than randomly chosen sets of isolates from the same genus.
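To make the three terms concrete, the following is a minimal sketch, not the method used in the study. It assumes that proteins have already been grouped into families (for example, from reciprocal BLAST hits) and that each family is recorded as the set of isolates in which it occurs. The definitions used here are working assumptions, since the abstract does not spell them out: a core proteome is the set of families present in every isolate of a group, a unique proteome is the subset of those families found in no isolate outside the group, and singlets are families found in exactly one isolate of the group. All names and the toy data are hypothetical.

```python
from typing import Dict, Set

# Hypothetical input: protein family -> isolates in which it was detected.
# In practice this mapping would be derived from reciprocal BLAST hits.
Families = Dict[str, Set[str]]


def core_proteome(families: Families, group: Set[str]) -> Set[str]:
    """Families present in every isolate of the group (assumed definition)."""
    return {fam for fam, isolates in families.items() if group <= isolates}


def unique_proteome(families: Families, group: Set[str]) -> Set[str]:
    """Families present in every isolate of the group and in no isolate
    outside it (assumed definition)."""
    return {fam for fam, isolates in families.items() if isolates == group}


def singlets(families: Families, isolate: str, group: Set[str]) -> Set[str]:
    """Families found in this isolate and in no other isolate of the group
    (assumed definition)."""
    return {
        fam
        for fam, isolates in families.items()
        if isolates & group == {isolate}
    }


# Toy data: two isolates of a hypothetical genus g1 and one of genus g2.
toy_families: Families = {
    "famA": {"g1_s1", "g1_s2", "g2_s1"},  # shared across both genera
    "famB": {"g1_s1", "g1_s2"},           # in all of g1, nowhere else
    "famC": {"g1_s1"},                    # only in one g1 isolate
    "famD": {"g2_s1"},                    # only in the g2 isolate
    "famE": {"g1_s2", "g2_s1"},           # one g1 isolate plus g2
}
genus_g1 = {"g1_s1", "g1_s2"}

core = core_proteome(toy_families, genus_g1)
unique = unique_proteome(toy_families, genus_g1)
singlets_s1 = singlets(toy_families, "g1_s1", genus_g1)
singlets_s2 = singlets(toy_families, "g1_s2", genus_g1)

print("core:", sorted(core))
print("unique:", sorted(unique))
print("singlets of g1_s1:", sorted(singlets_s1))
print("singlets of g1_s2:", sorted(singlets_s2))
```

Under these assumed definitions the unique proteome is always a subset of the core proteome, and a family such as famE still counts as a singlet within g1 even though it also occurs outside the genus. A stricter definition of singlets would change that result.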
The results described in this study aid our understanding of protein content relationships in different bacterial groups, allowing us to make further inferences regarding genome-environment relationships, genome evolution, and the soundness of existing taxonomic classifications.