Peptidoglycan: a post-genomic analysis
1 Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UMR CNRS 7872 IRD 198, Méditerranée Infection, Aix-Marseille-Université, Marseille, France
2 Architecture et Fonction des Macromolécules Biologiques, Aix-Marseille Université, CNRS UMR 7257, Marseille, France
3 Evolution Biologique et Modélisation, UMR-CNRS 6632, Université de Provence, Marseille, France
BMC Microbiology 2012, 12:294. doi:10.1186/1471-2180-12-294. Published: 18 December 2012
To derive post-genomic, neutral insight into the peptidoglycan (PG) distribution among organisms, we mined 1,644 genomes listed in the Carbohydrate-Active Enzymes database for the presence of a minimal 3-gene set that is necessary for PG metabolism. This gene set consists of one gene from the glycosyltransferase family GT28, one from family GT51 and at least one gene belonging to one of five glycoside hydrolase families (GH23, GH73, GH102, GH103 and GH104).
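The screening rule described above can be sketched in a few lines of Python. This is only an illustration of the stated criterion, not the pipeline used in the study: the function name `has_minimal_pg_gene_set` and the representation of a genome as a plain set of CAZy family labels are assumptions made for the example.

```python
# Hypothetical sketch of the minimal 3-gene set criterion.
# A genome is represented (by assumption) as a collection of CAZy family labels.

# Both glycosyltransferase families must be present.
GT_REQUIRED = ("GT28", "GT51")

# At least one of these five glycoside hydrolase families must be present.
GH_OPTIONS = frozenset({"GH23", "GH73", "GH102", "GH103", "GH104"})


def has_minimal_pg_gene_set(families):
    """Return True if the family labels satisfy the minimal 3-gene set."""
    present = set(families)
    has_both_gt = all(gt in present for gt in GT_REQUIRED)
    has_any_gh = bool(present & GH_OPTIONS)
    return has_both_gt and has_any_gh


if __name__ == "__main__":
    # Family content reported in the text for Micromonas sp.
    print(has_minimal_pg_gene_set({"GT28", "GT51", "GH103"}))
    # An illustrative genome lacking GT51 (made-up example).
    print(has_minimal_pg_gene_set({"GT28", "GH23", "GH73"}))
```

Keeping the two glycosyltransferase families and the five glycoside hydrolase families in separate constants mirrors the wording of the criterion: both of the former, any one of the latter.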
None of the 103 Viruses or 101 Archaea examined possessed the minimal 3-gene set, but this set was detected in 1/42 of the Eukarya members (Micromonas sp., coding for GT28, GT51 and GH103) and in 1,260/1,398 (90.1%) of Bacteria, with a 100% positive predictive value for the presence of PG. A Pearson correlation test showed that GT51 family genes were significantly associated with PG, with a correlation value of 0.963 and a p value less than 10⁻³. This result was confirmed by a phylogenetic comparative analysis showing that the GT51-encoding gene was significantly associated with PG, with Pagel’s scores of 60 and 51 (percentage of error close to 0%). Phylogenetic analysis indicated that the GT51 gene history comprised eight loss events and one gain event, suggesting a dynamic, ongoing process.
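The two summary statistics mentioned above, positive predictive value and the Pearson correlation between gene presence and PG presence, can be illustrated with a short Python sketch. It assumes presence/absence is coded as 1/0 per organism; the function names and the toy vectors are hypothetical and are not data from the study. Only the 1,260/1,398 proportion is taken from the text.

```python
from math import sqrt


def positive_predictive_value(true_pos, false_pos):
    """PPV = TP / (TP + FP): the share of positive calls that are correct."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no positive calls, PPV is undefined")
    return true_pos / total


def pearson_binary(x, y):
    """Pearson correlation between two equal-length 0/1 vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Proportion of Bacteria carrying the minimal set, as reported in the text.
    print(round(100 * 1260 / 1398, 1))

    # Toy presence/absence vectors (made up for illustration only).
    gt51_present = [1, 1, 1, 0, 0, 1]
    pg_present = [1, 1, 1, 0, 0, 0]
    print(pearson_binary(gt51_present, pg_present))
```

For binary variables the Pearson coefficient is equivalent to the phi coefficient, which is why a simple 1/0 encoding is sufficient here.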
Genome analysis offers a neutral approach for prospectively exploring the presence of PG in uncultured, sequenced organisms, with high predictive value.