Insights into the evolutionary history of tubercle bacilli as disclosed by genetic rearrangements within a PE_PGRS duplicated gene pair
1 Unit of Typing and Genetics of Mycobacteria, Institut Pasteur de Tunis, 13 Place Pasteur, 1002 Tunis-Belvédère, Tunis, Tunisia
2 DST/NRF Centre of Excellence in Biomedical Tuberculosis Research, MRC Centre for Molecular and Cellular Biology, Department of Biomedical Sciences, Faculty of Health Sciences, Stellenbosch University, South Africa
3 Laboratoire de Référence des Mycobactéries, Institut Pasteur, Paris, France
4 Unité de la Tuberculose et des Mycobactéries, Institut Pasteur de Guadeloupe, Guadeloupe
5 Laboratory of Molecular Biology and Diagnosis of Infectious Diseases, Oswaldo Cruz Institute, Brazil
6 Laboratoire de Biologie Clinique, HIA Percy, Clamart, France
7 Instituto de Biotecnologia, INTA, Castelar, Buenos Aires, Argentina
8 Clinical Microbiology Service and the Department of Pathology, Columbia University Medical Center, New York-Presbyterian Hospital, New York, NY, USA
9 Public Health Research Institute (PHRI), Newark, New Jersey, USA
10 Division of International Medicine and Infectious Diseases, Weill Medical College of Cornell University, New York, NY, USA
11 Unit of Mycobacterial genetics, Institut Pasteur, Paris, France
BMC Evolutionary Biology 2006, 6:107 doi:10.1186/1471-2148-6-107
Published: 12 December 2006
The highly homologous PE_PGRS (Proline-glutamic acid_polymorphic GC-rich repetitive sequence) genes are members of the PE multigene family which is found only in mycobacteria. PE genes are particularly abundant within the genomes of pathogenic mycobacteria where they seem to have expanded as a result of gene duplication events. PE_PGRS genes are characterized by their high GC content and extensive repetitive sequences, making them prone to recombination events and genetic variability.
Comparative sequence analysis of Mycobacterium tuberculosis genes PE_PGRS17 (Rv0978c) and PE_PGRS18 (Rv0980c) revealed a striking genetic variation associated with this typical tandem duplicate. In comparison to the M. tuberculosis reference strain H37Rv, the variation (named the 12/40 polymorphism) consists of an in-frame 12-bp insertion invariably accompanied by a set of 40 single nucleotide polymorphisms (SNPs), and it occurs either in PE_PGRS17 alone or in both genes. Sequence analysis of the paralogous genes in a representative set of globally distributed tubercle bacilli isolates yielded data that support previously proposed evolutionary scenarios for the M. tuberculosis complex (MTBC) and confirm the very ancient origin of "M. canettii" and other smooth tubercle bacilli. Strikingly, the identified polymorphism appears to coincide with the emergence of the successful post-bottleneck clone from which the MTBC expanded. Furthermore, the findings provide direct and clear evidence for the natural occurrence of gene conversion in mycobacteria, which appears to be restricted to modern M. tuberculosis strains.
This study offers a new perspective for exploring the molecular events that accompanied the evolution, clonal expansion, and recent diversification of tubercle bacilli.
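As an illustration only, the 12/40 polymorphism described above (a 12-bp insertion together with 40 SNPs relative to H37Rv) could be scored from a pairwise alignment roughly as sketched below. This is not the authors' procedure. The function names, the gapped-string input format, and the decision to match on counts alone are assumptions made for the sketch; the abstract does not list the SNP positions, so a real analysis would need to compare specific sites rather than totals.

```python
from typing import Tuple


def compare_to_reference(ref_aln: str, query_aln: str) -> Tuple[int, int]:
    """Count inserted bases and SNPs in a query relative to a reference.

    Both arguments are rows of one pairwise alignment, with '-' marking gaps
    (hypothetical input format chosen for this sketch).
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    inserted = 0
    snps = 0
    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r == "-" and q != "-":
            inserted += 1          # base present in query, absent in reference
        elif r != "-" and q != "-" and r != q:
            snps += 1              # substitution at an aligned position
    return inserted, snps


def has_12_40(ref_aln: str, query_aln: str,
              insertion_bp: int = 12, snp_count: int = 40) -> bool:
    """True if the query shows exactly the expected insertion and SNP counts.

    Matching on counts alone is a simplification (assumption of this sketch).
    """
    inserted, snps = compare_to_reference(ref_aln, query_aln)
    return inserted == insertion_bp and snps == snp_count


def classify_strain(in_pgrs17: bool, in_pgrs18: bool) -> str:
    """Summarise the per-gene calls for one isolate."""
    if in_pgrs17 and in_pgrs18:
        return "12/40 in both genes"
    if in_pgrs17:
        return "12/40 in PE_PGRS17 only"
    if in_pgrs18:
        return "12/40 in PE_PGRS18 only (pattern not described in the abstract)"
    return "H37Rv-like (no 12/40)"
```

In use, each isolate would be aligned to the H37Rv copy of PE_PGRS17 and of PE_PGRS18 separately, `has_12_40` would be called once per gene, and the two results would be passed to `classify_strain`. The `insertion_bp` and `snp_count` parameters are exposed only so that the logic can be checked on short toy sequences.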