Compared genomics of the strand switch region of Leishmania chromosome 1 reveal a novel genus-specific gene and conserved structural features and sequence motifs
1 CNRS/Université Montpellier I FRE 3013 "Biologie Moléculaire, Biologie Cellulaire et Biodiversité des Protozoaires Parasites", Laboratoire de Parasitologie-Mycologie, UFR Médecine, 163 Rue Auguste Broussonet, 34090 Montpellier, France
2 Service de Génétique Médicale, Centre Hospitalier Universitaire, Montpellier, France
BMC Genomics 2007, 8:57 doi:10.1186/1471-2164-8-57
Published: 24 February 2007
Trypanosomatids exhibit a unique gene organization in which genes are grouped into large directional gene clusters (DGCs) oriented in opposite directions. The transcription "strand switch region" (SSR) separating the two large DGCs that constitute chromosome 1 of Leishmania major has been the subject of several studies and much speculation: it has been suspected of being the single replication origin of the chromosome, the transcription initiation site for both DGCs, or even a centromere. Here, we used an inter-species comparative genomics approach on this locus to identify conserved features or motifs indicative of a putative function.
We isolated this SSR from 15 widely divergent species of Leishmania and Sauroleishmania and compared its structure and nucleotide sequence. With regard to intrachromosomal position, size and AT content, the general structure of the SSR appears extremely stable among species, which further demonstrates the remarkable structural stability of these genomes at the evolutionary level. Sequence alignments showed several interesting features. Overall, only 30% of nucleotide positions were conserved in the SSR among the 15 species, versus 74% and 62% in the 5' parts of the adjacent XPP and PAXP genes, respectively. However, nucleotide divergences were not distributed homogeneously along the sequence: a central fragment of approximately 440 bp exhibited 54% identity among the 15 species. This fragment actually represents a new Leishmania-specific CDS of unknown function that had been overlooked since the annotation of this chromosome. The encoded protein comprises two trans-membrane domains and is classified in the "structural protein" GO category. We cloned this novel gene and expressed it as a recombinant protein fused to green fluorescent protein, which showed that it localises to the endoplasmic reticulum. Taken together, these data shorten the actual SSR to an 887-bp segment, compared with the original 1.6 kb. In the rest of the SSR, the percentage of identity was much lower, around 22%. Interestingly, the 72-bp fragment where the putatively single transcription initiation site of chromosome 1 was identified is located in a low-conservation portion of the SSR and is itself highly polymorphic among species. Nevertheless, it is highly C-rich and presents a unique poly(C) tract in the same position in all species.
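As an illustration of how a figure such as "30% of nucleotide positions conserved" can be derived from a multiple alignment, the following minimal Python sketch counts the alignment columns in which every sequence carries the same nucleotide. It is not the procedure used in this study, whose details are not given here. The function name `percent_conserved`, the toy sequences and the choice to count gapped columns as non-conserved are all assumptions made for the example.

```python
def percent_conserved(aligned_seqs):
    """Return the percentage of alignment columns identical in all sequences.

    Assumption: the input sequences are already aligned (equal length,
    '-' for gaps), and any column containing a gap is counted as
    non-conserved. The study's own gap handling is not specified.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    conserved = 0
    # zip(*...) walks the alignment column by column
    for column in zip(*aligned_seqs):
        first = column[0].upper()
        if first != "-" and all(base.upper() == first for base in column):
            conserved += 1
    return 100.0 * conserved / length


# Hypothetical toy alignment of three sequences (not real Leishmania data)
toy_alignment = [
    "ACGTCCCCAT",
    "ACGACCCCTT",
    "ATG-CCCCAT",
]
print(percent_conserved(toy_alignment))
```

Applied separately to sub-regions of an alignment (for example the central fragment versus the flanking sequence), the same count would show whether divergence is distributed unevenly along the region.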
This inter-specific comparative study, the first of its kind, (a) revealed a novel genus-specific gene and (b) identified a conserved poly(C) tract in the otherwise highly polymorphic region containing the putative transcription initiation site. The latter finding suggests a possible involvement of poly(C)-binding proteins, which are known elsewhere to be involved in transcriptional control.
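A poly(C) tract of the kind described above can be located in a sequence with a simple pattern search. The sketch below is only illustrative. The function name `longest_poly_c`, the minimum tract length of 3 and the example sequence are assumptions, not parameters or data from this study.

```python
import re


def longest_poly_c(seq, min_len=3):
    """Return (start, end) of the longest run of C in seq, or None.

    Coordinates are 0-based and end-exclusive. min_len is a hypothetical
    threshold chosen for this example only.
    """
    best = None
    pattern = re.compile(r"C{%d,}" % min_len)
    for match in pattern.finditer(seq.upper()):
        # Keep the first longest run encountered
        if best is None or (match.end() - match.start()) > (best[1] - best[0]):
            best = (match.start(), match.end())
    return best


# Hypothetical sequence (not real Leishmania data)
print(longest_poly_c("ATCCCGTCCCCCAG"))
```

Running the function on the same fragment from each species and comparing the returned coordinates within an alignment would indicate whether the tract occupies the same position across species.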