Pseudogene accumulation in the evolutionary histories of Salmonella enterica serovars Paratyphi A and Typhi
1 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
2 Laboratory of Gastrointestinal Pathogens, Centre for Infections, Health Protection Agency, 61 Colindale Avenue, London, NW9 5HT, UK
3 Aga Khan University, Karachi, Pakistan
BMC Genomics 2009, 10:36 doi:10.1186/1471-2164-10-36
Published: 21 January 2009
Of the more than 2000 serovars of Salmonella enterica subspecies I, most cause self-limiting gastrointestinal disease in a wide range of mammalian hosts. However, S. enterica serovars Typhi and Paratyphi A are restricted to the human host and cause the similar systemic diseases typhoid and paratyphoid fever. Genome sequence similarity between Paratyphi A and Typhi has been attributed to convergent evolution via relatively recent recombination of a quarter of their genomes. The accumulation of pseudogenes is a key feature of these and other host-adapted pathogens, and overlapping pseudogene complements are evident in Paratyphi A and Typhi.
We report the 4.5 Mbp genome of a clinical isolate of Paratyphi A, strain AKU_12601, completely sequenced using capillary techniques and subsequently checked using Illumina/Solexa resequencing. Comparison with the published genome of Paratyphi A ATCC9150 revealed that the two are collinear and highly similar, with 188 single nucleotide polymorphisms and 39 insertions/deletions. A comparative analysis of the pseudogene complements of these and two finished Typhi genomes (CT18, Ty2) identified several pseudogenes that had been overlooked in prior genome annotations of one or both serovars, and identified 66 pseudogenes shared between serovars. By determining whether each shared and serovar-specific pseudogene had been recombined between Paratyphi A and Typhi, we found evidence that most pseudogenes accumulated after the recombination between serovars. We also divided pseudogenes into relative-time groups: ancestral pseudogenes inherited from a common ancestor; pseudogenes recombined between serovars, which likely arose between initial divergence and later recombination; serovar-specific pseudogenes arising after recombination but prior to the last evolutionary bottlenecks in each population; and more recent strain-specific pseudogenes.
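The relative-time grouping described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Pseudogene` record, the `classify` function, the `in_recombined_region` flag, and the rule that a shared pseudogene outside a recombined region counts as ancestral are all simplifying assumptions made for illustration. Only the four strain names and the four group labels come from the text.

```python
from dataclasses import dataclass

# Strain names are taken from the text; grouping them into two sets is
# simply a convenient representation for this sketch.
PARATYPHI_A = frozenset({"AKU_12601", "ATCC9150"})
TYPHI = frozenset({"CT18", "Ty2"})


@dataclass(frozen=True)
class Pseudogene:
    """Hypothetical record: one inactivated gene and where it is seen."""
    name: str
    strains: frozenset          # strains in which the gene is a pseudogene
    in_recombined_region: bool  # assumed to be known from a prior analysis


def classify(pg: Pseudogene) -> str:
    """Assign a pseudogene to one of the four relative-time groups.

    Assumed rules (a simplification, not the published method):
      - shared by both serovars, outside recombined regions -> ancestral
      - shared by both serovars, inside a recombined region -> recombined
      - in every strain of exactly one serovar -> serovar-specific
      - in only some strains of one serovar -> strain-specific
    """
    in_pa = pg.strains & PARATYPHI_A
    in_ty = pg.strains & TYPHI
    if not in_pa and not in_ty:
        raise ValueError(f"{pg.name}: no known strain carries this pseudogene")
    if in_pa and in_ty:
        return "recombined" if pg.in_recombined_region else "ancestral"
    serovar = PARATYPHI_A if in_pa else TYPHI
    carried_by = in_pa or in_ty
    return "serovar-specific" if carried_by == serovar else "strain-specific"


if __name__ == "__main__":
    # Gene names below are placeholders, not genes reported in the paper.
    examples = [
        Pseudogene("geneA", PARATYPHI_A | TYPHI, False),
        Pseudogene("geneB", PARATYPHI_A | TYPHI, True),
        Pseudogene("geneC", TYPHI, True),
        Pseudogene("geneD", frozenset({"AKU_12601"}), False),
    ]
    for pg in examples:
        print(pg.name, classify(pg))
```

A real analysis would also need to confirm that a shared pseudogene carries the same inactivating mutation in both serovars, since independent mutations in the same gene would point to convergent loss rather than shared inheritance; the sketch ignores this.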
Recombination and pseudogene formation have been important mechanisms of genetic convergence between Paratyphi A and Typhi, with most pseudogenes arising independently after extensive recombination between the serovars. The recombination events, along with divergence of and within each serovar, provide a relative time scale for pseudogene-forming mutations, affording rare insights into the progression of functional gene loss associated with host adaptation in Salmonella.