Open Access Highly Accessed Research article

A genome-wide study of recombination rate variation in Bartonella henselae

Lionel Guy, Björn Nystedt, Yu Sun, Kristina Näslund, Eva C Berglund and Siv GE Andersson*

Author Affiliations

Department of Molecular Evolution, Biomedical Centre, Uppsala University, SE-751 24, Uppsala, Sweden

BMC Evolutionary Biology 2012, 12:65  doi:10.1186/1471-2148-12-65

Published: 11 May 2012

Additional files

Additional file 1:

Figure showing the 200-kb amplification in B. henselae IC11. Details of the read coverage of B. henselae IC11 on about 20 kb of contig 87 (A) and 75 kb on contig 97 (B). Dots represent coverage every 100 nt. Red lines represent the average coverage on the region of amplification (upper line) and on the rest of the contig (lower line).

Format: PDF Size: 2.3MB Download file

Additional file 2:

Table of the genes present in IC11 and UGA10, absent from Houston-1.

Format: PDF Size: 54KB Download file

Additional file 3:

Whole-genome phylogeny of B. henselae strains. The phylogeny was obtained using maximum-likelihood methods with the GTR + gamma model on the concatenation of all synteny blocks of the three B. henselae strains (Houston-1, IC11 and UGA10) and B. quintana Toulouse. The bar represents the number of substitutions per site.

Format: PDF Size: 2KB Download file

Additional file 4:

Single-nucleotide polymorphisms (SNPs) along the genomes of Bartonella henselae Houston-1, IC11 and UGA10. Genes found in Houston-1 are depicted by blue arrows, with their names above. SNPs are depicted below the genes. SNPs uniquely found in Houston-1, IC11 and UGA10 are shown in green, red and blue, respectively. One row represents 100 kb.

Format: PDF Size: 182KB Download file

Additional file 5:

Genes contained in the single nucleotide polymorphism-rich regions. Numbers refer to regions shown in Figure 2. A region is defined as SNP-rich if at least 5 consecutive 1-kb windows have a SNP frequency equal to or above the upper whisker of the frequency distribution.

Format: XLS Size: 36KB Download file
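
The SNP-rich criterion described for Additional file 5 can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes non-overlapping 1-kb windows, 0-based SNP positions, per-window SNP counts as the "frequency", and a Tukey-style upper whisker (the largest value not exceeding percentile 75 + 1.5 IQ). The function names `upper_whisker` and `snp_rich_regions` are hypothetical.

```python
from statistics import quantiles


def upper_whisker(values):
    """Largest value not exceeding Q3 + 1.5 * IQ (Tukey-style whisker; an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    return max(v for v in values if v <= fence)


def snp_rich_regions(snp_positions, genome_length, window=1000, min_windows=5):
    """Return (start, end) intervals made of at least `min_windows` consecutive
    windows whose SNP count is equal to or above the upper whisker."""
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in snp_positions:  # positions are assumed to be 0-based
        counts[pos // window] += 1

    threshold = upper_whisker(counts)
    regions = []
    run_start = None
    # A trailing sentinel of -1 closes a run that reaches the end of the genome.
    for i, count in enumerate(counts + [-1]):
        if count >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_windows:
                regions.append((run_start * window, min(i * window, genome_length)))
            run_start = None
    return regions
```

Note that when most windows contain no SNP at all, the whisker computed this way is 0 and every window qualifies, so a real analysis would need to check that the threshold is meaningful for the data at hand.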

Additional file 6:

Table of the genomic features of Bartonella henselae IC11 and UGA10.

Format: PDF Size: 55KB Download file

Additional file 7:

Schematic figure describing the triangle plot. Ks values for each pair in the triplet of orthologs are normalized so that their sum is one. Points inside the inner triangle (defined by the three red lines) correspond to all genes for which the highest of the Ks values is less than or equal to the sum of the other two (Ks proportion ≤ 0.5). Points outside the inner triangle, for which one Ks is greater than the sum of the other two, are likely to result from stochastic variation. Points within the yellow area are genes for which Ks (AB) is the smallest of the three Ks. Genes on the edges of the inner triangle (red lines) have one Ks equal to the sum of the other two. In other words, if one tries to reconstruct the substitutions going from A to C, one is likely to obtain B as a step. Thus, if Ks (AB) = x and Ks (BC) = y, then Ks (AC) = x + y, which equals 0.5 as a normalized value. Genes on the blue lines have two equal Ks values, the third one being smaller. These correspond to genes having a standard phylogenetic relationship, with A and B being more closely related to each other than to C. Then, Ks (AC) = Ks (BC) > Ks (AB). Homologs on the vertices of the inner triangle (green) have two identical sequences, and thus Ks (AB) = 0 and Ks (AC) = Ks (BC), each equaling 0.5 when normalized.

Format: PDF Size: 8KB Download file
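
The normalization and the regions of the triangle plot described for Additional file 7 can be illustrated with a short sketch. It only restates the rules given in the caption (proportions summing to one, the 0.5 boundary of the inner triangle); the function names, the category labels and the numerical tolerance `tol` are assumptions introduced for illustration.

```python
import math


def normalize_ks(ks_ab, ks_ac, ks_bc):
    """Scale the three pairwise Ks values so that they sum to one."""
    total = ks_ab + ks_ac + ks_bc
    if total <= 0:
        raise ValueError("at least one Ks value must be positive")
    return ks_ab / total, ks_ac / total, ks_bc / total


def classify_triplet(ks_ab, ks_ac, ks_bc, tol=1e-9):
    """Place a gene triplet relative to the inner triangle of the plot."""
    props = normalize_ks(ks_ab, ks_ac, ks_bc)
    largest = max(props)
    if largest > 0.5 + tol:
        # One Ks exceeds the sum of the other two.
        return "outside"
    if math.isclose(largest, 0.5, abs_tol=tol):
        # On the boundary: a vertex if two sequences are identical (one Ks is 0),
        # otherwise an edge (one Ks equals the sum of the other two).
        return "vertex" if min(props) <= tol else "edge"
    return "inside"
```

For example, a triplet with Ks (AB) = 0 and Ks (AC) = Ks (BC) falls on a vertex, while a triplet with three equal Ks values lies at the centre of the inner triangle.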

Additional file 8:

Distributions of the spread on the triangle plot and of the Ks and nucleotide identity per region. Figure S1: Distance distribution or spread of the normalized Ks values on the triangle plot. The blue line represents the median, the red line the mean. Figure S2: Distribution of Ks and nucleotide identity in different subsets of the UGA10 genome. All x-axes are on a logarithmic scale. Left panels: nucleotide identity, in percent. Identical gene pairs are not displayed. Right panels: Ks. Gene pairs for which Ks = 0 are not displayed. Upper panels: box-and-whiskers plots. The black dot is the mean of the distribution; the box spans quartiles 2 and 3 (percentiles 25 to 75), defining the interquartile distance (IQ). Outliers are all marked with an open circle, and are considered as such if they are smaller than percentile 25 − 1.5 IQ or greater than percentile 75 + 1.5 IQ. Whiskers extend between the smallest and the largest non-outlier values. Lower panels: distributions for each group of genes. Individual values are scattered on the bottom of the plot.

Format: PDF Size: 152KB Download file
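
The box-and-whiskers conventions described for Figure S2 can be written out as a small sketch. The quartile interpolation method is not stated in the caption, so the "inclusive" method of Python's `statistics.quantiles` is an assumption here, as are the function name `box_summary` and the keys of the returned dictionary.

```python
from statistics import mean, quantiles


def box_summary(values):
    """Summarize a distribution following the conventions of the caption:
    box from percentile 25 to 75, outliers beyond 1.5 IQ from the box,
    whiskers at the smallest and largest non-outlier values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iq = q3 - q1
    low_fence = q1 - 1.5 * iq
    high_fence = q3 + 1.5 * iq
    non_outliers = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "mean": mean(values),
        "q1": q1,
        "q3": q3,
        "iq": iq,
        "whisker_low": min(non_outliers),
        "whisker_high": max(non_outliers),
        "outliers": outliers,
    }
```

Since the figure uses logarithmic x-axes and omits pairs with Ks = 0, zero values would have to be filtered out before plotting; the summary above does not do this.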

Additional file 9:

Genes likely to have undergone recombination. Table S1: Genes whose normalized Ks (Houston-1–IC11) > 0.33, and whose maximum Ks > 0.05. The highest Ks value(s) of each row are in bold. Genes encoding components of the T4SSs have their locus in italics. The locus_tag of the genes included in the run-off replication region is followed by an asterisk. Table S2: Genes containing one or more possible recombinations. GIs and GOs: number of global inner and outer fragments, respectively, as defined by geneconv. Genes encoding components of the T4SSs have their locus in italics. An asterisk follows the loci of the genes included in the run-off replication region.

Format: PDF Size: 82KB Download file
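
The two thresholds used to select genes for Table S1 can be expressed as a simple filter. This is a sketch under stated assumptions: the three strains of each triplet are taken to be Houston-1, IC11 and UGA10, the input layout (a dictionary keyed by strain pairs) is hypothetical, and the function name `passes_table_s1_filter` is introduced only for illustration.

```python
def passes_table_s1_filter(ks_by_pair,
                           focal_pair=("Houston-1", "IC11"),
                           min_proportion=0.33,
                           min_max_ks=0.05):
    """Return True if the normalized Ks of the focal pair exceeds `min_proportion`
    and the largest of the three raw Ks values exceeds `min_max_ks`."""
    total = sum(ks_by_pair.values())
    if total <= 0:
        # All three sequences are identical at synonymous sites.
        return False
    focal_proportion = ks_by_pair[focal_pair] / total
    return focal_proportion > min_proportion and max(ks_by_pair.values()) > min_max_ks


# Hypothetical Ks values for one gene, for illustration only.
example = {
    ("Houston-1", "IC11"): 0.08,
    ("Houston-1", "UGA10"): 0.07,
    ("IC11", "UGA10"): 0.01,
}
selected = passes_table_s1_filter(example)
```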

Additional file 10:

Structure of BH14680 in (A) Bartonella and (B) other Alpha-proteobacteria. Reference sequences are (A) B. henselae Houston-1 and (B) Brucella suis. Additional strains and species included in the analysis are detailed in (A) Table S3 and (B) Table S5 in Additional file 11. In each panel, the top plot shows the subcellular location of the reference protein, as predicted by TMHMM. The segment is colored in blue, orange or green if it is most likely to be located outside the cell, in the membrane or inside the cell, respectively. In the bottom graph, the approximate mean of the posterior distribution for ω = Ka/Ks is plotted for each site (an estimate of the ω value, given the model, as calculated with the Bayes empirical Bayes method in model 2a of PAML). Standard deviation is indicated with a grey line. The color corresponds to the most likely ω class attributed to each site (blue, ω < 1; green, ω = 1; red, ω > 1). The x-axis corresponds to the position along the reference protein.

Format: PDF Size: 56KB Download file

Additional file 11:

Strains and primers used in sequencing gene BH14680. Table S3: Bartonella strains used in the analysis of gene BH14680. Table S4: Primers used in sequencing gene BH14680. Table S5: Alpha-proteobacterial species used in the analysis of gene BH14680.

Format: PDF Size: 145KB Download file
