Assembling networks of microbial genomes using linear programming
1 Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, Nova Scotia, B3H 1W5, Canada
2 Department of Physics and Atmospheric Science, Dalhousie University, Halifax, Nova Scotia, B3H 3J5, Canada
3 Institute for Quantum Computing, University of Waterloo, 200 University Ave. West, Waterloo, Ontario, N2L 3G1, Canada
BMC Evolutionary Biology 2010, 10:360 doi:10.1186/1471-2148-10-360
Published: 20 November 2010
Microbial genomes exhibit complex sets of genetic affinities due to lateral genetic transfer. Assessing the relative contributions of parent-to-offspring inheritance and gene sharing is a vital step in understanding the evolutionary origins and modern-day function of an organism, but recovering and representing these relationships is a challenging problem.
We have developed a new approach that uses linear programming to infer relationships between genomes, by treating tables of genetic affinities (here, represented by transformed BLAST e-values) as the input to an optimization problem. Validation trials on simulated data demonstrate the effectiveness of the approach in recovering and representing vertical and lateral relationships among genomes. Application of the technique to a set comprising Aquifex aeolicus and 75 other thermophiles showed an important role for large genomes as 'hubs' in the gene sharing network, and suggested that genes are preferentially shared between organisms with similar optimal growth temperatures. We also identified distinct and common genetic contributors to each sequenced representative of genus Pseudomonas.
The linear programming approach we have developed can serve as an effective inference tool in its own right, and can be an efficient first step in a more intensive phylogenomic analysis.
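To give a concrete sense of how a table of genetic affinities can be posed as a linear program, the following is a minimal sketch under stated assumptions and is not the formulation used in this work. It assumes that each column of a hypothetical matrix A holds one candidate contributor genome's affinity scores across a set of genes, that b holds the target genome's scores for the same genes, and that the target is modelled as a non-negative mixture of contributors whose weights sum to one. The function name mixture_weights, the toy numbers, and the choice of an absolute-deviation objective are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def mixture_weights(A, b):
    """Fit b as a non-negative mixture of the columns of A (weights sum to 1),
    minimising the sum of absolute deviations. Illustrative assumption only."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = A.shape  # m genes, n candidate contributor genomes

    # Decision variables: x = [w_1..w_n, t_1..t_m], where t_i bounds |(A w - b)_i|.
    c = np.concatenate([np.zeros(n), np.ones(m)])  # minimise sum of t

    # Encode  A w - t <= b  and  -A w - t <= -b, i.e. |A w - b| <= t.
    I = np.eye(m)
    A_ub = np.block([[A, -I], [-A, -I]])
    b_ub = np.concatenate([b, -b])

    # Mixture weights sum to one.
    A_eq = np.concatenate([np.ones(n), np.zeros(m)]).reshape(1, -1)
    b_eq = [1.0]

    # All variables (weights and slack bounds) are non-negative.
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.fun


# Toy affinity table (made-up numbers): 5 genes x 3 candidate contributors.
A = np.array([
    [1.0, 0.0, 0.2],
    [0.8, 0.1, 0.0],
    [0.1, 0.9, 0.3],
    [0.0, 1.0, 0.1],
    [0.2, 0.3, 1.0],
])
# Synthetic target built as 70% of contributor 0 and 30% of contributor 2.
b = 0.7 * A[:, 0] + 0.3 * A[:, 2]

w, resid = mixture_weights(A, b)
print("weights:", np.round(w, 3), "residual:", round(resid, 6))
```

In this toy case the target is an exact mixture, so the fit should assign weight only to the two contributors used to build it. Real affinity tables would be noisy, and the constraints and objective would need to reflect the biological model actually being tested.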