EGN: a wizard for construction of gene and genome similarity networks
1 Département de sciences biologiques, Institut de recherche en biologie végétale (IRBV), Université de Montréal, Montréal, QC H1X 2B2, Canada
2 Molecular Evolution and Bioinformatics Unit, Department of Biology, National University of Ireland Maynooth, Co. Kildare, Ireland
3 Unité Mixte de Recherche Centre National de la Recherche Scientifique 7138, Systématique, Adaptation, Evolution, Université Pierre et Marie Curie, 75005, Paris, France
BMC Evolutionary Biology 2013, 13:146. doi:10.1186/1471-2148-13-146. Published: 11 July 2013
Similarity networks are increasingly used for evolutionary analyses of molecular datasets. These networks are particularly useful for analyzing gene sharing and lateral gene transfer, and for detecting distant homologs. Currently, such analyses require some computer programming skills, because user-friendly, freely distributed software is scarce. Consequently, although appealing, the construction and analysis of these networks remain less familiar to biologists than phylogenetic approaches.
To ease the use of similarity networks in the community of evolutionary biologists, we introduce a software program, EGN, that runs under Linux or Mac OS X. EGN automates the reconstruction of gene and genome networks from nucleotide and protein sequences. EGN also implements statistics describing genetic diversity in these samples, for various user-defined similarity thresholds. To study the complexity of the evolutionary processes affecting microbial evolution, we applied EGN to a dataset of 571,044 protein sequences from the three domains of life and from mobile elements. We observed that, in Borrelia, plasmids play a different role than in most other eubacteria. Rather than being genetic couriers involved in lateral gene transfer, Borrelia’s plasmids and their genes act as private genetic goods that contribute to the creation of genetic diversity within their parasitic hosts.
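To make the idea of a thresholded similarity network concrete, the following is a minimal Python sketch and not EGN's actual implementation. It assumes pairwise hits are already available as (query, subject, percent identity, E-value) tuples, for instance parsed from a tabular similarity search. The function names, the default thresholds, the toy data, and the definition of a genome network edge (two genomes linked once per gene family they share) are all illustrative assumptions.

```python
from collections import defaultdict


def build_gene_network(hits, min_identity=30.0, max_evalue=1e-5):
    """Return adjacency sets for an undirected gene similarity network.

    An edge is kept only if the hit passes both user-defined thresholds
    (hypothetical defaults, chosen for illustration only).
    """
    adjacency = defaultdict(set)
    for query, subject, identity, evalue in hits:
        if query == subject:
            continue  # ignore self-hits
        if identity >= min_identity and evalue <= max_evalue:
            adjacency[query].add(subject)
            adjacency[subject].add(query)
    return adjacency


def connected_components(adjacency):
    """Group genes into connected components (putative gene families)."""
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def build_genome_network(components, gene_to_genome):
    """Weight each genome pair by the number of components they share.

    This is one plausible definition of a genome network, assumed here.
    """
    weights = defaultdict(int)
    for component in components:
        genomes = sorted({gene_to_genome[gene] for gene in component})
        for i, first in enumerate(genomes):
            for second in genomes[i + 1:]:
                weights[(first, second)] += 1
    return dict(weights)


# Toy data: gene identifiers and genome labels are invented.
HITS = [
    ("a1", "b1", 85.0, 1e-50),
    ("b1", "c1", 40.0, 1e-10),
    ("a2", "b2", 25.0, 1e-3),   # fails both default thresholds
    ("a3", "a3", 100.0, 0.0),   # self-hit, ignored
]
GENE_TO_GENOME = {"a1": "A", "b1": "B", "c1": "C",
                  "a2": "A", "b2": "B", "a3": "A"}

loose = connected_components(build_gene_network(HITS))
strict = connected_components(build_gene_network(HITS, min_identity=50.0))
loose_genomes = build_genome_network(loose, GENE_TO_GENOME)
strict_genomes = build_genome_network(strict, GENE_TO_GENOME)
```

Raising the identity threshold from 30% to 50% removes the weaker b1 to c1 link in this toy example, so the single three-gene component shrinks to two genes and genome C drops out of the genome network. This illustrates why diversity statistics are worth reporting at several thresholds.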
EGN can be used for constructing, analyzing, and mining similarity networks from molecular datasets in evolutionary studies. The program can help increase our knowledge of the processes through which genes from distinct sources and/or from multiple genomes co-evolve in lineages of cellular organisms.