Open Access Highly Accessed Software

Software for selecting the most informative sets of genomic loci for multi-target microbial typing

Matthew VN O’Sullivan*, Vitali Sintchenko and Gwendolyn L Gilbert

Author Affiliations

Centre for Infectious Diseases and Microbiology and Sydney Institute for Emerging Infections and Biosecurity, University of Sydney, Westmead Hospital, Hawkesbury Road, Westmead, NSW 2145, Australia

BMC Bioinformatics 2013, 14:148 doi:10.1186/1471-2105-14-148

Published: 1 May 2013

Abstract

Background

High-throughput sequencing can identify numerous potential genomic targets for microbial strain typing, but identifying the most informative combinations requires computational screening tools. This paper describes novel software, Automated Selection of Typing Target Subsets (AuSeTTS), that allows intelligent selection of optimal targets for pathogen strain typing. The objective of the software is to maximise both discriminatory power, measured by Simpson’s index of diversity (D), and concordance with existing typing methods, measured by the adjusted Wallace coefficient (AW). The program interrogates molecular typing results for panels of isolates, based on large target sets, and iteratively examines each target, one by one, to determine the most informative subset.
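As a rough illustration of the idea described above, the following Python sketch computes Simpson's index of diversity in its usual pairwise form, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), and then adds targets one at a time, each time choosing the target that raises D the most. The greedy forward-selection strategy, the function names and the toy data are all assumptions made for illustration; the abstract does not specify the exact procedure AuSeTTS uses.

```python
from collections import Counter


def simpsons_d(types):
    """Simpson's index of diversity for a list of type labels.

    D is the probability that two isolates drawn at random
    (without replacement) have different types.
    """
    n = len(types)
    if n < 2:
        return 0.0
    counts = Counter(types)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def profiles(isolates, targets):
    """Combined type of each isolate over the chosen targets."""
    return [tuple(iso[t] for t in targets) for iso in isolates]


def greedy_select(isolates, n_targets):
    """Add targets one by one, each time taking the one that maximises D.

    ASSUMPTION: illustrative greedy forward selection only; not
    confirmed to be the AuSeTTS algorithm.
    Returns a list of (target index, D after adding that target).
    """
    chosen, history = [], []
    remaining = list(range(n_targets))
    while remaining:
        best = max(
            remaining,
            key=lambda t: simpsons_d(profiles(isolates, chosen + [t])),
        )
        chosen.append(best)
        remaining.remove(best)
        history.append((best, simpsons_d(profiles(isolates, chosen))))
    return history


# Hypothetical binary typing results: rows are isolates, columns are targets.
isolates = [
    (1, 0, 1),
    (1, 0, 0),
    (0, 1, 1),
    (0, 1, 0),
]

for target, d in greedy_select(isolates, 3):
    print(f"added target {target}: D = {d:.3f}")
```

In this toy panel, targets 0 and 2 together already separate all four isolates, so adding target 1 gives no further gain in D. A plot of D against the number of targets selected would show the same kind of plateau that motivates choosing a reduced subset.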

Results

AuSeTTS was evaluated using three target sets: 51 binary targets (13 toxin genes, 16 phage-related loci and 22 SCCmec elements) used for multilocus typing of 153 methicillin-resistant Staphylococcus aureus (MRSA) isolates; 17 MLVA loci in 502 Streptococcus pneumoniae isolates from the MLVA database (http://www.mlva.eu); and 12 MLST loci for 98 Cryptococcus spp. isolates.

For MRSA, the maximum D of 0.984 was achieved with a subset of 20 targets, and a D of 0.954 was achieved with 7 targets. Twelve targets predicted MLST with a maximum AW of 0.9994. All 17 S. pneumoniae MLVA targets were required to achieve the maximum D of 0.997, but 4 targets reached a D of 0.990. Twelve targets predicted pneumococcal serotype with a maximum AW of 0.899, and 9 predicted MLST with a maximum AW of 0.963. For Cryptococcus spp., eight of the 12 MLST loci were sufficient to achieve the maximum D of 0.963.
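For readers unfamiliar with the AW values reported above, the sketch below shows one common way to compute the adjusted Wallace coefficient between two typing methods A and B: the Wallace coefficient W(A→B) is the fraction of isolate pairs sharing an A type that also share a B type, and AW rescales it against the value expected by chance, which equals 1 minus Simpson's D of method B. The function names and example labels are hypothetical, and this is not taken from the AuSeTTS source code.

```python
from collections import Counter


def shared_pairs(labels):
    """Number of isolate pairs that carry the same label."""
    return sum(c * (c - 1) // 2 for c in Counter(labels).values())


def adjusted_wallace(a, b):
    """Adjusted Wallace coefficient AW(A->B) for two equal-length label lists.

    AW = (W - Wi) / (1 - Wi), where W is the Wallace coefficient and
    Wi = 1 - Simpson's D of method B is its expected value by chance.
    """
    if len(a) != len(b):
        raise ValueError("both methods must type the same isolates")
    n = len(a)
    total_pairs = n * (n - 1) // 2
    same_a = shared_pairs(a)
    if same_a == 0:
        raise ValueError("method A puts no two isolates in the same type")
    # Pairs that match under both methods: same (A type, B type) combination.
    same_both = shared_pairs(list(zip(a, b)))
    wallace = same_both / same_a
    expected = shared_pairs(b) / total_pairs
    if expected == 1.0:
        raise ValueError("method B assigns every isolate the same type")
    return (wallace - expected) / (1.0 - expected)


# Hypothetical labels: a reduced target subset (A) versus a reference method (B).
subset_types = ["x", "x", "y", "y"]
reference_types = ["p", "p", "q", "q"]
print(adjusted_wallace(subset_types, reference_types))
```

An AW close to 1 means that isolates grouped together by the subset are almost always grouped together by the reference method, beyond what chance agreement would produce.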

Conclusions

Computerised analysis with AuSeTTS allows rapid selection of the most discriminatory targets for incorporation into typing schemes. Program output is presented in both tabular and graphical formats, and the software is available for free download from http://www.cidmpublichealth.org/pages/ausetts.html.

Keywords:
Comparative genomics; Multilocus sequence typing; MLVA; Binary typing; Software; Microbial typing; MRSA; Cryptococcus; Staphylococcus aureus; Streptococcus pneumoniae