TargetSearch - a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data
1 Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, D-14476 Potsdam-Golm, Germany
2 Centro de Biotecnología, Universidad Técnica Federico Santa María, General Bari 699 Valparaíso, Chile
3 RIKEN Plant Science Center, Tsurumi-ku, Suehiro-cho, 1-7-22 Yokohama, Kanagawa, 230-0045, Japan
4 Bayer BioScience N.V., Technologiepark 38, B-9052, Gent, Belgium
BMC Bioinformatics 2009, 10:428 doi:10.1186/1471-2105-10-428Published: 16 December 2009
Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks.
We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R.
TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.