Figure 6.

Summary of the Corra framework data flow. In the flow chart, rectangular boxes represent one or more software processing steps, parallelograms represent data, and the cylinder represents databases. The application of Corra begins with the input of data in mzXML format, converted from the raw files of any mass spectrometer with sufficient resolution to resolve isotopic distributions. Features (defined by m/z, retention time, and intensity) are extracted for each input LC-MS run, based on the observed isotopic distribution, and the resultant peak list is stored in APML format. Extracted features are then aligned across all LC-MS runs in the dataset in question, and the resultant aligned feature list is likewise stored in the aligned APML format. The XML content of the aligned APML is then parsed into the standard R data format, ExpressionSet, prior to statistical analyses. Statistical tests, using a linear mixed model, are performed on all the aligned features, incorporating any relevant biological and technical replicate information in the sample set. The current implementation of Corra has adapted the previously published LC-MS quantification software tools SpecArray [12] and SuperHirn [15] for feature extraction and alignment.

Brusniak et al. BMC Bioinformatics 2008 9:542   doi:10.1186/1471-2105-9-542
