Abstract
Background
Analysis of the plethora of metabolites found in the NMR spectra of biological fluids or tissues requires data complexity to be simplified. We present a graphical user interface (GUI) for NMR-based metabonomic analysis. The "Metabonomic Package" has been developed for metabonomics research as open-source software and uses the R statistical libraries.
Results
The package offers the following options:
Raw 1-dimensional spectra processing: phase, baseline correction and normalization.
Importing processed spectra.
Including/excluding spectral ranges, optional binning and bucketing, detection and alignment of peaks.
Sorting of metabolites based on their ability to discriminate, metabolite selection, and outlier identification.
Multivariate unsupervised analysis: principal components analysis (PCA).
Multivariate supervised analysis: partial least squares (PLS), linear discriminant analysis (LDA), k-nearest neighbor classification.
Neural networks.
Visualization and overlapping of spectra.
Plot values of the chemical shift position for different samples.
Furthermore, the "Metabonomic" GUI includes a console to enable other kinds of analyses and to take advantage of all R statistical tools.
Conclusion
We made complex multivariate analysis user-friendly for both experienced and novice users, which could help to expand the use of NMR-based metabonomics.
Background
Decoding the genome (genomics) is not sufficient to explain the cause of many diseases. Therefore, the study of differences in gene expression between subjects (transcriptomics), the analysis of protein synthesis (proteomics), and the study of metabolic regulation (metabolomics) have been intensified in recent years [1].
Analysis of the plethora of metabolites found in the NMR spectra of biological fluids or tissues requires data complexity to be reduced [2,3]. The field of metabonomics is evolving in parallel with the application of multivariate statistical methods for this purpose.
However, multivariate analysis is not easy for novice users. Several commercial programs can help such users apply multivariate methods, although none include the full range of routines, from data pre- and post-processing to the final statistical results. Recently, an open-source platform (Automics) [4] based on Visual C++ has been developed to carry out a full NMR-based metabonomic analysis. Automics includes the most common 1D NMR spectral processing functions and nine statistical methods: feature selection (Fisher's criterion), data reduction (PCA, LDA, uncorrelated LDA), unsupervised clustering (K-means) and supervised regression and classification methods (PLS/PLS-DA, KNN, Soft Independent Modelling of Class Analogy [SIMCA], Support Vector Machines [SVM]).
We present a new software package based on the open-source R framework [5] with a graphical user interface (GUI) that helps the user understand and run such methods for the analysis of NMR-based metabonomic data. Our package is called "Metabonomic" and it makes use of different R libraries to build a statistics toolbox. Moreover, the open-source architecture of the R framework makes it much easier to implement newly proposed algorithms or methods for spectral processing and data analysis and to make them freely accessible to the public. The "Metabonomic" GUI includes unsupervised multivariate analysis techniques (eg, principal components analysis [PCA]) and supervised multivariate analysis techniques (eg, partial least squares [PLS] analysis, linear discriminant analysis [LDA], and k-nearest neighbor classification). It can also be used to define different types of neural networks. In our study, we test some of these multivariate methods using internal cross-validation and external validation.
This "Metabonomic" package also enables preprocessing of raw NMR spectra. Preprocessing transforms the data in such a way that subsequent analysis and modelling are easier, more robust, and more accurate. In the analysis of NMR spectra, preprocessing methods usually attempt to reduce unwanted variance and other possible sources of bias, such as phase errors, peak shifting or misalignment, and baseline distortion. Although the "Metabonomic" package has been developed for the analysis of NMR spectra, this software can also be used for the preprocessing of mass spectrometry-based profiles or other 1-dimensional spectra. The analysis of 2-dimensional NMR spectra will be available in the next software update.
Implementation
Program Description
The "Metabonomic" GUI was designed using the R-Tcl/Tk interface [6,7], which enables us to use the Tk toolkit and replace Tcl code with R function calls. This facilitates interaction with the R functions and a comprehensive metabonomic analysis. The software offers several graphic outputs, through plots created using a combination of different Tcl/Tk interfaces. The program is based on R version 2.8.0 [5] under the Windows operating system.
The "Metabonomic" GUI requires several packages [Table 1] to be downloaded and installed from the R console. The PROcess package can be found at the Bioconductor Project Site [8]. Once the required packages are ready, the "Metabonomic" package is loaded using the Package installer or, if the package is already installed on the computer, by typing "> require(Metabonomic)".
Table 1. Packages required to execute the Metabonomic GUI
The program is started by writing "> Metabonomic()" in the R console to open the main user interface. The GUI has an input console, which can be used to launch any R application, and two different output consoles, where warnings and output messages are displayed. It also has a button line, with the following buttons: (a) undo, (b) redo, (c) current data display, (d) launch the commands written in the input console, (e) erase the input console, (f) stop any running process, and (g) shut down the GUI and return to the R console.
Finally, the GUI has a main menu with different tabs: File, Script, Edit, Preprocessing, Metabonomic Analysis, and Spectrum. The Script tab provides access to the following functions: (a) "Load a Script," which opens a script into the input console, (b) "Save Script," which saves the commands written in the input console as an R script file, and (c) "Launch the Script," which runs the commands written in the input console. Other functions are described in detail in the following sections.
Data Importing
The NMR processed spectra for metabonomic analysis are loaded as a text file by selecting the "file/Load Data file" tab. The text file, with no header, shows the chemical shift (in ppm) in the first column, and the intensities of the different spectra are in the following columns. After importing the spectra text file, the GUI asks for an "info" file. This file contains all the sample information, which has been previously written by the user as a text file, where the first column holds the names of the samples and the different characteristics are in the following columns separated by tabs [Table 2]. A header with the caption of each column is also required.
Table 2. Example of an info file
Alternatively, the data can be loaded directly from the Bruker spectroscopy format by an independent package that can be executed by selecting the "file/Import Bruker file" tab. The user has to select the raw data (FID file in the Bruker data directory). This application displays the spectrum reference and manages basic operations such as setting the chemical shift of a reference compound (trimethylsilylpropionic acid or dimethylsilapentane sulfonic acid) to 0 ppm and applying zero-order and first-order phase corrections [9]. When the first set of data is loaded, the GUI asks for a new array. When all the spectra are imported, the GUI asks for the "info" file. Applications to load other commercial data formats will be added soon.
The GUI also allows processed data to be exported as a text file.
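The two text inputs described above can be illustrated with a short parsing sketch. It is written in Python purely for illustration and is not the package's R code. It assumes the spectra file is whitespace-separated and that the info file is tab-separated with a header row; the function names and the example column names are hypothetical.

```python
import csv
import io


def parse_spectra(text):
    """Parse a header-less table: column 1 is the chemical shift (ppm),
    every following column holds the intensities of one spectrum."""
    ppm, spectra = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values = [float(field) for field in fields]
        if spectra is None:
            spectra = [[] for _ in values[1:]]
        if len(values) - 1 != len(spectra):
            raise ValueError("inconsistent number of columns")
        ppm.append(values[0])
        for column, value in zip(spectra, values[1:]):
            column.append(value)
    return ppm, spectra or []


def parse_info(text):
    """Parse a tab-separated info table with a header row.
    Returns one dictionary per sample, keyed by the column captions."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    return [dict(zip(header, row)) for row in reader if row]


# Hypothetical example data, not taken from the paper.
ppm, spectra = parse_spectra("0.5 1.0 2.0\n0.6 3.0 4.0\n")
info = parse_info("Sample\tGroup\nS1\tcontrol\nS2\ttobacco\n")
```

Each entry of `spectra` is one sample, so its position should match the order of the rows in the info table.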
Category Selection
This application selects the information that will be used in the supervised analysis. First, the GUI asks which characteristic (different columns of the info file) will be used to classify the samples. The user then chooses the different types of samples that will be used in the multivariate analysis. To date, the program only allows the selection of four different sample types. The "Category Selection" application is launched by selecting the "file/Category Selection" tab.
Data Pre-Processing
Data must be preprocessed carefully, since any inaccuracy introduced at this stage can cause significant errors in the multivariate analysis. Thus, the GUI offers several guided corrections, as explained below. If any special correction or data processing is necessary, it can be easily programmed in the input console.
Region Exclusions
The first step of data preprocessing usually involves the exclusion of spectral regions [10], which either contain non-reproducible information or do not contain information about metabolites. On the one hand, the spectral width used to acquire NMR data is usually wider than necessary to digitize all chemical shifts associated with endogenous metabolites. Thus, downfield and upfield spectral areas without any endogenous metabolites are initially excluded. On the other hand, spectral regions that depend strongly on the experimental parameters, such as the water and the reference regions, are also deleted. As these regions are sensitive to spectral artifacts, such as inadequate phasing, exclusion is beneficial. Therefore, the spectrum outside the 0.2-10 ppm window is usually excluded. By selecting the "file/Manual Cut" tab, a graphical application to select the area of interest in the spectrum and to delete the water resonance region is launched.
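The exclusion step described above amounts to a simple filter on the chemical shift axis. The following Python sketch is an illustration only, not the package's implementation. The 0.2-10 ppm window comes from the text, while the function name and the water region limits used in the example (4.5-5.0 ppm) are assumptions.

```python
def exclude_regions(ppm, intensities, window=(0.2, 10.0), excluded=()):
    """Keep only points inside `window` and outside every (low, high)
    interval listed in `excluded`; all limits are in ppm."""
    kept = [
        (shift, value)
        for shift, value in zip(ppm, intensities)
        if window[0] <= shift <= window[1]
        and not any(low <= shift <= high for low, high in excluded)
    ]
    return [shift for shift, _ in kept], [value for _, value in kept]


# Hypothetical water region; the actual limits depend on the experiment.
water_region = (4.5, 5.0)
ppm = [0.1, 0.2, 3.0, 4.7, 5.5, 10.0, 10.5]
intensities = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kept_ppm, kept_intensities = exclude_regions(
    ppm, intensities, excluded=(water_region,)
)
```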
Baseline Correction
Baseline correction is an essential step to obtain high-quality NMR spectra in some cases [11,12]. Rolling baselines can make it difficult to identify peaks and can introduce significant errors into any quantitative measurements. In order to avoid errors, the GUI incorporates an application to reduce this influence in batch mode. Baseline correction is performed using the "bslnoff" function, which is based on the LOESS method [13] from the PROcess library [8]. This graphical application (Preprocessing/Baseline) allows the bandwidth to be controlled so that it can be passed to the LOESS function until the adjustment is correct. Graphs with the raw spectrum, estimated baseline, and baseline-subtracted spectrum are plotted in the R console.
Another application, based on the FTICRMS package [14], is available for individual baseline correction. It computes an estimated baseline curve for a spectrum using the method of Rocke and Xi [15]. The most important parameter for obtaining a perfect baseline is the smoothing parameter, which is controlled by a slider widget. The algorithm uses extra parameters that have been optimized for NMR data sets, such as negativity penalty, maximum number of iterations, or a parameter for robust center and scale estimation. In any case, these parameters can be modified through the "Extra Parameters" tab. All changes are instantly displayed in the graphical device [Figure 1], thus allowing an interactive baseline adjustment.
Figure 1. Baseline (FTICRMS) display. Baseline correction of a proton-NMR spectrum using the Baseline (FTICRMS) display.
Binning
The most common method of reducing the influence of shifting peaks is the so-called binning or bucketing method, which reduces spectrum resolution [16]. Thus, the spectra are integrated within small spectral regions, called "bins" or "buckets". Subsequent data analysis procedures applied to the binned spectra are not influenced by peak shifts, as long as these shifts remain within the borders of the corresponding bins. After launching the binning graphical application (Preprocessing/Binning), the user can select the bin size. This process is executed by the "binning" function from the PROcess library [8].
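As an illustration of the idea, the Python sketch below sums the intensities that fall into fixed-width bins. It is not the PROcess implementation. It assumes evenly spaced data points, so that a plain sum stands in for the integral, and the function name is hypothetical.

```python
import math


def bin_spectrum(ppm, intensities, bin_width=0.04):
    """Sum intensities inside consecutive bins of `bin_width` ppm.
    Returns the bin centers and the summed intensity of each bin."""
    start = min(ppm)
    sums = {}
    for shift, value in zip(ppm, intensities):
        index = math.floor((shift - start) / bin_width)
        sums[index] = sums.get(index, 0.0) + value
    indices = sorted(sums)
    centers = [start + (index + 0.5) * bin_width for index in indices]
    return centers, [sums[index] for index in indices]


# A peak drifting from 1.01 to 1.02 ppm stays inside the first bin,
# so the binned value is unaffected by the shift.
centers, binned = bin_spectrum(
    [1.00, 1.01, 1.02, 1.05, 1.06], [1.0, 2.0, 3.0, 4.0, 5.0]
)
```

The choice of bin width is a trade-off: wider bins absorb larger peak shifts but merge neighboring resonances.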
Peak detection and alignment
Peak alignment is an alternative to binning the spectrum to account for peak shifts [10,17,18]. A peak detection graphical application (Preprocessing/Peak Detection) has been developed to control the "msc.peaks.find" function from the caMassClass library [19]. The graphical application adjusts the signal-to-noise ratio and the threshold criterion in the peak detection process and returns a data frame with the positions and intensities of the detected peaks. These are aligned by a peak alignment graphical application (Preprocessing/Peak Alignment). This application guides the user in the use of the "msc.peaks.align" function from the caMassClass library [19].
Normalization
A crucial step in preprocessing of spectrum data in metabonomic studies is the so-called normalization step [10]. This step tries to account for possible variations in sample concentrations. Normalization may also be necessary for technical reasons. If spectra are recorded using a different number of scans or different devices, the absolute values of the spectra vary, rendering a joint analysis of spectra without prior normalization impossible. The normalization graphical application (Preprocessing/Normalization) makes it possible to choose between several types of normalization steps using functions from the clusterSim library [20].
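Normalization by total area, one common choice, can be sketched as follows. This Python fragment is an illustration under the assumption of evenly spaced points, not one of the clusterSim functions, and the function name is hypothetical.

```python
def normalize_total_area(intensities):
    """Scale a spectrum so that its intensities sum to 1."""
    total = sum(intensities)
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return [value / total for value in intensities]


# Two spectra that differ only by a scale factor (for example, a
# different number of scans) become identical after normalization.
few_scans = normalize_total_area([1.0, 3.0])
many_scans = normalize_total_area([10.0, 30.0])
```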
Principal Components Analysis
Principal components analysis (PCA) is one of the most common exploratory steps in multivariate analysis [21-23], and its most important use is to represent multivariate data in a low-dimensional space. The first principal component is the direction of maximum variation in the cluster of points. The second principal component is the direction of the second largest variation, orthogonal to the first, and so on.
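To make the definition concrete, the Python sketch below finds the first principal component as the leading eigenvector of the sample covariance matrix by power iteration. This is a teaching illustration with hypothetical names; the GUI itself relies on R's "prcomp", which uses a different numerical method.

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (direction, scores) for the first principal component.
    `samples` is a list of equally long rows of variables."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    # Sample covariance matrix (divides by n - 1, so n must be >= 2).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration; assumes the start vector is not orthogonal
    # to the leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Scores are the projections of the centered samples onto v.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores


# Points lying on the line y = 2x vary only along the direction (1, 2).
direction, scores = first_principal_component(
    [[-1.0, -2.0], [0.0, 0.0], [1.0, 2.0]]
)
```

Plotting the scores of the first two components against each other gives a score plot like the one in Figure 2.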
The GUI incorporates a PCA graphical application (Metabonomic Analysis/PCA) to guide users in PCA by allowing the selection of the algorithm parameters. In addition, interactive graphics have been developed to change items such as the component and graphical parameters in the score [Figure 2] and loading plots. The principal components algorithm used is based on the "prcomp" function from the stats library [24].
Figure 2. Metabonomic GUI used for PCA. First and second principal component score plot of two class samples (control and tobacco).
In addition, a graphical display for outlier identification has been developed using the "prcomp" function and the "robustbase" package [25] (preprocessing/outliers). It shows Mahalanobis distances based on robust and classic estimates of the location and the covariance matrix in different plots.
Linear Discriminant Analysis
Linear discriminant analysis (LDA) is another common technique for the analysis of metabonomic data [21,26]. It is used to obtain linear discriminant functions, that is, linear combinations of the original variables chosen to maximize the differences between classes. For samples with only two classes, the discriminating function is a line, for three classes it is a plane, and for more than three classes a hyperplane. In the LDA graphical application (Metabonomic Analysis/LDA), the linear discriminant function is calculated by the "lda" function from the "MASS" package [27,28].
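For two classes, the discriminant direction can be written in closed form as the inverse of the pooled within-class scatter matrix multiplied by the difference of the class means (Fisher's discriminant). The Python sketch below illustrates this for two variables only. It is not the "lda" function from MASS, and the function name and data are hypothetical.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant for samples with two variables.
    Returns w = Sw^-1 (mean_a - mean_b)."""

    def mean(points):
        return [sum(p[j] for p in points) / len(points) for j in range(2)]

    def scatter(points, center):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for p in points:
            diff = [p[0] - center[0], p[1] - center[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += diff[i] * diff[j]
        return s

    mean_a, mean_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, mean_a), scatter(class_b, mean_b)
    # Pooled within-class scatter matrix.
    sw = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inverse = [
        [sw[1][1] / det, -sw[0][1] / det],
        [-sw[1][0] / det, sw[0][0] / det],
    ]
    delta = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    return [
        inverse[0][0] * delta[0] + inverse[0][1] * delta[1],
        inverse[1][0] * delta[0] + inverse[1][1] * delta[1],
    ]


# Hypothetical data: the classes differ only along the first variable.
class_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class_b = [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)]
w = fisher_direction(class_a, class_b)
```

Projecting each sample onto `w` gives a single discriminant score per sample; in this example only the first variable contributes to it.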
The program guides the user through the tasks in the proper order. First, an LDA model is built with part of the samples; the remainder are used to perform a validation test. The user can choose the samples directly to make the model, or randomly select the number of samples from each class. Second, the user can select the algorithm to calculate the LDA from among the following: "moment" for standard estimators of the mean and variance, "mle" for a maximum likelihood estimation, or "t" for robust estimates based on a t distribution. Finally, the LDA graphical application returns the results of the validation test and different interactive graphs of the LDA model [Figure 3]. If the number of different classes is three or less, the interactive graph is a plane where the samples used to build the model and the validation samples are plotted. If the number of different classes is greater than three, the samples used to build the model and the validation samples are plotted in interactive cubes. In these interactive plots, the user can select the angle of rotation, the components shown, and other graphical parameters.
Figure 3. Metabonomic GUI used for LDA. LDA score plot of two class samples (control and tobacco) with the training model samples (black) and the testing model samples (blue). The cross-validation result is also returned.
Partial Least Squares Discriminant Analysis
Another common multivariate method [21,29,30] in metabonomic analysis is partial least squares discriminant analysis (PLS-DA), a supervised linear regression method whereby the multivariate variables corresponding to the observations (spectral descriptors) are associated with the class membership for each sample [31]. PLS-DA provides an easily understandable graphical approach to identifying the spectral regions of difference between the classes, and allows a statistical evaluation of whether the differences between classes are significant.
Two different PLS-DA applications have been included in the "Metabonomic" GUI. The first PLS graphical application (Metabonomic Analysis/Partial Least Squares/PLS) was developed with a PLS algorithm based on the extension of the generalized partial least squares model proposed by Ding and Gentleman [32]. This algorithm is implemented using the "gpls" function from the "gpls" package [33], and it allows separation between no more than two classes of samples. The graphical application controls the manual or random selection of the samples to build the model and the selection of all the algorithm parameters, such as the convergence tolerance, the number of iterations allowed, and the number of PLS components used. At the end, the results of the validation test are returned.
The second application (Metabonomic Analysis/Partial Least Squares/PLS with graphics) is performed using the "plsr" function from the "pls" package [34,35]. This PLS-DA is more complex, and the application guides the user through all the steps in the proper order. First, the user chooses between manual and random selection of the samples. Second, the user selects the PLS algorithm and the validation method. The four PLSR algorithms available are the kernel algorithm [36], the wide kernel algorithm [37], the SIMPLS algorithm [38], and the classic orthogonal scores algorithm [39].
Next, the application creates a PLS model with the maximum number of components and shows the explained variance and the R^2 plots of the model. With this information, the user can select the optimum number of PLS components to build the model. In addition, the standard error of prediction (SEP) and the root mean squared error of prediction (RMSEP) are plotted in the R console.
Finally, the PLS graphical application returns the results of the validation test and different interactive graphs of the PLS model [Figure 4].
Figure 4. Metabonomic GUI used for PLS-DA. Interactive graph (right) with the first three PLS components score plot of two classes of samples (healthy and tobacco). The black samples are the samples used to build the model and the blue samples are the validation samples. In addition, the validation result and the explained variance are shown.
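The prediction error plotted when choosing the number of PLS components can be illustrated with a few lines of Python. The sketch below computes the root mean squared error of prediction from observed and predicted responses. It is not the "pls" package code, and the 0/1 coding of class membership in the example is an assumption made for illustration.

```python
import math


def rmsep(observed, predicted):
    """Root mean squared error of prediction."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


# Hypothetical class membership coded as 0/1 and model predictions.
error = rmsep([1.0, 0.0, 1.0, 0.0], [0.9, 0.2, 0.7, 0.1])
```

Computing this value for models with an increasing number of components, and picking the point where it stops decreasing, is one common way to choose the model size.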
K-Nearest Neighbors Classification
The k-nearest neighbors (KNN) rule for classification [40] is the simplest of all supervised classification approaches. For the classification of an unknown object, its distance (usually the Euclidean distance) to all other objects is computed. The k smallest distances are selected and the object is assigned to the class most represented among those neighbors; for k = 1, this is the class of the single nearest object. The KNN graphical interface (Metabonomic Analysis/KNN) allows the user to choose between random or manual selection of the samples to build the model, the number of neighbors, the minimum vote for a definite decision, and whether or not to use all the neighbors. If all the neighbors are used, all distances equal to the kth largest are included. If not, a random selection of distances equal to the kth is chosen to use exactly k neighbors. Finally, the interface returns the results of the validation test and the cross-validation test. The KNN graphical application uses the "knn" function from the class package [28].
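The classification rule can be sketched in a few lines of Python. This is an illustration with hypothetical names and data, not the "knn" function from the class package. In particular, the minimum-vote option is omitted and ties are resolved in favor of the nearest neighbors rather than at random.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Assign `query` to the majority class of its k nearest
    training points, using the Euclidean distance."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-variable training set.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["control", "control", "tobacco", "tobacco"]
predicted = knn_classify(points, labels, (0.2, 0.3), k=3)
```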
Neural Networks
Application of artificial neural networks (ANNs) for data processing is characterized by analogy with a biological neuron. An ANN consists of a layered network of nodes, each of which performs a simple operation on several inputs to produce a single output.
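The node operation described above can be illustrated with the forward pass of a network with one hidden layer. The Python sketch below assumes sigmoid activations throughout and uses hypothetical names and weights. It shows only how an output is computed, not how the weights are trained.

```python
import math


def sigmoid(x):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a single-hidden-layer network with one output.
    Each node computes a weighted sum of its inputs plus a bias and
    passes the result through the activation function."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical network: two inputs, two hidden units, one output.
output = forward(
    inputs=[0.5, -1.0],
    hidden_weights=[[0.1, 0.4], [-0.3, 0.2]],
    hidden_biases=[0.0, 0.1],
    output_weights=[0.7, -0.5],
    output_bias=0.05,
)
```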
Two different applications to define ANNs have been included in the "Metabonomic" GUI. The first application (Metabonomic Analysis/Neural Network/Neural Network [Single hidden layer]) makes use of the "nnet" function from the "nnet" R package [28]. This graphical application allows the user to build a single-hidden-layer neural network, by selecting the number of units in the hidden layer, the initial random weight, and the weight decay. In addition, the user can choose between random or manual selection of the training samples.
The second application (Metabonomic Analysis/Neural Network/Neural Network [multiple hidden layers]) creates a feedforward artificial neural network according to the structure established by the "AMORE" package [41]. With this application, the user can select the number of layers and the number of neurons in each layer, while controlling several parameters. These include the learning rate at which every neuron is trained, the momentum for every neuron, the error criterion (least mean squares or least mean logarithm squares), the activation function of the hidden and the output layer (Purelin, Tansig, Sigmoid, or Hardlim), and the training method (Adaptive gradient descent or BATCH gradient descent, with or without momentum). With these parameters selected, the algorithm trains the network with the manually or randomly selected samples before testing it with the rest of the samples.
Other Tools
In addition to the multivariate techniques, other useful graphical tools have been developed in the "Metabonomic" GUI to enable easy interpretation of complex data tables.
For example, a graphical display (Metabonomic/Chemical Shift Region Display) has been added to show the differences between the subgroups in a specific spectral region. The application plots the values and means of all samples in the specified chemical shift region [Figure 5].
Figure 5. Extra tools. Metabonomic GUI tools to visualize and overlap the spectra (right) and to show the values of all samples in a given chemical position (left).
Another graphical display (Spectrum/...) has been created to visualize and overlap the spectra. With these applications, the user can focus on areas of interest with a zoom tool, superimpose different spectra, increase or decrease the spectra intensity, and change other graphical parameters. Moreover, when the user clicks with the cross cursor in the spectrum, a new window pops up showing the chemical shift and the intensity of the selected resonance. This display can be launched for the original or for the current spectra [Figure 5].
Results
An NMR analysis of lung tissue was used to test our package. This dataset (unpublished data) consisted of 28 AKR/J mice chronically exposed to tobacco smoke for 5 days/week (n = 15) over a 6-month period and a sham group (n = 12).
High-resolution magic angle spinning spectra were generated from intact lung tissue using a BRUKER AMX500 spectrometer 11.7 T, 500.13 MHz (256 scans collected for each sample, 16K data points).
First, the water peak and the spectrum area outside the 0.2-10 ppm window were removed. The baseline of each spectrum was corrected using the Baseline (FTICRMS) tool. In addition, the spectra were normalized by total area and integrated within 0.04-ppm buckets.
The preprocessed spectra underwent different multivariate analyses. The multivariate models were built with randomly selected training samples (8 samples of each type). The remaining samples were used to perform a validation test, derived from the probability of belonging to each group. The validation results are summarized in Table 3.
Table 3. Validation results for different multivariate methods incorporated in the GUI
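The validation scheme used here (a fixed number of randomly chosen training samples per class, with the rest held out) can be sketched in Python as follows. The group sizes and the 8 training samples per class come from the text. The function names, the seed, and the use of a plain accuracy score are assumptions made for illustration.

```python
import random


def stratified_split(labels, n_train_per_class, seed=0):
    """Randomly pick `n_train_per_class` sample indices from each class
    for training; all remaining indices form the validation set."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train = []
    for indices in by_class.values():
        if len(indices) < n_train_per_class:
            raise ValueError("not enough samples in one class")
        train.extend(rng.sample(indices, n_train_per_class))
    chosen = set(train)
    validation = [i for i in range(len(labels)) if i not in chosen]
    return sorted(train), validation


def accuracy(true_labels, predicted_labels):
    """Fraction of validation samples assigned to the correct class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


labels = ["tobacco"] * 15 + ["sham"] * 12
train, validation = stratified_split(labels, n_train_per_class=8)
```

Any of the classifiers in the GUI can then be fitted on the `train` indices and scored on the `validation` indices.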
Conclusion
Preprocessing of raw NMR spectra and different multivariate analyses are standard procedures applied to interpret the complex metabonomic profile. The "Metabonomic" GUI presented in this paper offers an easy application of the principal preprocessing methods and the most commonly used multivariate statistical methods in metabonomic analysis. Various tools have been developed or adapted to make statistical analysis easier for the inexperienced user. The more experienced user always maintains complete control of the statistical tools. Special correction or data processing can be carried out using the input console.
The main advantage of the "Metabonomic" GUI is its modular design, which makes it easy to upgrade. Furthermore, new analysis methods can be included in the metabonomic field using the large R free software library.
Availability and requirements
• Project name: Metabonomic R package.
• Project home page: http://cran.r-project.org
• Operating system: MS Windows.
• Programming language: R. The package runs on MS Windows using an installed version of R.
• Other requirements: The required PROcess package is available on the Bioconductor website http://bioconductor.org.
• Licence: GPL version 2 or newer.
List of abbreviations
ANN: artificial neural network; GUI: graphical user interface; KNN: k-nearest neighbors; LDA: linear discriminant analysis; NMR: nuclear magnetic resonance; PCA: principal components analysis; PLS: partial least squares; PLS-DA: partial least squares discriminant analysis.
Authors' contributions
JLIG carried out the programming and software design and drafted the manuscript. PV, IR, AK, PB, and MD provided domain knowledge and helped to draft the manuscript. JRC conceived the study, participated in its design and coordination, and helped to draft the manuscript. All authors have read and approved the final manuscript.
Acknowledgements
This research was supported by the Spanish MICINN (SAF2008-05412) and the Comunidad de Madrid (S505AGR187).
References

Nicholson J, Holmes E, Lindon J: Metabonomic and Metabolomics Techniques and Their Applications in Mammalian Systems. In The Handbook of Metabonomics and Metabolomics. Edited by Lindon JC, Nicholson JK, Holmes E. Amsterdam, Elsevier; 2007:1-34.

Chatfield C, Collins AJ: Introduction to Multivariate Analysis. London, Chapman and Hall; 1980.

Tukey JW: Exploratory Data Analysis. Reading, Addison-Wesley; 1977.

Wang T, Shao K, Chu Q, Ren Y, Mu Y, Qu L, He J, Jin C, Xia B: Automics: an integrated platform for NMR-based metabonomics spectral processing and data analysis. BMC Bioinformatics 2009, 10(1):83.

The R Foundation for Statistical Computing [http://www.r-project.org/]

R Development Core Team: The "tcltk" library. [http://finzi.psych.upenn.edu/R/library/tcltk/html/00Index.html]

Xiaochun Li: PROcess: Ciphergen SELDI-TOF Processing. R package version 0.160. Bioconductor, Open Source Software for Bioinformatics. [http://www.bioconductor.org/packages/release/bioc/html/PROcess.html]

De Graaf RA: Basic Principles. In In vivo NMR Spectroscopy. 2nd edition. Chichester, West Sussex, England; Hoboken, NJ: John Wiley & Sons; 2007:14-18.

Ross A, Schlotterbeck G, Dieterle F, Senn H: NMR Spectroscopy Techniques. In The Handbook of Metabonomics and Metabolomics. Edited by Lindon JC, Nicholson JK, Holmes E. Amsterdam, Elsevier; 2007:96-112.

Golotvin S, Williams A: Improved Baseline Recognition and Modeling of FT NMR Spectra. Journal of Magnetic Resonance 2000, 146(1):122-125.

Cobas JC, Bernstein MA, Martin-Pastor M, Tahoces PG: A new general-purpose fully automatic baseline-correction procedure for 1D and 2D NMR data. Journal of Magnetic Resonance 2006, 183(1):145-151.

Cleveland WS, Devlin SJ: Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting. Journal of the American Statistical Association 1988, 83:596-610.

Barkauskas D: FTICRMS: Programs for Analyzing Fourier Transform-Ion Cyclotron Resonance Mass Spectrometry Data. [http://cran.r-project.org/web/packages/FTICRMS/index.html]

Xi Y, Rocke DM: Baseline Correction for NMR Spectroscopic Metabolomics Data Analysis. BMC Bioinformatics 2008, 9:324.

Holmes E, Foxall PJD, Nicholson JK: Automatic data reduction and pattern recognition methods for analysis of 1H nuclear magnetic resonance spectra of human urine from normal and pathological states.

Forshed J, Schuppe-Koistinen I, Jacobsson SP: Peak alignment of NMR signals by means of a genetic algorithm.

Kemsley EK, Le Gall G, Dainty JR, Watson AD, Harvey LJ, Tapp HS, Colquhoun IJ: Multivariate techniques and their application in nutrition: a metabolomics case study. British Journal of Nutrition 2007, 98(01):1-14.

Tuszynski J: caMassClass: Processing & Classification of Protein Mass Spectra (SELDI) Data. [http://finzi.psych.upenn.edu/R/library/caMassClass/html/00Index.html]

Walesiak M, Dudek A: clusterSim: Searching for optimal clustering procedure for a data set. [http://finzi.psych.upenn.edu/R/library/clusterSim/html/data.Normalization.html]

Lindon JC, Holmes E, Nicholson JK: Pattern recognition methods and applications in biomedical magnetic resonance. Progress in Nuclear Magnetic Resonance Spectroscopy 2000, 39:1-40.

Eriksson L, Johansson E, Kettaneh-Wold N, Wold S: Multi- and Megavariate Data Analysis. Principles and Applications. Umetrics AB; 2001. ISBN 919737301X.

R Development Core Team and contributors worldwide: Stats R package. [http://finzi.psych.upenn.edu/R/library/stats/html/prcomp.html]

Filzmoser P, Todorov V, Maechler M: Robustbase: Basic Robust Statistics. [http://finzi.psych.upenn.edu/R/library/robustbase/html/00Index.html]

Hewer R, Vorster J, Steffens FE, Meyer D: Applying biofluid 1H NMR-based metabonomic techniques to distinguish between HIV-1 positive/AIDS patients on antiretroviral treatment and HIV-1 negative individuals. Journal of Pharmaceutical and Biomedical Analysis 2006, 41(4):1442-1446.

Venables WN, Ripley BD: Modern applied statistics with S. 4th edition. New York, Springer; 2002.

Venables W, Ripley B, Hornik K, Gebhardt A: Bundle of MASS, class, nnet, spatial. [http://cran.r-project.org/web/packages/VR/index.html]

Bollard ME, Stanley EG, Lindon JC, et al.: NMR-based metabonomic approaches for evaluating physiological influences on biofluid composition. NMR in Biomedicine 2005, 18(3):143-162.

Gavaghan CL, Holmes E, Lenz E, et al.: An NMR-based metabonomic approach to investigate the biochemical consequences of genetic strain differences: application to the C57BL10J and Alpk:ApfCD mouse. FEBS Letters 2000, 484(3):169-174.

Otto M: Chemometrics. Statistics and Computer Application in Analytical Chemistry. New York, Wiley-VCH; 1999.

Ding B, Gentleman R: Classification using penalized partial least squares.

Ding B, Gentleman R: gpls: Classification using generalized partial least squares. [http://finzi.psych.upenn.edu/R/library/gpls/html/gpls.html]

Wehrens R, Mevik B: The pls Package: Principal Component and Partial Least Squares Regression in R.

Wehrens R, Mevik B: PLS: Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR). [http://finzi.psych.upenn.edu/R/library/pls/html/00Index.html]

Rännar S, Lindgren F, Geladi P, Wold S: A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm.

de Jong S: SIMPLS: An alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems 1993, 18(3):251-263.

Martens H, Næs T: Multivariate calibration. Chichester [England]; New York, Wiley; 1989.

Fix E, Hodges JL: Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties.

Castejón M, Ordieres J, González A: AMORE: A MORE Flexible Neural Network Package. [http://finzi.psych.upenn.edu/R/library/AMORE/html/00Index.html]