Centre for Systems Biology, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
The Henry Wellcome Building for Biomolecular NMR Spectroscopy, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
School of Biosciences, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
Abstract
Background
Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) ^{1}H, projections of 2D ^{1}H, ^{1}H J-resolved (pJRES), and intact 2D J-resolved (JRES).
Results
Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra.
Conclusion
We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra.
Background
Metabolomics relies extensively upon the multivariate analysis of data
Data processing methods can be used to affect the structure of the variance of experimental data sets, helping to focus the subsequent multivariate analysis onto more biologically relevant information arising from the biological variance
Autoscaling is a processing technique in which the variance of each variable is scaled to unity and the mean of each variable is set to zero
The glog is a transformation that was originally applied to microarray data
Here, we first aimed to evaluate comprehensively the glog transform compared to two other commonly used scaling methods in NMR metabolomics as well as against unscaled data. This evaluation was conducted using three disparate data sets to confirm the broad applicability of the approach, including: urine samples to discriminate between two dog breeds, muscle tissue extracts to discriminate between hypoxia and normoxia in marine mussels, and liver tissue extracts to discriminate between fish collected from two different rivers. The performances of each of the scaling methods – autoscaling, Pareto and glog – were assessed by conducting PCA of each of the processed two-class data sets. This was achieved by calculating the sensitivities and specificities derived from applying linear discriminant analysis (LDA) to each of the resulting PCA scores plots. The effect of each scaling method upon the ability to discover potential metabolic biomarkers was also investigated. This was accomplished by selecting the largest peaks in the PCA loadings plots and then evaluating whether the corresponding peaks in the NMR spectra were of significantly different intensity between the biological classes. Secondly, we aimed to evaluate the applicability of the glog transform for 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. This enabled the first NMR metabolomics study of intact 2D JRES spectra, including the reconstruction of the PCA loadings plot to a 2D format analogous to the JRES spectra, which is anticipated to have significant benefit in terms of the ease of metabolite identification. During this second aim we also sought to extend the glog transform to reduce the deleterious effects of noise.
Results and Discussion
For each data set under consideration, the data has been normalised and binned prior to any scaling or transformation techniques having been conducted. For ease of reference the normalised, binned spectra are referred to as "unscaled" data, and autoscaling, Pareto scaling and the glog transformation are all referred to as "scaling" methods. Furthermore, as described in the Methods section, the glog transform must initially be calibrated once for each type of biological sample. The resulting calibration parameter (Table
Table 1: Parameter values for all glog transformations.
Data type                  Transform        λ value             Additional parameter (extended glog only)
1D NMR, mussel muscle      glog             2.0025 × 10^{8}     -
1D NMR, mussel muscle      extended glog    1.2689 × 10^{8}     8.7026 × 10^{5}
pJRES NMR, dog urine       glog             2.3024 × 10^{9}     -
pJRES NMR, dog urine       extended glog    1.5175 × 10^{9}     4.9506 × 10^{5}
2D JRES NMR, fish liver    glog             6.9974 × 10^{14}    -
2D JRES NMR, fish liver    extended glog    4.0877 × 10^{12}    1.575 × 10^{6}
Data Structure
Figure
Bin variance versus bin intensity of the technical replicates for (A) 1D mussel data; (B) pJRES dog data; (C) JRES fish data. Some low intensity bins (predominantly noise) can be seen to the left of the plots which exhibit similar variance levels; however a more linear relationship can be seen in the medium and high intensity bins.
Representative NMR spectra prior to the application of any scaling. (A) 1D NMR spectrum of mussel adductor muscle, (B) pJRES NMR spectrum of canine urine, (C) intact 2D JRES NMR spectrum of a fish liver.
Figure
Prior to assessing the effects of scaling on variance, it is important to contrast the technical versus biological variability in the datasets. This can be achieved by calculating the median and range of the coefficients of variation (CV) for all the bins across a series of NMR spectra. Technical variability is measured by the CV of the technical replicates, and biological variability (which also includes technical variability) by the CV of the biological dataset. For the mussel 1D NMR data, the median CV of the technical replicates is 6.5% (range of 0.4–30.6%). In contrast, the median CV of the mussel biological data is 22.6% (range of 7.2–128.4%). Clearly the technical variance is a significant proportion of the biological variance, and therefore must be treated appropriately prior to multivariate analysis. Similar results are found for the two other data sets. The dog pJRES NMR data has median CVs of 13.4% (technical) and 52.1% (biological), with ranges of 0.6–70.4% and 14.6–272.1%, respectively. The fish 2D JRES NMR data has median CVs of 23.0% (technical) and 48.4% (biological), with ranges of 1.5–88.2% and 13.6–228.5%, respectively.
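The CV summary described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy replicate values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
import statistics


def bin_cvs(spectra):
    """Return the coefficient of variation (%) of every bin.

    `spectra` is a list of spectra, each a list of bin intensities.
    Assumes every bin has a non-zero mean.
    """
    cvs = []
    for bin_values in zip(*spectra):  # one tuple per bin, across all spectra
        mean = statistics.mean(bin_values)
        sd = statistics.stdev(bin_values)  # sample SD (assumed estimator)
        cvs.append(100.0 * sd / mean)
    return cvs


def summarise_cvs(spectra):
    """Return (median CV, lowest CV, highest CV) over all bins."""
    cvs = bin_cvs(spectra)
    return statistics.median(cvs), min(cvs), max(cvs)


# Hypothetical toy data (not from the study): 3 replicate spectra, 3 bins.
toy_replicates = [
    [10.0, 90.0, 950.0],
    [12.0, 100.0, 1000.0],
    [14.0, 110.0, 1050.0],
]
median_cv, lowest_cv, highest_cv = summarise_cvs(toy_replicates)
print(f"median CV {median_cv:.1f}% (range {lowest_cv:.1f}-{highest_cv:.1f}%)")
```

Applying the same summary to the technical replicates and to the full biological data set gives the two sets of medians and ranges that are contrasted in the text.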
Effects of Scaling on Variance
Scaling techniques are applied after the other processing steps, such as normalisation and binning, have been completed and can radically change the appearance and structure of the spectra of all the different NMR data sets. For example, the canine urine data set is shown in Figure
Effects of scaling on the appearance of a pJRES canine urine NMR spectrum. (A) Unscaled spectrum, (B) autoscaled spectrum, (C) Pareto scaled spectrum, (D) glog transformed spectrum. The region between 4.50–6.45 ppm contained the urea and residual water peaks and was therefore excluded.
The appearance of the spectra is only one indication of the structure of the processed data. For more information specifically relating to the ability of the scaling techniques to minimise technical variance, it is more useful to examine the variance exhibited by the bins across the spectra of technical replicates. Figure
Effects of scaling on the variance of the six technical replicates of the pooled invertebrate muscle sample. Each plot shows the variance of every bin versus the bin number, where the bin numbers have been ranked according to their mean intensities; i.e. the highest intensity bins appear on the right of each plot. Bin variances are shown for (A) unscaled data, where the insert shows a zoomed in section, (B) autoscaled data, (C) Pareto scaled data, (D) glog transformed data.
Scaling methods can radically change the variance structure of the data set. Figure
Effects of Scaling on Classification Accuracy
PCA was employed to provide an unbiased method to evaluate the usefulness of the scaling techniques, since this provides a clear strategy to observe the effects of the scaling on the variance of the data. However, all the transformations are equally applicable as a processing step prior to supervised multivariate analysis such as PLS-DA. Also, to provide a quantitative method to evaluate the models, LDA was then applied to the first and second PCs. The solid black line in Figures
PCA scores plots of the 1D NMR spectra of mussel adductor muscle. (A) Unscaled data, (B) autoscaled data, (C) Pareto scaled data, (D) glog transformed data. The red circles represent the hypoxic samples whilst the blue squares represent the normoxic samples. The black line represents the decision boundary between the classes constructed using LDA.
PCA scores plots of the pJRES NMR spectra of canine urine. (A) Unscaled data, (B) autoscaled data, (C) Pareto scaled data, (D) glog transformed data. The red circles represent the samples from Labradors, with the blue squares representing the Miniature Schnauzer samples. The black line represents the decision boundary constructed using LDA.
PCA scores plots of the intact 2D JRES NMR spectra of fish liver. (A) Unscaled data, (B) autoscaled data, (C) Pareto scaled data, (D) glog transformed data. The red circles represent fish sampled from the River Alde and the blue squares represent fish from the River Tyne. The black line represents the decision boundary between the classes constructed using LDA.
PCA of extended glog transformed 2D JRES NMR spectra of fish liver. (A) Scores plot where red circles represent the fish sampled from the River Alde and the blue squares represent fish from the River Tyne. The black line represents the decision boundary between the classes constructed using LDA. (B) Aerial view of the corresponding PC1 loadings plot presented in the format of a 2D JRES spectrum, with J-couplings along one axis to facilitate metabolite identification. (C) Side view of the same loadings plot as in B, highlighting the metabolites that are at higher concentration (red) in fish liver collected from the River Alde.
Table 2: Classification statistics for each PCA model constructed.
Data type                  Scaling          Sensitivity   Specificity   Correctly classified   Cross-validation accuracy
1D NMR, mussel muscle      unscaled         0.333         0.800         16 of 27               37.04%
1D NMR, mussel muscle      autoscaled       0.083         0.933         15 of 27               33.33%
1D NMR, mussel muscle      Pareto           0.500         0.733         17 of 27               51.85%
1D NMR, mussel muscle      glog             1.000         1.000         27 of 27               100.00%
1D NMR, mussel muscle      extended glog    1.000         0.867         25 of 27               92.60%
pJRES NMR, dog urine       unscaled         0.294         0.750         20 of 37               32.43%
pJRES NMR, dog urine       autoscaled       0.824         0.850         31 of 37               83.78%
pJRES NMR, dog urine       Pareto           0.530         0.700         23 of 37               56.76%
pJRES NMR, dog urine       glog             0.824         0.850         31 of 37               83.78%
pJRES NMR, dog urine       extended glog    0.824         0.850         31 of 37               83.78%
2D JRES NMR, fish liver    unscaled         1.000         0.550         29 of 38               68.42%
2D JRES NMR, fish liver    autoscaled       0.944         0.800         33 of 38               63.16%
2D JRES NMR, fish liver    Pareto           0.944         0.800         33 of 38               86.84%
2D JRES NMR, fish liver    glog             0.889         0.850         33 of 38               86.84%
2D JRES NMR, fish liver    extended glog    1.000         1.000         38 of 38               100.00%
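The relationship between the sensitivity, specificity and number of correctly classified samples reported in the table above can be sketched in Python as follows. This is an illustration only. The function name is hypothetical, and the example assumes that Labrador is the "positive" class and that 5 of the 17 Labrador samples and 15 of the 20 Miniature Schnauzer samples fall on the correct side of the LDA boundary, an arrangement chosen to be consistent with the unscaled dog urine row rather than taken from the per-sample results.

```python
def classification_stats(true_labels, predicted_labels, positive):
    """Return (sensitivity, specificity, number correctly classified)."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # positive sample correctly classified
            else:
                fn += 1  # positive sample missed
        else:
            if predicted == positive:
                fp += 1  # negative sample wrongly called positive
            else:
                tn += 1  # negative sample correctly classified
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, tp + tn


# Hypothetical per-sample outcome consistent with the unscaled dog urine row:
# 17 Labrador samples (5 correct) and 20 Miniature Schnauzer samples (15 correct).
truth = ["lab"] * 17 + ["schnauzer"] * 20
predicted = (["lab"] * 5 + ["schnauzer"] * 12
             + ["schnauzer"] * 15 + ["lab"] * 5)
sens, spec, n_correct = classification_stats(truth, predicted, positive="lab")
print(f"sensitivity {sens:.3f}, specificity {spec:.3f}, "
      f"{n_correct} of {len(truth)} correct")
```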
Mussel adductor muscle samples
Figure
Figure
PCA loadings plots of the 1D NMR spectra of mussel adductor muscle. (A) Unscaled data, (B) autoscaled data, (C) Pareto scaled data, (D) glog transformed data. The plots represent the loadings perpendicular to the decision line calculated by using LDA on each of the scaled data sets. The 5 largest bins in each plot have each been tested as potential biomarkers to discriminate between the two classes. Key: (solid circle) bin is not significantly different; (*) p < 0.05; (**) p < 0.01; (***) p < 0.001.
The 5 largest bins in each loadings plot have been tested as potential biomarkers using one-way ANOVAs. Clearly, as shown in Figures
Canine urine samples
For the pJRES NMR data set of urine samples from two breeds of dog, the processing methods show a similar effect upon the data (Figure
For the unscaled and Pareto scaled data that produced the lowest classification accuracies, the loadings plots for the PC perpendicular to the LDA decision line (Figures
PCA loadings plots of the pJRES NMR spectra of canine urine. (A) Unscaled data, (B) autoscaled data, (C) Pareto scaled data, (D) glog transformed data. The plots represent the loadings perpendicular to the decision line calculated by using LDA on each of the scaled data sets. The 5 largest bins in each plot have each been tested as potential biomarkers to discriminate between the two classes. Key: (solid circle) bin is not significantly different; (*) p < 0.05; (**) p < 0.01; (***) p < 0.001.
Fish liver samples
The PCA scores plots from the analysis of the intact 2D JRES NMR data are shown in Figure
Concatenated 2D JRES NMR spectra of fish liver. (A) Spectra following the standard glog transformation. (B) Spectra after the extended glog transformation has been applied. Transformation parameters are listed in Table 1 and the red line indicates three times the standard deviation of the noise, regarded as the largest noise peaks.
An algorithm to increase the relative signal to noise ratio of the data was then investigated by extending the glog transformation to include an additional parameter, as shown in equation (4). Figure
The corresponding PC1 loadings plot for the scores plot in Figure
Conclusion
We have demonstrated that autoscaling, Pareto scaling and the glog and extended glog transformations can significantly alter the variance structure of NMR metabolomics data, which in turn can improve the classification accuracy of multivariate models generated from the scaled data. This can help to extract important information from data sets, since improving the discrimination between sample classes can help to identify metabolic biomarkers. Specifically, we have demonstrated that the glog and extended glog transformations achieve the best, or equal best, classification accuracy compared to unscaled, autoscaled and Pareto scaled data on three example data sets. A classification accuracy of 100% was achieved for two data sets – the effect of hypoxia in invertebrate muscle extracts and the effect of sampling location on fish liver extracts – and an accuracy of 31 of 37 correctly classified for a third dataset examining breed discrimination using dog urine. Furthermore, from an analysis of the top five peaks in each of the corresponding PCA loadings plots, we have confirmed that glog transformed data is considerably better at discovering metabolic biomarkers that can discriminate significantly between sample classes. We have also confirmed the broad applicability of the glog approach using three disparate data sets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. Finally, we have reported an extension to the original glog algorithm that effectively suppresses the noise, which was critical for the analysis of intact 2D JRES spectra. In conclusion, we have thoroughly evaluated and proven the benefits of utilising the glog transformation for stabilising the technical variance associated with metabolomics experiments, which can lead to significantly beneficial effects on the discrimination between sample classes using multivariate analysis.
Methods
Three data sets were used to highlight the broad applicability of the generalised log transformation across multiple biological species and sample types. The three data sets comprised spectra of mammalian (canine) urine, extracts of marine mussel adductor muscle, and extracts of fish liver. The preparation, NMR analysis and processing of each is described below.
Sample Preparation and Collection of NMR Spectra
Canine urine
Free-catch urine samples were collected over several days from two breeds of dog (17 samples from three male Labradors and 20 samples from four male Miniature Schnauzers), frozen at −80°C, and subsequently prepared and analysed using the methods described elsewhere
Mussel adductor muscle
Muscle tissues were dissected from two groups of Mediterranean mussels (
Fish liver
European flounder (
Technical Replicates
For all data sets, the technical replicates form an integral part of calibrating the glog transformation. A minimum of five or six replicates should be generated from a single homogeneous pool of the relevant biological material for each data set. Ideally, this pool of biological material is formed by mixing several smaller amounts of different samples from all experimental classes (e.g., control and stressed).
Data Processing
The 1D, pJRES and 2D JRES NMR spectra were converted to an appropriate format for multivariate analysis using custom-written
Scaling Methods
After each data set was binned, normalised and bin compressed – and for the intact 2D JRES spectra, concatenated – the following scaling techniques were applied:
Autoscaling
The variance of each bin was scaled to unity by dividing the intensity of each bin by the standard deviation of that bin; note that mean centring was not applied at this stage.
Pareto scaling
The intensity of each bin was divided by the square root of the standard deviation of that bin; again, mean centring was not applied at this point.
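The two scaling steps described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name, the `method` argument and the toy spectra are hypothetical, the sample standard deviation is an assumed estimator, and bins with zero standard deviation are not handled. As in the text, no mean centring is applied.

```python
import math
import statistics


def scale_bins(spectra, method):
    """Divide every bin by its SD ("auto") or by sqrt(SD) ("pareto")."""
    columns = list(zip(*spectra))  # one tuple per bin, across all spectra
    sds = [statistics.stdev(column) for column in columns]  # assumed estimator
    if method == "auto":
        divisors = sds
    elif method == "pareto":
        divisors = [math.sqrt(sd) for sd in sds]
    else:
        raise ValueError(f"unknown scaling method: {method!r}")
    return [[value / divisor for value, divisor in zip(spectrum, divisors)]
            for spectrum in spectra]


# Hypothetical toy data (not from the study): 3 spectra, 3 bins.
toy_spectra = [
    [10.0, 90.0, 950.0],
    [12.0, 100.0, 1000.0],
    [14.0, 110.0, 1050.0],
]
autoscaled = scale_bins(toy_spectra, "auto")
pareto_scaled = scale_bins(toy_spectra, "pareto")

# After autoscaling every bin has unit SD; after Pareto scaling each bin
# retains an SD equal to the square root of its original SD.
auto_sds = [statistics.stdev(column) for column in zip(*autoscaled)]
pareto_sds = [statistics.stdev(column) for column in zip(*pareto_scaled)]
```

The contrast between `auto_sds` and `pareto_sds` shows why Pareto scaling is the milder of the two: high-variance bins are reduced but still carry more weight than low-variance bins.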
Glog transformation
The glog transformation is given in equation (1), where
Plot of the generalised logarithm and extended generalised logarithm functions. The glog was plotted using a λ value of 1 × 10^{12 }(solid blue line) and the extended glog was plotted using a λ value of 1 × 10^{13 }and a
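As a point of reference, the following Python sketch evaluates the glog in the form in which it commonly appears in the variance-stabilisation literature, z = ln(y + sqrt(y^2 + λ)). Equation (1) is not reproduced in this text, so the functional form used here is an assumption, and the function name and test values are hypothetical.

```python
import math


def glog(y, lam):
    """Assumed glog form: ln(y + sqrt(y^2 + lambda)), with lambda > 0."""
    return math.log(y + math.sqrt(y * y + lam))


lam = 4.0  # hypothetical transformation parameter

# At zero intensity the transform equals half the log of lambda ...
value_at_zero = glog(0.0, lam)
# ... for intensities much larger than sqrt(lambda) it approaches ln(2y) ...
large_y_gap = abs(glog(1.0e6, 1.0) - math.log(2.0e6))
# ... and, unlike the ordinary logarithm, it is defined for negative
# intensities, with glog(y) + glog(-y) = ln(lambda).
symmetry_sum = glog(3.0, lam) + glog(-3.0, lam)
```

Under this assumed form, λ sets the intensity scale below which the transform behaves approximately linearly and above which it behaves logarithmically.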
In order to calculate the transform that minimises the technical variance, λ is calibrated using technical replicates generated from a single pooled biological sample. The replicate spectra are processed in exactly the same manner as the biological data set, i.e. normalisation, compression regions etc, to ensure all technical variance is accounted for when calibrating the glog parameters. The calibration was achieved using a maximum likelihood method proposed by Rocke and Durbin
The parameter λ was optimised by minimising the variance, S, (3) over
Here,
The optimisation of λ is achieved via the Nelder-Mead unconstrained nonlinear minimization routine in the MATLAB optimisation toolbox. The optimised λ value was then used to transform the binned intensities of each spectrum in the full biological data set. The MATLAB code developed here is included as additional file
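The calibration step can be illustrated with the following Python sketch, which should be read as a hedged approximation and not as the MATLAB code referred to above. Three assumptions are made: the glog takes the form ln(y + sqrt(y^2 + λ)); the variance that is minimised is the sum of squared deviations about each bin mean after multiplying the transformed data by a Jacobian term (the geometric mean of sqrt(y^2 + λ)), as is usual for maximum likelihood estimation of transformation parameters; and a golden-section search over log10(λ) is swapped in for the Nelder-Mead routine to keep the example dependency-free. All names, bounds, tolerances and data values are hypothetical.

```python
import math


def glog(y, lam):
    """Assumed glog form: ln(y + sqrt(y^2 + lambda))."""
    return math.log(y + math.sqrt(y * y + lam))


def scaled_variance(replicates, lam):
    """Sum of squared deviations about each bin mean, Jacobian-scaled."""
    values = [y for spectrum in replicates for y in spectrum]
    # Geometric mean of sqrt(y^2 + lambda), i.e. of 1 / (dz/dy); it keeps
    # the objective comparable between different lambda values.
    jacobian = math.exp(
        sum(0.5 * math.log(y * y + lam) for y in values) / len(values))
    total = 0.0
    for bin_values in zip(*replicates):  # one tuple per bin
        w = [jacobian * glog(y, lam) for y in bin_values]
        mean_w = sum(w) / len(w)
        total += sum((x - mean_w) ** 2 for x in w)
    return total


def calibrate_lambda(replicates, log10_low, log10_high,
                     tol=1e-6, max_iter=1000):
    """Golden-section search over log10(lambda); returns (lambda, objective)."""
    def objective(exponent):
        return scaled_variance(replicates, 10.0 ** exponent)

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = log10_low, log10_high
    best = min((objective(a), a), (objective(b), b))  # track best point seen
    for _ in range(max_iter):
        if b - a < tol:
            break
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        fc, fd = objective(c), objective(d)
        best = min(best, (fc, c), (fd, d))
        if fc < fd:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** best[1], best[0]


# Hypothetical technical replicates (5 spectra, 4 bins); not study data.
technical_replicates = [
    [0.8, 5.2, 48.0, 510.0],
    [1.3, 4.7, 52.0, 490.0],
    [1.0, 5.0, 50.0, 500.0],
    [0.7, 5.3, 47.0, 520.0],
    [1.2, 4.8, 53.0, 480.0],
]
lam_hat, s_hat = calibrate_lambda(technical_replicates, -4.0, 4.0)
```

Searching over log10(λ) rather than λ itself is a design choice made for this sketch, because plausible values of λ can span many orders of magnitude. The calibrated value would then be applied unchanged to every spectrum in the biological data set, as described above.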
The extended glog is given in equation (4) where an extra transformation parameter
The parameter
Since
For both calibration methods described here, the minimisation routine was terminated when the absolute change in λ was less than a predetermined value (here 1 × 10^{16}) or a maximum number of iterations was completed (here 1 × 10^{3}). Table
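The extended glog can be sketched in the same way. Equation (4) is not reproduced in this text, so the form below, in which the ordinary glog is applied to the intensity after subtracting an offset, is an assumption; the symbol `y0` for the extra transformation parameter, the function name and the numerical values are likewise hypothetical.

```python
import math


def extended_glog(y, lam, y0):
    """Assumed extended glog: the ordinary glog applied to (y - y0)."""
    shifted = y - y0
    return math.log(shifted + math.sqrt(shifted * shifted + lam))


# With a zero offset the assumed form reduces to the ordinary glog.
no_offset = extended_glog(5.0, 4.0, 0.0)
# At y = y0 the transform equals half the log of lambda, so the offset moves
# the approximately linear region of the curve along the intensity axis.
at_offset = extended_glog(2.0, 4.0, 2.0)
```

Under this assumption, both λ and the offset would be calibrated on the technical replicates before being applied to the biological data set.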
Analysis of Models
Each unscaled or scaled data set was then mean centred and PCA performed using PLS_Toolbox (Eigenvector Research, Inc., Wenatchee, WA, USA). Next, using the Discriminant Analysis Toolbox (Michael Kiefte, Dalhousie University, Canada
List of abbreviations used
NMR: nuclear magnetic resonance
PCA: principal component analysis
PLS-DA: partial least squares discriminant analysis
PC: principal component
LDA: linear discriminant analysis
glog: generalised logarithm transformation
1D: one dimensional
2D: two dimensional
JRES spectrum: 2D J-resolved NMR spectrum
pJRES: 1D skyline projection of a 2D JRES spectrum
ANOVA: analysis of variance
CV: coefficient of variation
Authors' contributions
HMP wrote the code implementing the methodology and completed the comparisons of the different scaling methods. MRV conceived of the study, participated in its completion and helped to draft the manuscript. CL and ULG conceived and tested the extended glog transformation. All authors read and approved the final manuscript.
Optimisation code.
Acknowledgements
HMP thanks the EPSRC and NERC for a Directed PhD studentship and MRV thanks the NERC for an Advanced Fellowship (NER/J/S/2002/00618). This work was partly supported by the NERC Post Genomic and Proteomic (PGP) Directed Program (NE/C507661/1). The authors gratefully acknowledge Prof. David Rocke, Prof. David Woodruff, Yuanxin Xi (University of California, Davis) and John Easton (Birmingham) for assistance with the MATLAB code, as well as Dr. Dov Stekel (Birmingham) for advice on the linear discriminant analyses. We also thank several people for supplying samples and/or NMR data, including Dr. David Allaway (Waltham Centre for Pet Nutrition) for the canine urine samples, Dr. Stephen George (University of Stirling) for the flounder liver samples, and Dr. Huifeng Wu and Adam Hines (Birmingham) for many of the NMR spectra.