Classifying the fingerprint of an NMR spectrum is a crucial step in many metabolomics experiments. Since many classification techniques, such as principal component analysis (PCA), depend upon differences in variance, it is important first to maximise the contribution of the wanted class variance between biological samples and to minimise the contribution of unwanted technical variance arising from sample preparation and from measurement of the NMR metabolic fingerprints. The generalised logarithm (glog) transform was developed to stabilise the variance between technical replicates under a two-component error model and has previously been applied to NMR spectra. To increase the effectiveness of the transform on NMR spectra, the glog was extended to include a baseline offset term, which decreases the unwanted noise contribution in the transformed spectra. The extended glog transformation is given as:

$$z = \ln\left[(y - y_0) + \sqrt{(y - y_0)^2 + \lambda}\right]$$

where z is the transformed intensity and y is the original intensity of the spectrum. y0 and λ are transformation parameters that must be determined.
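The transformation can be illustrated with a short sketch. It assumes the extended glog takes the form z = ln[(y − y0) + sqrt((y − y0)^2 + λ)], that is, the standard glog with the intensity shifted by the baseline offset y0. The function name `extended_glog` and the example values of λ and y0 are hypothetical and are not the optimised parameters from this study.

```python
import numpy as np

def extended_glog(y, lam, y0=0.0):
    """Extended glog sketch: ln((y - y0) + sqrt((y - y0)^2 + lam)).

    Assumed form, not taken from the authors' code.
    y   : original spectral intensities (scalar or array)
    lam : transformation parameter lambda (must be > 0)
    y0  : baseline offset parameter
    """
    y = np.asarray(y, dtype=float)
    shifted = y - y0  # remove the baseline offset first
    return np.log(shifted + np.sqrt(shifted ** 2 + lam))

# Illustrative intensities only (not real NMR data).
spectrum = np.array([-0.5, 0.0, 1.0, 100.0, 10000.0])
z = extended_glog(spectrum, lam=1.0, y0=0.0)
```

For intensities much larger than sqrt(λ) the transform behaves like ln(2(y − y0)), while near the baseline it is approximately linear, which is why it remains defined for the small or negative intensities found in noisy regions of a spectrum.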
Here we applied the extended glog transform to technical replicates of NMR spectra of tissue extracts from marine mussels in order to determine the optimised transformation parameters λ and y0. We then applied the optimised transformation to a data set comprising two classes of NMR spectra, from stressed and unstressed mussels. Following transformation, the classes are separated significantly better on a PCA scores plot than can be achieved with either untransformed data or data transformed using Pareto scaling, a widely used method in NMR metabolomics. In conclusion, we have demonstrated the value of the extended glog transformation for stabilising the technical variance in an NMR metabolomics data set, and have achieved significantly improved classification of NMR fingerprints from stressed and unstressed animals.
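The comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, the values of `lam` and `y0` are placeholders rather than optimised parameters, the glog form is the assumed one (standard glog with a baseline shift), and the helper names `pareto_scale` and `pca_scores` are hypothetical. Pareto scaling is taken in its usual sense of mean-centring each variable and dividing by the square root of its standard deviation.

```python
import numpy as np

def pareto_scale(X):
    """Mean-centre each column and divide by the square root of its SD."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # leave constant columns unscaled
    return centred / np.sqrt(sd)

def pca_scores(X, n_components=2):
    """PCA scores from the SVD of the mean-centred data matrix."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(centred, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Synthetic stand-in for a spectra matrix (rows: samples, columns: bins).
rng = np.random.default_rng(0)
X = rng.lognormal(size=(10, 50))

# Placeholder parameters; real values would come from technical replicates.
lam, y0 = 1.0, 0.0
X_glog = np.log((X - y0) + np.sqrt((X - y0) ** 2 + lam))

scores_raw = pca_scores(X)                   # untransformed
scores_pareto = pca_scores(pareto_scale(X))  # Pareto scaled
scores_glog = pca_scores(X_glog)             # glog transformed
```

Plotting the first two columns of each scores array, coloured by class, would give the kind of scores plots compared in the study.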
Technometrics 1995, 37(2):176-184.
Analytica Chimica Acta 2003, 490(1):265-276.