Open Access Correction

Correction: Three-parameter lognormal distribution ubiquitously found in cDNA microarray data and its application to parametric data treatment

Tomokazu Konishi

Author Affiliations

Faculty of Bioresource Sciences, Akita Prefectural University, Akita 010-0195, Japan

BMC Bioinformatics 2004, 5:82  doi:10.1186/1471-2105-5-82


The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2105/5/82


Received: 28 June 2004
Accepted: 28 June 2004
Published: 28 June 2004

© 2004 Konishi; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.

Correction to formulae in methods section [1]

The lognormal distribution model and estimation of the parameters

The method assumes that the original intensity data, r_i for i = 1, 2, ..., n, obey a lognormal distribution. The probability density function of the intensity data used was:

$$f(r_i) = \frac{k}{\sqrt{2\pi}\,\sigma\,(r_i - \gamma)} \exp\left[-\frac{\{\log(r_i - \gamma) - \mu\}^2}{2\sigma^2}\right] \quad \text{for } r_i > \gamma,$$

where k is a compensation constant (k = log e = 0.4343, with log denoting the common, base-10 logarithm), and σ and μ are the shape and scale parameters for log(r_i - γ), respectively.
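As a minimal sketch, the density above can be evaluated directly. The function name `lognormal3_pdf` and the constant name `K` are hypothetical and chosen for illustration only, and the code assumes that log denotes the base-10 logarithm, which is what the value k = 0.4343 implies.

```python
import math

# Compensation constant k = log10(e), approximately 0.4343.
K = math.log10(math.e)


def lognormal3_pdf(r, gamma, mu, sigma):
    """Density of r when log10(r - gamma) is normal with mean mu and s.d. sigma.

    Returns 0.0 for r <= gamma, where the density is not defined.
    """
    if r <= gamma:
        return 0.0
    z = (math.log10(r - gamma) - mu) / sigma
    norm = math.sqrt(2.0 * math.pi) * sigma * (r - gamma)
    return K / norm * math.exp(-0.5 * z * z)
```

The factor k arises because the derivative of log10(r - γ) with respect to r is log10(e)/(r - γ); without it the density would not integrate to one.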

The threshold parameter, γ, was found by trial and improvement: for each trial value, the distribution of log(r_i - γ) was checked by normal probability plotting, and the value that gave the best fit to the model was selected as γ. Goodness of fit was evaluated as the sum of absolute differences between the model and log(r_i - γ) within the interquartile range of the data. The parameter μ was taken as the median of log(r_i - γ), and σ was derived from the interquartile range of log(r_i - γ); these are robust alternatives to the arithmetic mean and standard deviation, respectively. To avoid divergences caused by pin-based differences, μ and σ were found separately for each data grid, that is, the group of data for DNA spots printed by an identical pin. Z-normalization was carried out for each datum as

$$Z_{r_i} = \frac{\log(r_i - \gamma) - \mu}{\sigma}.$$

Intensity data (r_i) less than γ were treated as "data not detected", since such data might contain negative noise larger than the signal (see Results).
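The estimation and normalization steps described above can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: all function names are hypothetical, the candidate values for γ are supplied by the caller, the plotting position (j + 0.5)/m is one common convention, and σ is obtained by dividing the sample interquartile range by that of a standard normal distribution (about 1.349). The text does not specify any of these details.

```python
import math
from statistics import NormalDist, median, quantiles

STD_NORMAL = NormalDist()
# IQR of a standard normal (about 1.349). Dividing a sample IQR by it gives a
# robust estimate of sigma; this scaling is an assumption, not from the text.
NORMAL_IQR = STD_NORMAL.inv_cdf(0.75) - STD_NORMAL.inv_cdf(0.25)


def fit_mu_sigma(logs):
    """Robust mu (median) and sigma (scaled IQR) of log10(r - gamma)."""
    q1, _, q3 = quantiles(logs, n=4)
    return median(logs), (q3 - q1) / NORMAL_IQR


def fit_score(intensities, gamma):
    """Sum of |observed - model| on a normal probability plot, central 50% only."""
    logs = sorted(math.log10(r - gamma) for r in intensities if r > gamma)
    mu, sigma = fit_mu_sigma(logs)
    m = len(logs)
    score = 0.0
    for j, y in enumerate(logs):
        p = (j + 0.5) / m  # plotting position (assumed convention)
        if 0.25 <= p <= 0.75:
            score += abs(y - (mu + sigma * STD_NORMAL.inv_cdf(p)))
    return score


def estimate_gamma(intensities, candidates):
    """Return the candidate gamma with the smallest fit score."""
    return min(candidates, key=lambda g: fit_score(intensities, g))


def z_normalize(intensities, gamma):
    """Z-scores for one data grid; None marks values treated as not detected.

    Values equal to gamma are also marked None because their log is undefined.
    """
    logs = [math.log10(r - gamma) for r in intensities if r > gamma]
    mu, sigma = fit_mu_sigma(logs)
    return [
        (math.log10(r - gamma) - mu) / sigma if r > gamma else None
        for r in intensities
    ]
```

In use, `estimate_gamma` would be called with a grid of trial values, and `z_normalize` would then be applied separately to each pin group with the selected γ. A finer or iteratively refined grid would correspond more closely to the trial-and-improvement process described in the text.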

References

  1. Tomokazu Konishi: Three-parameter lognormal distribution ubiquitously found in cDNA microarray data and its application to parametric data treatment.

    BMC Bioinformatics 2004, 5:5.