Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F: Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol 2002, 3:RESEARCH0034.
Gene-expression analysis is increasingly important in many fields of biological research. Understanding patterns of expressed genes is expected to provide insight into complex regulatory networks and will most probably lead to the identification of genes relevant to new biological processes, or implicated in disease. Two recently developed methods to measure transcript abundance have gained much popularity and are frequently applied. Microarrays allow the parallel analysis of thousands of genes in two differentially labeled RNA populations, while real-time RT-PCR provides the simultaneous measurement of gene expression in many different samples for a limited number of genes, and is especially suitable when only a small number of cells are available [2,3,4]. Both techniques have the advantage of speed, throughput and a high degree of potential automation compared to conventional quantification methods, such as northern-blot analysis, ribonuclease protection assay, or competitive RT-PCR. Nevertheless, these new approaches require the same kind of normalization as the traditional methods of mRNA quantification.
Several variables need to be controlled for in gene-expression analysis, such as the amount of starting material, enzymatic efficiencies, and differences between tissues or cells in overall transcriptional activity. Various strategies have been applied to normalize these variations. Under controlled conditions of reproducible extraction of good-quality RNA, the gene transcript number is ideally standardized to the number of cells, but accurate enumeration of cells is often precluded, for example when starting with solid tissue. Another frequently applied normalization scalar is the RNA mass quantity, especially in northern blot analysis. There are several arguments against the use of mass quantity. The quality of the RNA and the related efficiency of the enzymatic reactions are not taken into account. Moreover, in some instances it is impossible to quantify this parameter, for example when only minimal amounts of RNA are available from microdissected tissues. Probably the strongest argument against the use of total RNA mass for normalization is that total RNA consists predominantly of rRNA molecules, and is therefore not always representative of the mRNA fraction. This was recently evidenced by a significant imbalance between rRNA and mRNA content in approximately 7.5% of mammary adenocarcinomas. Also, it has been reported that rRNA transcription is affected by biological factors and drugs [6,7,8]. Further drawbacks to the use of 18S or 28S rRNA molecules as standards are their absence in purified mRNA samples, and their high abundance compared to target mRNA transcripts. The latter makes it difficult to accurately subtract the baseline value in real-time RT-PCR data analysis.
Statement of the problem
To date, internal control genes are most frequently used to normalize the mRNA fraction. Such an internal control, often referred to as a housekeeping gene, should not vary in the tissues or cells under investigation, or in response to experimental treatment. However, many studies use these constitutively expressed control genes without properly validating their presumed stability of expression. The literature shows that housekeeping gene expression, although occasionally constant in a given cell type or experimental condition, can vary considerably (reviewed in [9,10,11,12]). With the increased sensitivity, reproducibility and large dynamic range of real-time RT-PCR methods, the requirements for a proper internal control gene have become increasingly stringent.
Purpose and what was done
In this study, we carried out an extensive evaluation of 10 commonly used housekeeping genes in 13 different human tissues, and outlined a procedure for calculating a normalization factor based on multiple control genes for more accurate and reliable normalization of gene-expression data. Furthermore, this normalization factor was validated in a comparative study with frequently applied microarray scaling factors using publicly available microarray data.
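As an illustration of what a normalization factor based on multiple control genes can look like, the following minimal sketch takes the geometric mean of the relative expression quantities of several control genes in one sample (the geometric averaging named in the title of the article) and divides a target gene's quantity by it. The function names and the numeric values are hypothetical, chosen only for illustration; the sketch assumes that raw measurements have already been converted to positive relative quantities, and it is not the full procedure described in the study.

```python
import math


def normalization_factor(control_quantities):
    """Return the geometric mean of the control-gene quantities of one sample.

    `control_quantities` is a sequence of positive relative expression
    quantities, one per control gene (hypothetical input format).
    """
    if not control_quantities:
        raise ValueError("at least one control gene is required")
    if any(q <= 0 for q in control_quantities):
        raise ValueError("relative quantities must be positive")
    # Average in log space, then transform back: exp(mean(log(q))).
    log_mean = sum(math.log(q) for q in control_quantities) / len(control_quantities)
    return math.exp(log_mean)


def normalize(target_quantity, control_quantities):
    """Divide a target gene's quantity by the sample's normalization factor."""
    return target_quantity / normalization_factor(control_quantities)


# Illustrative values only: two control genes and one target gene in one sample.
controls = [1.0, 4.0]
factor = normalization_factor(controls)      # geometric mean of 1.0 and 4.0
normalized_target = normalize(6.0, controls)
```

A geometric mean is less sensitive than an arithmetic mean to a single control gene with an outlying value, and it treats control genes of different abundance on an equal, multiplicative footing.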