A non-parametric meta-analysis approach for combining independent microarray datasets: application using two microarray datasets pertaining to chronic allograft nephropathy
1 Department of Biostatistics, Virginia Commonwealth University, Richmond, VA 23298, USA
2 Department of Surgery, Virginia Commonwealth University, Richmond, VA 23298, USA
3 Massey Cancer Center, Virginia Commonwealth University, Richmond, VA 23298, USA
BMC Genomics 2008, 9:98 doi:10.1186/1471-2164-9-98Published: 26 February 2008
With the popularity of DNA microarray technology, multiple groups of researchers have studied the gene expression of similar biological conditions. Different methods have been developed to integrate the results from various microarray studies, though most of them rely on distributional assumptions, such as the t-statistic based, mixed-effects model, or Bayesian model methods. However, often the sample size for each individual microarray experiment is small. Therefore, in this paper we present a non-parametric meta-analysis approach for combining data from independent microarray studies, and illustrate its application on two independent Affymetrix GeneChip studies that compared the gene expression of biopsies from kidney transplant recipients with chronic allograft nephropathy (CAN) to those with normal functioning allograft.
The simulation study comparing the non-parametric meta-analysis approach to a commonly used t-statistic based approach shows that the non-parametric approach has better sensitivity and specificity. For the application on the two CAN studies, we identified 309 distinct genes that expressed differently in CAN. By applying Fisher's exact test to identify enriched KEGG pathways among those genes called differentially expressed, we found 6 KEGG pathways to be over-represented among the identified genes. We used the expression measurements of the identified genes as predictors to predict the class labels for 6 additional biopsy samples, and the predicted results all conformed to their pathologist diagnosed class labels.
We present a new approach for combining data from multiple independent microarray studies. This approach is non-parametric and does not rely on any distributional assumptions. The rationale behind the approach is logically intuitive and can be easily understood by researchers not having advanced training in statistics. Some of the identified genes and pathways have been reported to be relevant to renal diseases. Further study on the identified genes and pathways may lead to better understanding of CAN at the molecular level.