Skip to main content

Improving the power for detecting overlapping genes from multiple DNA microarray-derived gene lists

Abstract

Background

In DNA microarray gene expression profiling studies, a fundamental task is to extract statistically significant genes that meet certain research hypothesis. Currently, Venn diagram is a frequently used method for identifying overlapping genes that meet the investigator's research hypotheses. However this simple operation of intersecting multiple gene lists, known as the Intersection-Union Tests (IUTs), is performed without knowing the incurred changes in Type 1 error rate and can lead to loss of discovery power.

Results

We developed an IUT adjustment procedure, called Relaxed IUT (RIUT), which is proved to be less conservative and more powerful for intersecting independent tests than the traditional Venn diagram approach. The advantage of the RIUT procedure over traditional IUT is demonstrated by empirical Monte-Carlo simulation and two real toxicogenomic gene expression case studies. Notably, the enhanced power of RIUT enables it to identify overlapping gene sets leading to identification of certain known related pathways which were not detected using the traditional IUT method.

Conclusion

We showed that traditional IUT via a Venn diagram is generally conservative, which may lead to loss discovery power in DNA microarray studies. RIUT is proved to be a more powerful alternative for performing IUTs in identifying overlapping genes from multiple gene lists derived from microarray gene expression profiling.

Background

Nowadays many microarray-based studies adopt complex experimental design involving multiple treatments, cell lines/tissues, multiple dosages, time points, phenotypes and so on [1–4]. These studies are often involved with complex research hypotheses. For instance in one of our previous studies [1], we were interested in identifying differentially expressed genes (DEGs) in responding to common bile duct ligation in bone marrow stem cells (BMSCs) compared against primary hepatocytes. Two DEG sets from BMSCs and hepatocytes were identified respectively, and the overlapping genes across the two cell types were obtained. The overlapping genes produced across the two cell types allowed the identification of common biological pathways, ontological classes, and biological mechanisms across the two cell types in responding to the treatment.

The intersection operations on multiple gene lists are equivalent to performing multiple tests for the combined hypotheses on every single gene. Although there are many statistical tests proposed for gene expression studies [5–8], the problems of obtaining overlapping gene sets based on multiple tests were overlooked in microarray-based studies. To obtain genes that satisfy the specific hypotheses, researchers simply overlap the gene sets from multiple gene sets and visualized them in Venn diagrams. However, because of lacking multiplicity adjustment, this procedure overlooks the changes of statistical properties, i.e., power, type 1 error rate, p-values, during the intersection operations. This type of multiple testing for finding overlapping genes is known as the Intersection-Union Test (IUT). Despite some early efforts [9–11], the statistical properties and adjustment algorithms of IUT are not well established. Berger has proved that IUT without multiplicity adjustment is a level-α test [10], when the individual tests were controlled at type 1 error rate α. However, the family wise error rate (FWER) for IUT α' is generally much smaller than α. Therefore, performing IUT without multiplicity adjustment would be very conservative and result in too many false negatives.

In this paper, we show that current overlapping operation, applying no p-value adjustment for IUT, is overly conservative in general. As a result, current microarray studies suffer from low power in detecting overlapping genes and therefore limit its use in biological data mining. We developed an analytical solution, named as Relaxed IUT (RIUT) for the multiplicity adjustment of IUTs under certain conditions. We theoretically proved that our proposed method is a less conservative and more powerful than current approaches. We demonstrated the superiority of RIUT for detecting overlapping genes in simulated data sets and complex microarray-based toxicogenomic studies.

Results

Monte-Carlo simulation results of RIUT

As an example to showcase the power of RIUT, the mRNA expression of a given gene is tested whether it is significantly altered by a drug treatment in multiple tissues. Suppose gene expressions were measured in m different tissues and one is interested in the overlapping DEGs. For each tissue, a two-sample t-test is performed between a treatment group and a control group, each containing n replicates to obtain a list of significant genes for that tissue. Then we have an IUT that is constructed by m individual tests, each for a different tissue:

H0i: the drug has no effect in the i th tissue, i.e., μ ti = μ ci ,

H Ai : the drug has effect in the i th tissue, i.e., μ ti ≠ μ ci , 1 ≤ i ≤ m,

where μ ti and μ ci denote the expression mean of the treatment and the control groups respectively of the i th tissue. The hypotheses for IUT are H0: the drug shows no effect on at least one tissue vs. H A : the drug shows effect on all tissues.

For the Monte-Carlo simulation, the expression data for the treatment and control groups were modeled as normal distributions N(μ ti , 1) and N(μ ci , 1) respectively, where μ ci = 0, 1 ≤ i ≤ m. We then drew n = 5 samples from each of these distributions and apply RIUT and BIUT to these simulated data. Table 1 shows the estimated type 1 error rate by 10000 simulated instances of IUT formed by 2 individual tests. Expression mean μt 1is fixed at 0 for the first tissue and different values of μt 2were used to represent the drug having diverse effects (μt 2= 0, 0.5, ..., 4.5, 5.0) on the second tissue. Overall, the drug has no effect on the first tissue and the null hypothesis H0 of IUT is true. Results show that both RIUT and traditional Berger's IUT (BIUT) are bounded by nominal α at which the individual hypotheses were tested. The actual type 1 error for IUT is generally smaller than α. RIUT is less conservative than BIUT as it achieves a type 1 error rate that is closer to α. To prove concept, we also tested two meta-analysis methods for combining independent tests, the Fisher's method [15] and Stouffer's method [16]. The results show that their actual type 1 rate can be so much higher than the nominal one that these methods are not suitable for the IUTs.

Table 1 Monte-Carlo estimates of type 1 error rate α'(%)

At μt 1= μt 2= 0, the resultant p-value distributions generated from 10000 instances were illustrated in Figure 1. Theoretically, the p-values originated from null hypothesis should appear approximately uniformly distributed (the dashed line). The adjusted p' using RIUT achieved the desired distribution as shown in Figure 1a. However, Figure 1b shows that the unadjusted p is seriously skewed to the right, indicating that the test is overly conservative.

Figure 1
figure 1

Distributions of the p-values of 10000 simulations with μ t 1 = μ t 2 = 0.0 using RIUT and BIUT. a. The distribution of RIUT p-value p'. b. The distribution of unadjusted BIUT p-value p. The dashed line indicates the hypothetical uniform distribution of the p-values of the 10000 runs.

Table 2 shows the power estimate of IUT consisting of two individual tests and both individual null hypotheses are not true (μt 1= 0.5, μt 2= 0, 0.5, ..., 4.5, 5.0). RIUT demonstrated higher power than BIUT at small and moderate effect size. At large effect size, RIUT and BIUT show essentially the same power.

Table 2 Monte-Carlo estimates of power (%)

As a more realistic simulation, we pooled instances being positive (H0 not true) and negative (H0 true). We use γ to denote the simulation Bernoulli probability that an individual hypothesis is not null such that Prob(H Ai ) = γ. The population means were set at μt 1= μt 2= 0.5. According to this procedure, we simulated 10000 instances under different γ and the resultant number of true negative and true positive were denoted as K0 and K1 respectively. The identified false positive and true positives were notated as V and S respectively. The results in Table 3 confirmed our previous observations that RIUT is a more powerful and less conservative. The same patterns were observed in simulations using other parameter values (results not shown).

Table 3 Simulation results of pooled instances

Identifying overlapping genes that respond to multiple drug treatment

This example illustrates how RIUT algorithm can be used in real DNA microarray-based multiple-testing problems. Originally generated from the MAQC project [2], this data set consists of rat RNA samples that came from six treatment/tissue groups. The treatment/tissue groups were aristolochic acid/liver, aristolochic acid/kidney, riddelliine/liver, comfrey/liver, control/liver and control/kidney. There were six biological replicates in each treatment/tissue group. mRNA expression profiles were obtained using four commercial platforms including Affymetrix (Rat Genome 230 2.0), Agilent (Whole Rat Genome Oligo Microarray, G4131A), Applied Biosystems (Rat Genome Survey Microarray) and GE Healthcare (RatWhole Genome Bioarray, 300031) in five different labs with two labs using the Affymetrix microarray platform. Totally, 180 chips were obtained and the cross-platform probe-mapping gave rise to 4609 genes commonly detected across four platforms.

In this example, our goal is to identify the common genes responding to different drug treatments. These genes may shed lights on common cytotoxicity mechanisms of these drugs. We used the liver/control (L_CTL) group as control and the 3 drugs treatment at rat liver were referred to as L_AA, L_CFY, and L_RDL respectively. The IUT consists of two individual tests, (i.e., t1: L_AA vs. L_CTL and t2: L_CFY vs. L_CTL), each testing whether a gene differentially expressed in response to one specific drug treatment versus control. The combined IUT was used to identify the genes that differentially expressed in combined each pair of the three treatments. Table 4 shows the number of overlapping differentially expressed genes identified using RIUT compared with the traditional BIUT at different labs/platforms. As expected, the number of significant genes obtained using RIUT is consistently greater than that obtained using BIUT. The magnitude of increase ranges from 13% to 184%. For many IUTs, the number of identified overlapping genes using traditional BIUT is close to the number of nominal false positives (230), suggesting that BIUT lacks power to identify true overlapping differentially expressed genes. The testing results using real microarray data confirm our analytical results of Theorem 1 and Theorem 2 (shown in Methods). In addition our Monte-Carlo simulation demonstrates that RIUT is a more powerful and less conservative approach than BIUT.

Table 4 Number of differentially expressed genes in all drug treatment groups

Detecting genes with time-course and dose-response effect to chemical treatment

To further demonstrate the applicability of RIUT in microarray studies, we focus on the rat cadmium toxicogenomic data set [3, 18]. This study employed a more complex study design, in which both gene expression and cytotoxicity changes were profiled in a multi-dose multi-time-point setting. Briefly, primary rat hepatocytes were isolated and were exposed to three different doses of cadmium acetate (0, 1.25 and 2.0 μM) for 2 h. Cells were collected at 0, 3, 6, 12 and 24 h in all three groups (0, 1.25 and 2.0 μM Cd) for cytotoxicity evaluation by lactase dehydrogenase (LDH) leakage as well as for mRNA expression profiling by DNA microarray. Affymetrix GeneChip® oligonucleotide arrays (RatTox U34) were used for mRNA expression profiling. There are 972 probe sets representing ~800 important toxicology-related genes in the RT U34 array. The microarray experiment was repeated using primary hepatocytes from 3 animals, each with 2 replicates (independent cultures) for each dosage (3 dosage levels) at each time point (5 time points), resulting in a total of 90 chips (3 animals • 2 replicates • 3 doses • 5 time points). The 2 replicates were averaged in our analysis.

Firstly, to identify differentially expressed genes in responding to cadmium treatment at each time point, two-sample t-tests were performed between the treatment (1.25 and 2.0 μM Cd) and control at each time point. Secondly, to identify the genes with persistent differentially expression due to cadmium exposure across different time points, the overlapping of the DEGs at both short term (3 h) and long term (12 h) were identified using our proposed method. The second research question was then formulated an IUT problem which could be solved using our proposed method.

We constructed an IUT consisting of two individual tests (t1: 2.0 μM Cd vs control at 3 h, and t2: 2.0 μM Cd vs control at 12 h). The joint p-value distribution from t1 and t2 for all genes is illustrated in Figure 2. It is shown that t1 and t2 were approximately independent (R2 = 0.03) and their p-values were approximately uniformly distributed under null hypothesis (p1 > 0.05 and p2 > 0.05). Therefore the assumptions in Theorem 1 and 2 were not violated. In fact for most microarray-based studies, these assumptions need to be checked by scatter plot.

Figure 2
figure 2

The joint p-value distribution from t 1 and t 2 for all genes. The IUT consists of two individual two-sample t tests (t1: 2.0 μM Cd vs. control at 3 h, t2: 2.0 μM Cd vs. control at 12 h).

We performed RIUT and BIUT (m = 2, n = 3) for all 972 probe sets. RIUT identified 80 overlapping probe sets and traditional BIUT identified 19 at α = 0.05. The power of IUT methods directly affected the identified gene set on which the follow-up pathway interpretation was based. To demonstrate this effect, a pathway enrichment search was performed by comparing the IUT-identified gene sets with the specific KEGG pathways [19–21] using a Fisher's Exact Test on DAVID 2006 [22], resulting in an enrichment p-value for each pathway which were listed in Table 5. Using the RIUT gene set, we identified 3 significantly over-enriched pathways (Fisher Exact p-value < 0.05). However, using the BIUT gene set, no significant pathways in KEGG was identified. It has been reported that the metabolism of xenobiotics by cytochrome p450 pathway is significantly affected by cadmium exposure [23, 24]. The activity of MAPK signaling pathway and porphyrin and chlorophyll metabolism pathway have also been affected by environmental cadmium exposure [25, 26]. Again, Our analysis results using Cd time-course data set indicate that RIUT is a more powerful method than BIUT and the significance can be seen in the biological pathway identification.

Table 5 KEGG pathways Identified and the enrichment p-values

Discussion

RIUT can be improved in three ways to be applicable in more scenarios: (1) estimating unknown nuisance parameters; (2) dealing with more than 2 individual tests; and (3) combining non-independent tests. For the scenario (1), the current approach for estimating π with an arbitrary λ was originally proposed by Storey [14] for estimating false discovery rate. The authors proposed a bootstrap method for finding optimal λ which can also be used here. It should be noted that Theorem 2 is valid no matter how the λ is chosen and how good π is estimated. For IUTs consisting of more than 2 individual tests, it is difficult to obtain the analytical solution as Theorem 1. However, we can apply a step-up procedure which agglomeratively applies RIUT on its least significant individual test. It is more difficult to extend our procedure for non-independent tests. It may need a resampling-based algorithm to incorporating correlation structure of multiple tests and dealing with non-normality issues. Resampling needs intensive computation which can be largely offset by today's powerful and inexpensive computing facility. Resampling of IUT is based on resampling of individual tests which can be conveniently performed by either bootstrap or permutation. Bootstrap and permutation were discussed in many literatures [12, 27, 28].

Conclusion

Our study demonstrated that the current unadjusted IUT approaches were overly conservative, which resulted in loss of power in finding overlapping genes in microarray-based gene expression studies. Our proposed RIUT was analytically proved to be a more powerful and less conservative approach than the current unadjusted IUT. The power improvement is more apparent in tests with weak and moderate effect sizes. This is also demonstrated in Monte-Carlo simulations and real case studies. In addition, certain known biologically relevant pathways were identified using the RIUT-derived overlapping genes which were not detected by using the traditional BIUT.

Appendix

Let X denote the random vector of data values. Suppose the probability distribution of X depends on an unknown parameter θ. The set of possible values for θ will be denoted by Θ. Suppose we have m individual tests and let R i denote a rejection region for a level-α test of H0i: θ ∈ Θ i versus H Ai : θ ∈ Θ i c MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeuiMde1aa0baaSqaaiabdMgaPbqaaiabdogaJbaaaaa@3023@ , 1 ≤ i ≤ m, where Θ i is a specified subset of Θ and Θ i c MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeuiMde1aa0baaSqaaiabdMgaPbqaaiabdogaJbaaaaa@3023@ is its complement. Then IUT tests the union of sets against an intersection of sets. H 0 : θ ∈ Θ 0 = ∪ i = 1 m Θ i MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemisaG0aaSbaaSqaaiabicdaWaqabaGccqGG6aGocqaH4oqCcqGHiiIZcqqHyoqudaWgaaWcbaGaeGimaadabeaakiabg2da9maatadabaGaeuiMde1aaSbaaSqaaiabdMgaPbqabaaabaGaemyAaKMaeyypa0JaeGymaedabaGaemyBa0ganiablQIivbaaaa@3F23@ versus H A : θ ∈ Θ 0 c = ∩ i = 1 m Θ i c MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemisaG0aaSbaaSqaaiabdgeabbqabaGccqGG6aGocqaH4oqCcqGHiiIZcqqHyoqudaqhaaWcbaGaeGimaadabaGaem4yamgaaOGaeyypa0ZaaqbmaeaacqqHyoqudaqhaaWcbaGaemyAaKgabaGaem4yamgaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabd2gaTbqdcqWIPissaaaa@41D4@ , with the rejection region R = ∩ i = 1 m R i MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOuaiLaeyypa0ZaaqbmaeaacqWGsbGudaWgaaWcbaGaemyAaKgabeaaaeaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGTbqBa0GaeSykIKeaaaa@36EC@ . In other words, the IUT rejects only if all of the tests reject.

Berger's Theorem: IUT with rejection region R is a level-α test of H0 versus H A .

Proof. For any θ ∈ Θ0 and for any 1 ≤ l ≤ m, we have P θ ( R ) = P θ ( ∩ i = 1 m R i ) ≤ P θ ( R l ) ≤ α MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiuaa1aaSbaaSqaaiabeI7aXbqabaGccqGGOaakcqWGsbGucqGGPaqkcqGH9aqpcqWGqbaudaWgaaWcbaGaeqiUdehabeaakiabcIcaOmaauadabaGaemOuai1aaSbaaSqaaiabdMgaPbqabaaabaGaemyAaKMaeyypa0JaeGymaedabaGaemyBa0ganiablMIijbGccqGGPaqkcqGHKjYOcqWGqbaudaWgaaWcbaGaeqiUdehabeaakiabcIcaOiabdkfasnaaBaaaleaacqWGSbaBaeqaaOGaeiykaKIaeyizImQaeqySdegaaa@4D18@ . Therefore, is IUT is a level-α test. ▪

Methods

IUT and UIT

Suppose a gene on which a number of α-level hypothesis tests were performed, represented as t1, t2,..., t m , where m is the number of individual tests. Each test t i tests the null hypothesis H0iversus alternative hypothesis H Ai . We can combine all tests into a Union-Intersection Test (UIT) which rejects if any of the t i rejects. We can also combine all tests into an Intersection-Union Test (IUT) which rejects if all the t i reject. The UIT tests the hypothesis H0 = {all H0iare true} against H A = {at least one H0iis false} and the IUT tests the hypothesis H0 = {at least one H0iis true} against H A = {all H0iare false}. IUT and UIT were named from the fact that their null and alternative hypothesis can be described by set intersections and unions. (see Appendix for details).

The FWER of UIT is defined as α' = Pr(Reject at least one H0i| all H0iare true). It is well known that in general α' ≠ α. For example, if α = 0.05 and m = 5, α' would be about 0.23 when all individual tests are independent. Therefore there is a need to adjust α' for IUT and there exist many procedures to do so, ranging from simple Bonferroni correction to computer-intensive resampling-based correction [12, 13]. The FWER IUT is α' = Pr(Reject all H0i| at least one H0iis true). It is also obvious that α' ≠ α for IUT in general. However unlike the well studied UIT, there is no known procedure for adjusting the FWER α' for IUT. The unadjusted IUT, also known as the Berger's approach, denoted as BIUT, suggests that the overall unadjusted p-value for IUT is

p = max p i , 1 ≤ i ≤ m, (1)

where p i is the p-value for individual tests t i . Berger proved [10] that the unadjusted IUT is a level-α test if all t i are level-α tests. Berger's also showed that the above IUT is a size-α test under certain trivial case such as the case when exactly one H0iis true while all the other H0iare false. However, the unadjusted approach is not a size-α test in general. For example, when considering two independent individual tests t1 and t2, the chance of rejecting both hypotheses is α2 rather than α if both H01 and H02 are true. Nonetheless, due to its simplicity, the unadjusted IUT approach was implicitly adopted by current microarray studies when overlapping genes were taken from several significant gene lists. This BIUT is equivalent to the Venn diagram in obtaining overlapping genes from multiple significant gene lists.

Exact solution for IUT consisting of independent tests

Berger's approach can be very conservative and therefore substantial power could be lost for detecting overlapping genes. Given the observed p-values (p1, p2,..., p m ) for the m individual tests, we are interested in estimate p-value for the entire test H0 of IUT. Here, we define a Westfall-Young-style p-value [12] for IUT denoted as p',

p ' = Pr ( all  P j ≤ p | H 0 ) = Pr ( max P j ≤ max p i | H 0 ) , 1 ≤ i , j ≤ m , MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqbaeaabiqaaaqaaiabdchaWjabcEcaNiabg2da9iGbccfaqjabckhaYjabcIcaOiabbggaHjabbYgaSjabbYgaSjabbccaGiabdcfaqnaaBaaaleaacqWGQbGAaeqaaOGaeyizImQaemiCaa3aaqqaaeaacqWGibasdaWgaaWcbaGaeGimaadabeaaaOGaay5bSdGaeiykaKcabaqbaeqabeGaaaqaaiabg2da9iGbccfaqjabckhaYnaabmaabaGagiyBa0MaeiyyaeMaeiiEaGNaemiuaa1aaSbaaSqaaiabdQgaQbqabaGccqGHKjYOcyGGTbqBcqGGHbqycqGG4baEcqWGWbaCdaWgaaWcbaGaemyAaKgabeaakmaaeeaabaGaemisaG0aaSbaaSqaaiabicdaWaqabaaakiaawEa7aaGaayjkaiaawMcaaiabbYcaSaqaaiabigdaXiabgsMiJkabdMgaPjabcYcaSiabdQgaQjabgsMiJkabd2gaTjabcYcaSaaaaaaaaa@6603@
(2)

where Pj denote the distribution for the p-value of the j th hypothesis under null hypothesis for IUT H0 = {at least one H0iis true}. The least significant p-value p is an observed statistic and the random variable max P j is the test statistic under H0. This definition is intuitive as p-value measures the probability of false positive under null hypothesis, where a false positive of IUT means that all the p-values under H0 are less than the observed p. Similar to the above equation, we have the following relationship between the IUT FWER α', and individual test type 1 error rate α.

α' = Pr(all P j ≤ α |H0) 1 ≤ i, j ≤ m (3)

Unlike UIT, the null hypothesis H0 of IUT is a composite hypothesis and contains nuisance parameters. However under certain conditions, it is possible to derive the analytical solution for α'. We obtain the following theorems:

Theorem 1

For an IUT constructed from two independent tests t1: H01 vs. HA 1, and t2: H02 vs. HA 2, both at significance level α, suppose the observed p-values for t1 and t2 are uniformly distributed under H01 and H02 respectively, the exact FWER α' for the IUT is controlled at

α ' = ( 1 − π 1 ) π 2 ( 1 − β 2 ) α + ( 1 − π 2 ) π 1 ( 1 − β 1 ) α + ( 1 − π 1 ) ( 1 − π 2 ) α 2 1 − π 1 π 2 , MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeqySdeMaei4jaCIaeyypa0tcfa4aaSaaaeaacqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcIcaOiabigdaXiabgkHiTiabek7aInaaBaaabaGaeGOmaidabeaacqGGPaqkcqaHXoqycqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabikdaYaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIXaqmaeqaaiabcIcaOiabigdaXiabgkHiTiabek7aInaaBaaabaGaeGymaedabeaacqGGPaqkcqaHXoqycqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeiikaGIaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcMcaPiabeg7aHnaaCaaabeqaaiabikdaYaaaaeaacqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeqiWda3aaSbaaeaacqaIYaGmaeqaaaaacqGGSaalaaa@6BE6@
(4)

where the true probabilities of alternative hypotheses are Pr(HA 1) = π1 and Pr(HA 2) = π2; the type 2 error rate for t1 and t2 are β1 and β2.

Proof:

α ' = Pr (  max( P 1  , P 2 ) ≤ α | H 0 ) = Pr ( P 1 ≤ α , P 2 ≤ α | H 0 ) = Pr ( P 1 ≤ α , P 2 ≤ α , H 0 ) Pr ( H 0 ) = Pr ( P 1 ≤ α , P 2 ≤ α , ( H 01  or  H 02 ) ) Pr ( H 01  or  H 02 ) = Pr ( P 1 ≤ α , P 2 ≤ α , H 01 , H A 2 ) + Pr ( P 1 ≤ α , P 2 ≤ α , H A 1 , H 02 ) + Pr ( P 1 ≤ α , P 2 ≤ α , H 01 , H 02 ) / 1 − Pr ( H A 1 )Pr( H 02 )    independence = ( 1 − Ï€ 1 ) Ï€ 2 ( 1 − β 2 ) α + ( 1 − Ï€ 2 ) Ï€ 1 ( 1 − β 1 ) α + ( 1 − Ï€ 1 ) ( 1 − Ï€ 2 ) α 2 1 − Ï€ 1 Ï€ 2  uniform MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGceaqabeaacqaHXoqycqGGNaWjcqGH9aqpcyGGqbaucqGGYbGCcqGGOaakcqqGGaaicqqGTbqBcqqGHbqycqqG4baEcqqGOaakcqWGqbaudaWgaaWcbaGaeGymaedabeaakiabbccaGiabbYcaSiabdcfaqnaaBaaaleaacqaIYaGmaeqaaOGaeiykaKIaeyizImQaeqySde2aaqqaaeaacqWGibasdaWgaaWcbaGaeGimaadabeaaaOGaay5bSdGaeiykaKcabaGaeyypa0JagiiuaaLaeiOCaiNaeiikaGIaeeiiaaIaemiuaa1aaSbaaSqaaiabigdaXaqabaGccqqGGaaicqGHKjYOcqaHXoqycqGGSaalcqWGqbaudaWgaaWcbaGaeGOmaidabeaakiabgsMiJkabeg7aHnaaeeaabaGaemisaG0aaSbaaSqaaiabicdaWaqabaaakiaawEa7aiabcMcaPaqaaiabg2da9KqbaoaalaaabaGagiiuaaLaeiOCaiNaeiikaGIaeeiiaaIaemiuaa1aaSbaaeaacqaIXaqmaeqaaiabbccaGiabgsMiJkabeg7aHjabcYcaSiabdcfaqnaaBaaabaGaeGOmaidabeaacqGHKjYOcqaHXoqycqqGSaalcqWGibasdaWgaaqaaiabicdaWaqabaGaeiykaKcabaGagiiuaaLaeiOCaiNaeiikaGIaemisaG0aaSbaaeaacqaIWaamaeqaaiabcMcaPaaaaOqaaiabg2da9maalaaabaGagiiuaaLaeiOCaiNaeiikaGIaeeiiaaIaemiuaa1aaSbaaSqaaiabigdaXaqabaGccqqGGaaicqGHKjYOcqaHXoqycqGGSaalcqWGqbaudaWgaaWcbaGaeGOmaidabeaakiabgsMiJkabeg7aHjabcYcaSiabbIcaOiabdIeainaaBaaaleaacqaIWaamcqaIXaqmaeqaaOGaeeiiaaIaee4Ba8MaeeOCaiNaeeiiaaIaemisaG0aaSbaaSqaaiabicdaWiabikdaYaqabaGccqqGPaqkcqGGPaqkaeaacyGGqbaucqGGYbGCcqGGOaakcqqGGaaicqWGibasdaWgaaWcbaGaeGimaaJaeGymaedabeaakiabbccaGiabb+gaVjabbkhaYjabbccaGiabdIeainaaBaaaleaacqaIWaamcqaIYaGmaeqaaOGaeiykaKcaaaqaaiabg2da9maalyaaeaqabeaacyGGqbaucqGGYbGCcqGGOaakcqqGGaaicqWGqbaudaWgaaWcbaGaeGymaedabeaakiabbccaGiabgsMiJkabeg7aHjabcYcaSiabdcfaqnaaBaaaleaacqaIYaGmaeqaaOGaeyizImQaeqySdeMaeiilaWIaemisaG0aaSbaaSqaaiabicdaWiabigdaXaqabaGccqGGSaalcqWGibasdaWgaaWcbaGaemyqaeKaeGOmaidabeaakiabcMcaPiabgUcaRaqaaiGbccfaqjabckhaYjabcIcaOiabbccaGiabdcfaqnaaBaaaleaacqaIXaqmaeqaaOGaeeiiaaIaeyizImQaeqySdeMaeiilaWIaemiuaa1aaSbaaSqaaiabikdaYaqabaGccqGHKjYOcqaHXoqycqGGSaalcqWGibasdaWgaaWcbaGaemyqaeKaeGymaedabeaakiabcYcaSiabdIeainaaBaaaleaacqaIWaamcqaIYaGmaeqaaOGaeiykaKIaey4kaScabaGagiiuaaLaeiOCaiNaeiikaGIaeeiiaaIaemiuaa1aaSbaaSqaaiabigdaXaqabaGccqqGGaaicqGHKjYOcqaHXoqycqGGSaalcqWGqbaudaWgaaWcbaGaeGOmaidabeaakiabgsMiJkabeg7aHjabcYcaSiabdIeainaaBaaaleaacqaIWaamcqaIXaqmaeqaaOGaeiilaWIaemisaG0aaSbaaSqaaiabicdaWiabikdaYaqabaGccqGGPaqkaaqaaiabigdaXiabgkHiTiGbccfaqjabckhaYjabcIcaOiabbccaGiabdIeainaaBaaaleaacqWGbbqqcqaIXaqmaeqaaOGaeeykaKIaeeiuaaLaeeOCaiNaeeikaGIaemisaG0aaSbaaSqaaiabicdaWiabikdaYaqabaGccqGGPaqkaaGaeeiiaaIaeeiiaaIaeeiiaaIaeeyAaKMaeeOBa4MaeeizaqMaeeyzauMaeeiCaaNaeeyzauMaeeOBa4MaeeizaqMaeeyzauMaeeOBa4Maee4yamMaeeyzaugabaGaeyypa0tcfa4aaSaaaeaacqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcIcaOiabigdaXiabgkHiTiabek7aInaaBaaabaGaeGOmaidabeaacqGGPaqkcqaHXoqycqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabikdaYaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIXaqmaeqaaiabcIcaOiabigdaXiabgkHiTiabek7aInaaBaaabaGaeGymaedabeaacqGGPaqkcqaHXoqycqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeiikaGIaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcMcaPiabeg7aHnaaCaaabeqaaiabikdaYaaaaeaacqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeqiWda3aaSbaaeaacqaIYaGmaeqaaaaakiabbccaGiabbwha1jabb6gaUjabbMgaPjabbAgaMjabb+gaVjabbkhaYjabb2gaTbaaaa@685E@

â–ª

Recall that the p-value is the lowest level of significance at which the null hypothesis could have been rejected. We can obtain the adjusted p-value based on the observed p-values of the individual tests

p ' = ( 1 − π 1 ) π 2 ( 1 − β 2 ) p + ( 1 − π 2 ) π 1 ( 1 − β 1 ) p + ( 1 − π 1 ) ( 1 − π 2 ) p 2 1 − π 1 π 2 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiCaaNaei4jaCIaeyypa0tcfa4aaSaaaeaacqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcIcaOiabigdaXiabgkHiTiabek7aInaaBaaabaGaeGOmaidabeaacqGGPaqkcqWGWbaCcqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabikdaYaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIXaqmaeqaaiabcIcaOiabigdaXiabgkHiTiabek7aInaaBaaabaGaeGymaedabeaacqGGPaqkcqWGWbaCcqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeiikaGIaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcMcaPiabdchaWnaaCaaabeqaaiabikdaYaaaaeaacqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeqiWda3aaSbaaeaacqaIYaGmaeqaaaaaaaa@6A2E@
(5)

where p = max(p1, p2).

Theorem 2

The RIUT procedure is universally at least as powerful as the unadjusted IUT, such that p' ≤ p.

Proof:

p ' ≤ ( 1 − π 1 ) π 2 p + ( 1 − π 2 ) π 1 p + ( 1 − π 1 ) ( 1 − π 2 ) p 2 1 − π 1 π 2 ≤ ( 1 − π 1 ) π 2 + ( 1 − π 2 ) π 1 + ( 1 − π 1 ) ( 1 − π 2 ) 1 − π 1 π 2 p ≤ p MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGceaqabeaacqWGWbaCcqGGNaWjcqGHKjYOjuaGdaWcaaqaaiabcIcaOiabigdaXiabgkHiTiabec8aWnaaBaaabaGaeGymaedabeaacqGGPaqkcqaHapaCdaWgaaqaaiabikdaYaqabaGaemiCaaNaey4kaSIaeiikaGIaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcMcaPiabec8aWnaaBaaabaGaeGymaedabeaacqWGWbaCcqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeiykaKIaeiikaGIaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIYaGmaeqaaiabcMcaPiabdchaWnaaCaaabeqaaiabikdaYaaaaeaacqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabigdaXaqabaGaeqiWda3aaSbaaeaacqaIYaGmaeqaaaaaaOqaaiabgsMiJMqbaoaalaaabaGaeiikaGIaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIXaqmaeqaaiabcMcaPiabec8aWnaaBaaabaGaeGOmaidabeaacqGHRaWkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabikdaYaqabaGaeiykaKIaeqiWda3aaSbaaeaacqaIXaqmaeqaaiabgUcaRiabcIcaOiabigdaXiabgkHiTiabec8aWnaaBaaabaGaeGymaedabeaacqGGPaqkcqGGOaakcqaIXaqmcqGHsislcqaHapaCdaWgaaqaaiabikdaYaqabaGaeiykaKcabaGaeGymaeJaeyOeI0IaeqiWda3aaSbaaeaacqaIXaqmaeqaaiabec8aWnaaBaaabaGaeGOmaidabeaaaaGccqWGWbaCaeaacqGHKjYOcqWGWbaCaaaa@8DA4@

â–ª

The above derivation can be visualized in Figure 3, which shows the partition of outcome space (p1, p2) of the two independent tests.

Figure 3
figure 3

Partition of the sample space of the two dimensional outcome space defined by p 1 and p 2 . The shaded two areas are the true non-null areas for the two individual tests respectively. The rejected regions by both tests are highlighted in bold rectangles at the four corners of the outcome space. Only the lower-left corner is the non-null region from both tests and therefore should be rejected by IUT. The other three corners are type 1 error region for IUT because at least one individual test is making type 1 error.

The relationship between α' and α is much more complicated in the IUT than that in the UIT. To apply the above theorem, the nuisance parameters (π1, π2, β1, β2) need to be estimated. Since there are usually thousands of genes available for each statistical test, we can obtain crude and conservative estimates of the parameters according to [14].

π ^ i ( λ ) = # { p i ( j ) < λ } ( 1 − λ ) n i = 1 , 2 , j = 1 , ... , n , and β ^ 1 = β ^ 2 = 0 , MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqbaeqabeqbaaaabaGafqiWdaNbaKaadaWgaaWcbaGaemyAaKgabeaakiabcIcaOiabeU7aSjabcMcaPiabg2da9KqbaoaalaaabaGaei4iamIaei4EaSNaemiCaa3aaSbaaeaacqWGPbqAaeqaaiabcIcaOiabdQgaQjabcMcaPiabgYda8iabeU7aSjabc2ha9bqaaiabcIcaOiabigdaXiabgkHiTiabeU7aSjabcMcaPiabd6gaUbaaaOqaaiabdMgaPjabg2da9iabigdaXiabcYcaSiabikdaYiabcYcaSaqaaiabdQgaQjabg2da9iabigdaXiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiabd6gaUjabcYcaSaqaaiabbggaHjabb6gaUjabbsgaKbqaaiqbek7aIzaajaWaaSbaaSqaaiabigdaXaqabaGccqGH9aqpcuaHYoGygaqcamaaBaaaleaacqaIYaGmaeqaaOGaeyypa0JaeGimaaJaeiilaWcaaaaa@6556@
(6)

where n is the total number of genes, λ is a chosen fixed value at 0.25 and p i (j) represents the observed p-value for the i th test on the j th gene.

References

  1. Wang C, Chelly MR, Chai N, Tan Y, Hui T, Li H: Transcriptomic fingerprinting of bone marrow-derived hepatic beta2m-/Thy-1+ stem cells. Biochem Biophys Res Commun. 2005, 327: 252-260. 10.1016/j.bbrc.2004.11.159.

    Article  CAS  PubMed  Google Scholar 

  2. Guo L, Lobenhofer EK, Wang C, Shippy R, Harris SC, Zhang L: Rat toxicogenomic study reveals analytical consistency across microarray platforms. Nat Biotechnol. 2006, 24: 1162-1169. 10.1038/nbt1238.

    Article  CAS  PubMed  Google Scholar 

  3. Tan Y, Shi L, Hussain SM, Xu J, Tong W, Frazier JM: Integrating time-course microarray gene expression profiles with cytotoxicity for identification of biomarkers in primary rat hepatocytes exposed to cadmium. Bioinformatics. 2006, 22: 77-87. 10.1093/bioinformatics/bti737.

    Article  CAS  PubMed  Google Scholar 

  4. Lim DA, Suarez-Farinas M, Naef F, Hacker CR, Menn B, Takebayashi H: In vivo transcriptional profile analysis reveals RNA splicing and chromatin remodeling as prominent processes for adult neurogenesis. Mol Cell Neurosci. 2006, 31: 131-148. 10.1016/j.mcn.2005.10.005.

    Article  CAS  PubMed  Google Scholar 

  5. Kooperberg C, Aragaki A, Strand AD, Olson JM: Significance testing for small microarray experiments. Stat Med. 2005, 24: 2281-2298. 10.1002/sim.2109.

    Article  PubMed  Google Scholar 

  6. Wang A, Gehan EA: Gene selection for microarray data analysis using principal component analysis. Stat Med. 2005, 24: 2069-2087. 10.1002/sim.2082.

    Article  PubMed  Google Scholar 

  7. Baldi P, Long AD: A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics. 2001, 17: 509-519. 10.1093/bioinformatics/17.6.509.

    Article  CAS  PubMed  Google Scholar 

  8. Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA. 2001, 98: 5116-5121. 10.1073/pnas.091062498.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  9. Berger RL, Hsu JC: Bioequivalence trials, intersection-union tests, and equivalence confidence sets. Statistical Science. 1996, 11: 283-319. 10.1214/ss/1032280304.

    Article  Google Scholar 

  10. Berger RL: Multiparameter hypothesis testing and acceptance sampling. Technometrics. 1982, 24: 295-300. 10.2307/1267823.

    Article  Google Scholar 

  11. Allison DB, Cui X, Page GP, Sabripour M: Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006, 7: 55-65. 10.1038/nrg1749.

    Article  CAS  PubMed  Google Scholar 

  12. Westfall PH, Young SS: Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. 1993, New York: Wiley-Interscience

    Google Scholar 

  13. Miller RGJ: Simultaneous Statistical Inference. 1991, New York: Springer-Verlag

    Google Scholar 

  14. Storey J: A direct approach to false discovery rates. J R Statist Soc B. 2002, 64: 479-498. 10.1111/1467-9868.00346.

    Article  Google Scholar 

  15. Fisher RA: Statistical Methods for Research Workers. 2007, Edinburgh: Oliver and Boyd

    Google Scholar 

  16. Hedges LV, Olkin I: Statistical Methods for Meta-analysis. 1985, San Diego, California, USA: Academic Press Inc, Harcourt Brace Jovanovich Publishers

    Google Scholar 

  17. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC: The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol. 2006, 24: 1151-1161. 10.1038/nbt1239.

    Article  CAS  PubMed  Google Scholar 

  18. Tan Y, Shi L, Tong W, Wang C: Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data 5. Nucleic Acids Res. 2005, 33: 56-65. 10.1093/nar/gki144.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  19. Kanehisa M, Goto S, Hattori M, oki-Kinoshita KF, Itoh M, Kawashima S: From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006, 34: D354-D357. 10.1093/nar/gkj102.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  20. Kanehisa M, Goto S: KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28: 27-30. 10.1093/nar/28.1.27.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  21. Kanehisa M: A database for post-genome analysis. Trends Genet. 1997, 13: 375-376. 10.1016/S0168-9525(97)01223-7.

    Article  CAS  PubMed  Google Scholar 

  22. Dennis G, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA: DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003, 4 (5): P3-10.1186/gb-2003-4-5-p3.

    Article  PubMed  Google Scholar 

  23. Bozcaarmutlu A, Arinc E: Effect of mercury, cadmium, nickel, chromium and zinc on kinetic properties of NADPH-cytochrome P450 reductase purified from leaping mullet (Liza saliens). Toxicol In Vitro. 2007, 21: 408-416. 10.1016/j.tiv.2006.10.002.

    Article  CAS  PubMed  Google Scholar 

  24. Plewka A, Plewka D, Nowaczyk G, Brzoska MM, Kaminski M, Moniuszko-Jakoniuk J: Effects of chronic exposure to cadmium on renal cytochrome P450-dependent monooxygenase system in rats. Arch Toxicol. 2004, 78: 194-200. 10.1007/s00204-003-0529-9.

    Article  CAS  PubMed  Google Scholar 

  25. Zaccaro MC, Salazar C, Zulpa dC, Storni dC, Stella AM: Lead toxicity in cyanobacterial porphyrin metabolism. Environ Toxicol. 2001, 16: 61-67. 10.1002/1522-7278(2001)16:1<61::AID-TOX70>3.0.CO;2-L.

    Article  CAS  PubMed  Google Scholar 

  26. Komatsu M, Furukawa T, Ikeda R, Takumi S, Nong Q, Aoyama K: Involvement of mitogen-activated protein kinase signaling pathways in microcystin-LR-induced apoptosis after its selective uptake mediated by OATP1B1 and OATP1B3. Toxicol Sci. 2007, 97: 407-416. 10.1093/toxsci/kfm054.

    Article  CAS  PubMed  Google Scholar 

  27. Efron B, Tibshirani R: An introduction to the bootstrap. 1993, New York: Chapman&Hall

    Book  Google Scholar 

  28. Dudoit S, Yang Y, Matthew J, Speed TP: Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. 2000

    Google Scholar 

Download references

Acknowledgements

This article has been published as part of BMC Bioinformatics Volume 9 Supplement 6, 2008: Symposium of Computations in Bioinformatics and Bioscience (SCBB07). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/9?issue=S6.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Charles Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

XD and CW conceived of the study. XD drafted the manuscript and implemented the algorithm and performed most of the analysis. JX helped in data acquisition and results interpretation. All authors revised and approved the paper.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article is distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Deng, X., Xu, J. & Wang, C. Improving the power for detecting overlapping genes from multiple DNA microarray-derived gene lists. BMC Bioinformatics 9 (Suppl 6), S14 (2008). https://doi.org/10.1186/1471-2105-9-S6-S14

Download citation

  • Published:

  • DOI: https://doi.org/10.1186/1471-2105-9-S6-S14

Keywords