This article is part of the supplement: Second Annual MidSouth Computational Biology and Bioinformatics Society Conference. Bioinformatics: a systems approach

Open Access Proceedings

Bioinformatics approaches for cross-species liver cancer analysis based on microarray gene expression profiling

H Fang1, W Tong2*, R Perkins1, L Shi2, H Hong1, X Cao1, Q Xie1, SH Yim3, JM Ward4, HC Pitot5 and YP Dragan2

Author Affiliations

1 Division of Bioinformatics, Z-Tech Corporation, 3900 NCTR Road, Jefferson, AR 72079

2 Division of Systems Toxicology, National Center for Toxicological Research (NCTR), FDA, 3900 NCTR Road, Jefferson, AR 72079

3 Laboratory of Metabolism, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland 20892

4 Veterinary and Tumor Pathology Section, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702

5 McArdle Laboratory for Cancer Research, University of Wisconsin, Madison, WI 53706

BMC Bioinformatics 2005, 6(Suppl 2):S6  doi:10.1186/1471-2105-6-S2-S6

Published: 15 July 2005

© 2006 Fang et al; licensee BioMed Central Ltd.

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



The completion of the sequencing of the human, mouse and rat genomes and knowledge of cross-species gene homologies enable studies of differential gene expression in animal models. These types of studies have the potential to greatly enhance our understanding of diseases such as liver cancer in humans. Genes co-expressed across multiple species are most likely to have conserved functions. We have used various bioinformatics approaches to examine microarray expression profiles from liver neoplasms that arise in albumin-SV40 transgenic rats to elucidate genes, chromosome aberrations and pathways that might be associated with human liver cancer.


In this study, we first identified 2223 differentially expressed genes by comparing gene expression profiles for two control, two adenoma and two carcinoma samples using an F-test. These genes were subsequently mapped to the rat chromosomes using a novel visualization tool, the Chromosome Plot. Using the same plot, we further mapped the significant genes to orthologous chromosomal locations in human and mouse. Many genes expressed in rat 1q that are amplified in rat liver cancer map to the human chromosomes 10, 11 and 19 and to the mouse chromosomes 7, 17 and 19, which have been implicated in studies of human and mouse liver cancer. Using Comparative Genomics Microarray Analysis (CGMA), we identified regions of potential aberrations in human. Lastly, a pathway analysis was conducted to predict altered human pathways based on statistical analysis and extrapolation from the rat data. All of the identified pathways have been known to be important in the etiology of human liver cancer, including cell cycle control, cell growth and differentiation, apoptosis, transcriptional regulation, and protein metabolism.


The study demonstrates that the hepatic gene expression profiles from the albumin-SV40 transgenic rat model revealed genes, pathways and chromosome alterations consistent with experimental and clinical research in human liver cancer. The bioinformatics tools presented in this paper are essential for cross species extrapolation and mapping of microarray data, its analysis and interpretation.


For decades, classical toxicology has used risk assessments based on animal studies for regulatory decisions. The underlying assumption is that important biological functions are often conserved across species. In continuation of this paradigm, the effort in toxicogenomics is placed on studying rodents and other surrogates using advanced genomics technologies, such as DNA microarrays. Microarray studies enable simultaneous measurement of the expression of large numbers of genes. Given the completion of the DNA sequence of the human, mouse and rat genomes [1-3], genes identified in microarray studies can be readily compared across species with respect to the gene orthologs [4,5]. This assumes that genes co-expressed across multiple species are likely to have conserved functions [6-8]. Thus, microarray analysis offers the possibility of furthering our understanding of cross-species commonalities and differences that could lead to more effective use of animal models to understand the cause and progression of diseases in human at the mechanistic level.

Hepatocellular carcinoma (HCC) is a leading cause of death worldwide and, like most cancers, is a genetic disease caused by the accumulation of genetic and epigenetic cell alterations. The progression of hepatic neoplasia is characterized by increasing genetic instability, including duplication and deletion of parts of chromosomes and an increasing proliferative growth advantage of the affected cells. Molecular cytogenetic techniques, such as Comparative genomic hybridization (CGH) and Spectral karyotyping (SKY) [9-11], have allowed evaluation of chromosomal aberrations in HCC. More recently, Crawley [12] has demonstrated the ability of comparative genomic microarray analysis (CGMA) to elucidate alteration of specific genes together with the genetic changes at the chromosome level based on microarray data. Thus, microarray analysis provides an unprecedented opportunity to further the understanding of the etiology and progression of liver cancer.

Bioinformatics methods and tools are essential to analyze and interpret data from microarrays. The critical and urgent task is to associate altered patterns of gene expression with disease. Interpreting microarray data in the context of signaling and regulatory pathways is a particularly effective bioinformatics approach to transform data into biological meaning and to generate hypotheses for further research. Using pathways, disease mechanisms can be interpreted as disturbances of the intricate interconnections among genes, molecules and cells. Most reported pathway analysis of microarray data has examined the role of differentially expressed genes in pathways selected with a priori knowledge. Alternatively, significant pathways can be identified based on statistical analysis, potentially leading to new discoveries and a more complete interpretation of microarray data in the context of biological processes at the mechanistic level.

The primary means of studying HCC experimentally is the administration of carcinogenic agents. A number of model systems have been developed to understand the pathogenesis of primary liver cancer [13-15]. Additionally, the development of transgenic models permits the analysis of the genetic basis for the induction and progression of HCC [16-19]. The albumin-Simian virus 40 (SV40) T antigen transgenic rat contains the mouse albumin promoter/enhancer linked to the coding region of the SV40 large T antigen (SV40 Tag). SV40 T antigen inactivates both p53 and Rb, resulting in spontaneous development of hepatic neoplasms (adenoma and carcinoma) within 6–9 months. Thus, the albumin-SV40 T antigen transgenic rat can be used to examine liver cancer development and maintenance [20-22].

In this manuscript, we describe a bioinformatics process in which microarray data from the SV40 transgenic rat were examined for application to the study of HCC in human. We first used a novel visualization tool to investigate liver cancer by mapping the chromosomal location of differentially expressed genes from the rat model to the chromosomal regions of human orthologs. Then, CGMA was used to relate gene expression bias patterns to cytogenetic aberration profiles in human. Lastly, a statistical approach was used to identify several pathways involved in human HCC based on the rat microarray data. The pathway analysis reveals the expected involvement of apoptosis, cell cycle, growth and differentiation, genetic stability and methionine metabolism, all of which are important for cancer development, maintenance and progression. The results indicate that the gene expression profiles of the transgenic rat model may be useful in the study of human liver cancer.


Microarray experiment and results

The details of the microarray experimental procedure are reported elsewhere [21]. Briefly, RNA samples were isolated from six rat liver tissue samples: two controls, two adenomas and two carcinomas. The laser capture microdissected samples were amplified prior to microarray hybridization. An NCI cDNA array (IncyteGem2) was used that contains 10238 probes representing 9984 unique genes. Gene expression profiles were produced for all six samples with dye flip, which resulted in a total of 12 arrays.

A log2 ratio-based mean global normalization was first applied, and the normalized ratios of the swapped dye labels were then averaged. A total of 9150 genes remained for further analysis after removing genes that did not hybridize, as judged by low intensity. Significantly differentially expressed genes were determined using an F-test with P < 0.05.
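The preprocessing and test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ArrayTrack implementation: it assumes that mean global normalization centers each array's log2 ratios on their mean, that a dye-flipped array reports the reciprocal ratio (so its log2 values change sign), and that the F-test is a one-way ANOVA across the three sample groups. All function names are hypothetical.

```python
from math import log2
from statistics import mean


def normalize_array(ratios):
    """Center one array's log2 ratios on their mean (assumed form of
    mean global normalization)."""
    logs = [log2(r) for r in ratios]
    center = mean(logs)
    return [x - center for x in logs]


def average_dye_swap(forward, flipped):
    """Average a normalized array with its dye-flipped partner.

    Assumption: the flipped array measures the reciprocal ratio, so its
    log2 values are negated before averaging.
    """
    return [(f - r) / 2 for f, r in zip(forward, flipped)]


def one_way_f_test(groups):
    """One-way ANOVA F statistic and p-value for a single gene.

    `groups` holds one list of log2 ratios per sample group. The
    closed-form p-value below is valid only for exactly three groups
    (numerator degrees of freedom = 2).
    """
    if len(groups) != 3:
        raise ValueError("closed-form p-value requires exactly three groups")
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if ss_within == 0:
        # Degenerate case: no within-group variation.
        return (float("inf"), 0.0) if ss_between > 0 else (0.0, 1.0)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Survival function of F(2, d): (d / (d + 2 f)) ** (d / 2)
    p_value = (df_within / (df_within + 2 * f_stat)) ** (df_within / 2)
    return f_stat, p_value


# Toy values for one gene (not data from the study).
control, adenoma, carcinoma = [0.0, 0.2], [1.0, 1.2], [2.0, 2.2]
f_stat, p_value = one_way_f_test([control, adenoma, carcinoma])
is_significant = p_value < 0.05
```

With two samples per group, the within-group degrees of freedom are only three, so the test has limited power for any single gene.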

Data analysis using ArrayTrack

Most analyses were conducted using in-house software, ArrayTrack. ArrayTrack is bioinformatics software in which data management, analysis and interpretation are fully integrated [37]. ArrayTrack consists of three components: (1) MicroarrayDB for storing microarray data; (2) LIBRARY for data interpretation, which contains many types of functional information about genes, proteins and pathways; and (3) TOOL, which provides functionality for microarray data analysis. LIBRARY contains many sub-libraries, and the data in these sub-libraries are extracted from different biological databases in the public domain (e.g., NCBI bioinformatics resources) [38]. In this project, information for orthology analysis, chromosome-based analysis and pathway analysis was retrieved from LIBRARY. More specifically:

• Gene Orthology Analysis – The human and mouse orthologs to rat were obtained from the Orthologene Library in ArrayTrack. The content of the Orthologene Library is mainly derived from the NCBI HomoloGene database [4,5]. ArrayTrack allows rapid matching of a large number of genes across human, mouse and rat for the gene orthology analysis.

• Chromosome-based analysis – The cytogenetic locations of genes were exported directly from the Gene Library of ArrayTrack. A novel visualization tool, the Chromosome Plot, was developed to study the effect of a gene expression pattern on liver cancer by identifying the altered cytogenetic regions of each chromosome. Figure 1 shows a bar chart in which the 20 rat chromosomes are drawn as vertical bars arranged along the x-axis and the y-axis gives the cytogenetic location along each chromosome. This kind of plot has two uses. First, it depicts rat genes at their cytogenetic locations on each chromosome, with expression color coded as red for up-regulated, green for down-regulated and grey for unaffected genes (e.g., Figure 1). The plot thus provides, for a specific species, a compact visual display of the cytogenetic blocks and/or chromosomes altered. Second, the genes can be mapped to their chromosomal locations in another species and color coded according to the chromosome of the experimental species (e.g., Figure 2 for rat mapping to mouse and Figure 3 for rat mapping to human).

Figure 1. Expressed genes in microarray mapped to the rat cytogenetic location and chromosome. The genes were obtained from an ANOVA analysis among two control, two adenoma and two carcinoma samples of the transgenic rat. The cytogenetic location of genes is on the y-axis for each of the 20 rat chromosomes that are displayed as separate bars along the x-axis. Red and green areas are the significant genes that are up or down regulated, respectively, and grey represents those genes not differentially expressed.

Figure 2. Genes significantly differentially expressed in rat (shown in Figure 1) mapped to the orthologous genes on chromosomes of mouse. Different colors denote the corresponding rat chromosome number of the orthologous genes.

Figure 3. Genes significantly differentially expressed in rat (shown in Figure 1) mapped to the orthologous genes on chromosomes of human. Different colors denote the corresponding rat chromosome number of the orthologous genes.

• Pathway analysis – The pathway data were obtained from the Pathway Library in ArrayTrack. The Pathway Library contains pathways from both the Kyoto Encyclopedia of Genes and Genomes (KEGG) [35] and PathArt (Jubilant Biosys Ltd, Columbia, MD 21045) that can be searched separately or in combination in ArrayTrack. The Fisher Exact Test [39] was used to estimate the statistical significance of pathway i, based on the following quantities.

N is the total number of genes on the chip (i.e., 9150), m is the number of differentially expressed genes identified using the F-test (i.e., 2223), n_i is the number of genes out of N that belong to pathway i, and m_i is the number of genes out of the m differentially expressed genes that belong to pathway i. A two-sided Fisher's Exact Test p-value of less than 0.05 suggests that the number of significant genes in the pathway is unlikely to have arisen by chance alone.
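As an illustration of how such a test can be computed from the four quantities defined above, the sketch below uses the hypergeometric point probability C(n_i, k) C(N - n_i, m - k) / C(N, m) and sums it over all outcomes that are no more probable than the observed one. That two-sided convention is an assumption, since the text does not state which one was used, and the function names and the example pathway are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, n, m):
    """Probability that exactly k of the m selected genes belong to a
    pathway of size n, on a chip with N genes."""
    if k < max(0, m - (N - n)) or k > min(n, m):
        return 0.0
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)


def fisher_two_sided(N, m, n_i, m_i):
    """Two-sided Fisher's exact p-value for pathway i.

    Sums the probabilities of all possible overlaps that are no more
    probable than the observed overlap m_i (assumed convention).
    """
    p_observed = hypergeom_pmf(m_i, N, n_i, m)
    lowest = max(0, m - (N - n_i))
    highest = min(n_i, m)
    total = 0.0
    for k in range(lowest, highest + 1):
        p = hypergeom_pmf(k, N, n_i, m)
        # Small relative tolerance guards against floating-point ties.
        if p <= p_observed * (1 + 1e-7):
            total += p
    return min(1.0, total)


# Chip and gene-list sizes from the text; the pathway counts are
# hypothetical (40 genes on the chip, 18 of them differentially expressed).
p_value = fisher_two_sided(N=9150, m=2223, n_i=40, m_i=18)
pathway_is_significant = p_value < 0.05
```

Exact integer binomials keep the calculation simple at this scale; for much larger gene sets, log-gamma arithmetic would be a more economical choice.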

Comparative Genomic Microarray Analysis (CGMA)

CGMA identifies cytogenetic regions containing unidirectional gene expression biases. The biased regions possibly indicate chromosomal gains and losses [12]. Of the total 9150 genes, GenBank accession numbers (RefSeq in NCBI) for human orthologs to rodent genes were obtained for 2925 genes out of 3414 genes with a HomoloGene ID using the Orthologene Library and Gene Library in ArrayTrack; ESTs and genes that may be unique to rodent were excluded. A two-tailed z-statistic was then computed to test whether chromosomal regions exhibited gene-expression biases [12]. CGMA was done for each of the two adenoma samples and each of the two carcinoma samples using an online version of the software. The human CGMA output contained 2728 genes from an input of 2925 RefSeq genes. A Z-statistic of 1.96 corresponds to 95% confidence (that the expression bias in the chromosome was not due to chance) and 2.58 corresponds to 99% confidence.
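To make the idea of a regional expression bias concrete, the sketch below computes a z-statistic from the counts of up- and down-regulated genes in one chromosomal region, using the normal approximation to a sign test. This particular statistic is an assumption made for illustration; the exact computation used by the CGMA software is described in [12]. Only the 1.96 and 2.58 cut-offs are taken from the text, and the function names are hypothetical.

```python
from math import erfc, sqrt


def expression_bias_z(n_up, n_down):
    """Sign-test z-statistic for one region (assumed form).

    Positive values indicate a bias toward up-regulation, negative
    values a bias toward down-regulation.
    """
    n = n_up + n_down
    if n == 0:
        return 0.0
    return (n_up - n_down) / sqrt(n)


def two_tailed_p(z):
    """Two-tailed p-value of a standard normal z-statistic."""
    return erfc(abs(z) / sqrt(2))


def confidence_label(z):
    """Classify a z-statistic with the cut-offs quoted in the text."""
    if abs(z) >= 2.58:
        return "99%"
    if abs(z) >= 1.96:
        return "95%"
    return "not significant"


# Hypothetical region with 30 up-regulated and 10 down-regulated genes.
z = expression_bias_z(30, 10)
label = confidence_label(z)
```

A region with balanced counts gives z = 0, so only a consistent direction of change across many genes in the region produces a large statistic.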


A total of 2223 differentially expressed genes was identified across three groups of samples (i.e., normal, adenoma and carcinoma) based on an F-test. The differentially expressed genes were first mapped to the rat chromosome. As depicted in Figure 1, the differentially expressed genes primarily occurred in several chromosomes, indicating that these chromosomes were altered in rats with neoplasm compared to normal rats. Specifically, a large number of up-regulated genes mapping in the rat chromosome 1q is consistent with previous findings of high amplification in rat liver cancer [22].
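The data preparation behind a plot of this kind can be sketched as follows. This is not the Chromosome Plot tool itself; it only groups genes by chromosome, orders them by position and assigns the red, green and grey coding described for Figure 1. The record fields (`symbol`, `chrom`, `position`, `log2_fold`, `significant`) and the gene names are hypothetical.

```python
from collections import defaultdict


def color_for(gene):
    """Red for up-regulated, green for down-regulated, grey otherwise."""
    if not gene["significant"]:
        return "grey"
    return "red" if gene["log2_fold"] > 0 else "green"


def chromosome_plot_data(genes):
    """Return {chromosome: [(position, color, symbol), ...]} with the
    genes of each chromosome sorted by position."""
    bars = defaultdict(list)
    for gene in genes:
        entry = (gene["position"], color_for(gene), gene["symbol"])
        bars[gene["chrom"]].append(entry)
    return {chrom: sorted(entries) for chrom, entries in bars.items()}


# Toy records, not data from the study.
genes = [
    {"symbol": "geneA", "chrom": "1", "position": 220.0,
     "log2_fold": 2.1, "significant": True},
    {"symbol": "geneB", "chrom": "1", "position": 95.0,
     "log2_fold": -1.4, "significant": True},
    {"symbol": "geneC", "chrom": "7", "position": 10.0,
     "log2_fold": 0.1, "significant": False},
]
bars = chromosome_plot_data(genes)
```

Each chromosome's list can then be drawn as one vertical bar, with one colored tick per gene.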

To investigate the cross-species extrapolation based on the results from the transgenic rat, the differentially expressed genes were first mapped to the orthologous chromosomal location of the mouse chromosomes. As depicted in Figure 2, the majority of the differentially expressed genes from the rat 1q that are known to be important for the rat liver cancer development appear mainly on the mouse chromosomes 7, 17, and 19 (displayed as the orange band in Figure 2). A comparison of rat to human shows that the differentially expressed genes from the rat 1q appear primarily on human chromosomes 10, 11 and 19 (Figure 3). The results suggest that the mouse chromosomes (7, 17 and 19) and human chromosomes (10, 11 and 19) might be important in liver cancer for these species. The findings are supported by a number of reports [12,23,24].
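The cross-species mapping step can be illustrated with two lookup tables, one from rat genes to their orthologs and one from orthologs to chromosomes in the target species. The sketch below is an assumption about the general procedure rather than a description of ArrayTrack; the tables, gene names and function names are hypothetical toy values.

```python
from collections import Counter


def map_to_orthologs(rat_genes, orthologs, target_locations):
    """Return (rat gene, ortholog, target chromosome) for every rat gene
    that has both an ortholog and a known location in the target species."""
    mapped = []
    for gene in rat_genes:
        ortholog = orthologs.get(gene)
        if ortholog is None:
            continue  # no ortholog recorded for this gene
        chrom = target_locations.get(ortholog)
        if chrom is None:
            continue  # ortholog has no recorded chromosome
        mapped.append((gene, ortholog, chrom))
    return mapped


def chromosome_counts(mapped):
    """Count how many mapped genes land on each target chromosome."""
    return Counter(chrom for _, _, chrom in mapped)


# Toy tables, not data from the study.
rat_1q_genes = ["ratA", "ratB", "ratC", "ratD"]
rat_to_human = {"ratA": "HUMA", "ratB": "HUMB", "ratC": "HUMC"}
human_chromosome = {"HUMA": "11", "HUMB": "11", "HUMC": "19"}

mapped = map_to_orthologs(rat_1q_genes, rat_to_human, human_chromosome)
counts = chromosome_counts(mapped)
```

Tallying the target chromosomes in this way shows at a glance where the genes of a rat region are concentrated in the other species.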

Table 1 lists the cytogenetic location of the differentially expressed genes from the rat 1q and location of the orthologous gene in human and mouse. There are seven groups of significantly expressed genes (called gene blocks); genes in each group are consecutive to each other and across species. The genes in the same blocks could be coordinately expressed to perform similar transcriptional programs or physiological processes across species in liver cancer development and maintenance. For example, the human gene blocks 10q24-26, 11p15.5, 11q13-15 and 19q13.2 have corresponding blocks on rat 1q, and corresponding blocks on mouse chromosomes 7 and 19. These blocks are associated with several cancer-related processes and functions, including apoptosis, M phase, cell communication and nuclear division as seen in a statistical analysis based on Gene Ontology (results not shown).

Table 1. Seven blocks of the significant genes from the rat 1q conserved across rat, mouse and human. The log2-transformed expression of average fold change (average over four tumor samples) for each gene is given in column two with direction up or down indicated by the sign, where genes with average fold change greater than an arbitrary +1.87 or less than -1.87 are highlighted.

To further confirm the validity of cross-species extrapolation, we investigated chromosomal aberration in human based on the differentially expressed genes from the rat model using CGMA. Table 2 summarizes the Z statistics for each cancer sample from the CGMA analysis. For chromosomes exhibiting unidirectional bias with at least 95% confidence, the table cells contain a positive value denoting up-regulation or a negative value denoting down-regulation. Of 46 chromosomal regions (23 p and 23 q arms), 15 exhibit unidirectional bias in gene expression. Of the 15 affected chromosomal regions, 14 show up-regulation and most of these are associated with adenoma. The CGMA results were further compared with karyotype results in the Cancer Genome Anatomy Project (CGAP) in the Mitelman Database [25]. Of the 15 affected chromosomal regions identified from the rat gene expression data, 10 regions are also reported in CGAP. This is shown in the last column of Table 2, which lists both the number of citations and the number of patients in CGAP.

Table 2. Summary of the Z statistics for human chromosomes for each test sample from the CGMA analysis. A Z-statistic of 1.96 corresponds to α = 0.05, or 95% confidence that the expression bias in the chromosome was not due to chance. A Z-statistic of 2.58 corresponds to 99% confidence. Chromosomes exhibiting unidirectional bias with at least 95% confidence have a positive sign denoting up-regulation or a negative sign denoting down-regulation.

We also investigated which pathways in human were significantly affected based on the differentially expressed genes identified in the transgenic rat model. Pathway analysis is a particularly effective way to examine how the findings in the rat model relate to human in the context of biological functions. Table 3 summarizes the results of the pathway analysis. Fifteen pathways were significantly altered in a Fisher's Exact Test with p < 0.05. They predominantly involve cell cycle, cell growth and differentiation. Most identified pathways are confirmed by a large literature to be associated with many cancer types [26,27]. Examples are 1) the p53 pathway involved in response to DNA damage, 2) the Rb pathway involved in the control of cell cycle, and 3) the transforming growth factor-beta (TGF-beta) pathway involved in growth inhibition. In addition, the altered methionine metabolism pathway and regulation of p27 during cell cycle progression are known to be critical for cancer progression [28].

Table 3. Pathway analysis of 2223 significant genes using a Fisher's Exact Test identified the listed pathways (p < 0.05) that might be related to human liver cancer.


This study investigates the implications of using microarray results from the albumin-SV40 transgenic rat for the study of human liver cancer. We demonstrated the importance of bioinformatics to interpret microarray data for the cross-species comparisons. Specifically, two in-house bioinformatics tools are of importance for the analysis, the Chromosome Plot and ArrayTrack. The Chromosome Plot not only provides a visual presentation of the gene expression pattern at the level of gene order across chromosomes (e.g., Figure 1), but also can be used to map chromosome and cytogenetic location of differentially expressed genes from one species to another (e.g., Figures 2 and 3). ArrayTrack software that integrates data from public repositories was used to identify the cross-species orthologous genes, their chromosomal locations and, most importantly, the pathways that may be related to liver cancer. In addition, CGMA analysis was performed to investigate the variability of the multiple chromosome aberration patterns based on gene expression data, which is compared with the results presented in CGAP.

Implications of orthologs and chromosome-based analysis

The recently completed sequencing of the rat genome provides a basis for future research to elucidate how differences and commonalities affect the ability of rat models to predict human disease. The rat genome project reported that almost all human genes known to be associated with disease have orthologous genes in the rat genome, and that human, mouse and rat genomes are approximately 90% orthologous [1]. We also analyzed orthologous genes between human, rat and mouse among the 9150 genes on the chip using ArrayTrack. The chip was found in Orthologene Library in ArrayTrack to contain 3414 human, 3365 mouse and 1950 rat genes, with the rest of genes being either EST tags or Riken genes (about 1500). The results showed that 92% of human genes are orthologous to either rat or mouse.

Although a large number of genes were identified as differentially expressed in the rat model, some of these genes may be altered as a result of the cancer rather than being causally related to it. In addition, the function of a specific gene and its involvement in cancer might not be conserved across species. Thus, as important as structural and functional homology of specific genes is, the conservation of function of blocks of genes is likely to be more important in cross-species comparison. We found seven distinct blocks of significantly differentially expressed genes within different cytogenetic regions of the rat 1q with homologous chromosomal segments in human and mouse (Table 1). Human, mouse and rat have very different chromosomal arrangements; nevertheless, the genes in these blocks appear consecutively in contiguous cytogenetic regions, irrespective of species and chromosomal location. This finding is not surprising considering the close evolutionary distance between the species, where 278 orthologous segments are reported to be shared between human and rat, and 280 segments are reported to be shared between human and mouse [1].

It is proposed that these seven blocks of genes may be of significance for liver cancer development, maintenance and progression across human, rat, and mouse. For example, genes in the blocks may be coordinately expressed to share transcription programs or to respond to the genomic instability observed in liver cancer. Several genes in Table 1 show large fold changes, and are implicated in cancer development and maintenance. For example, Rps16, Rps19 and Rps3 code for ribosomal proteins, and their altered expression has been associated with liver and other tumors [29,30]. Insulin-like growth factor 2 receptor (Igf2r) is mutated in many human HCC tumors and the gene's haploid insufficiency has been suggested as an early event in human hepatocarcinogenesis [31]. Cyclin D1 (Ccnd1) and cyclin-dependent kinase inhibitor 1C (Cdkn1c, also known as p57) are critical for the cell cycle, including G1 progression and the G1/S transition. Cyclin D1 has been shown to be amplified in 10 to 20% of HCCs.

Implication of CGMA analysis

Chromosomal aberrations are common in cancers, particularly in advanced stages. CGH has been employed to determine gross DNA gains and losses at chromosomal and sub-chromosomal levels [10]. CGH, however, is time-consuming, and lacks the resolution and sensitivity to detect changes at the gene level; for example, CGH is unable to detect copy number changes within narrow regions of chromosomes (alterations of <1 Mb). It fails to identify putative tumor-suppressor genes or oncogenes [32]. These limitations might be overcome by using CGMA [12]. CGMA identifies cytogenetic regions containing unidirectional gene expression biases. Such region-dependent expression change may be the result of allelic imbalances commonly found in liver and other cancers. Evidence shows that, across DNA copy number alterations (deletion and low-, mid- and high-level amplification), an average 2-fold change in DNA copy number corresponds to about a 1.5-fold change in mRNA level [33]. Therefore, CGMA results based on microarray data measuring mRNA levels could be related to changes at the DNA level.

Using CGMA, we identified 15 out of 46 (23 p and 23 q arms) human chromosomal regions that could be involved in liver cancer development, maintenance and progression. These chromosomal aberrations are consistent with the karyotype results reported in CGAP for 10 out of the 15 chromosome regions (Table 2). Although the CGAP database (Table 2) cites no evidence of involvement of chromosome 19 in human liver cancer, we found that genes in both chromosome 19q and 19p are significantly down-regulated for three out of four tumor samples. In addition, there is also a block of genes in 19q that corresponds to rat 1q (Table 1), while the Chromosome Plot shows that a large number of differentially expressed genes also occur in 19p (figure not shown). Analysis of both human 19p and 19q suggests the possible relevance of the chromosome in human liver cancer. The genes significantly altered in the rat microarray that correspond to human 19p13 are JunB, Rab8a (Mel), Tnfsf9 and Dnmt1. Further investigation is required to confirm their relevance to human HCC. Compared with the findings of Crawley et al [12], who carried out CGMA with human HCC gene expression arrays, we predicted five (12q, 16, 17q, 19p and 20q) of the eight chromosomal gains they reported. Both our analysis and that of Crawley suggest the importance of 19p in human liver cancer, a region of aberration not previously discovered with CGH analysis.

Implications of pathway analysis

Pathways are the best vehicle for interpretation of biological functions of genes. An important goal of modern biology is to identify the interaction and regulatory networks among biological molecules. A logical approach is to analyze the gene expression changes in the context of known biological pathways [34,35]. A number of human pathways were found to be significantly altered using the Fisher's Exact Test by comparing the number of genes with altered expression in a pathway to the number of genes on the chip in the same pathway.

We inferred which pathways are involved in human liver cancer from the differentially expressed genes in the transgenic rat liver cancer model. It is important to point out that the statistically significant pathways identified in this process were solely based on the analysis of microarray data together with the Orthologene and Pathway Libraries in ArrayTrack, and thus required no a priori knowledge regarding cancer genes and the pathways that they control. The results of the pathway analysis given in Table 3 include those involved in apoptosis, cell cycle, cell growth and differentiation and others that are significant in liver cancer.

Most of the altered pathways are involved in cell cycle regulation. Cell cycle regulation is accomplished by coordinating the activity of cyclin-dependent kinases, checkpoint controls, and DNA repair pathways, which, when perturbed, lead to uncontrolled cell growth [27]. Not surprisingly, our results indicate that the p53 and Rb signaling pathways, as well as cyclin-mediated pathways including the G1-S checkpoint, are altered. Both p53 and Rb are tumor-suppressor genes, and their products are transcription factors that respond to a variety of stress signals and are often associated with the progression of neoplastic diseases. The transgenic model implies that both the p53 and Rb signaling pathways could be disrupted in human liver cancer. Without p53 and Rb, cell cycle arrest and/or programmed cell death (apoptosis) are inhibited, leading to accumulation of mutations and genetic instability. Since p53 is deactivated in this transgenic model, we also observed that the pathway that influences Ras and Rho proteins during the G1 to S transition is altered. Ras is a proto-oncogene that is involved in multiple signal transduction pathways transmitting pro-proliferative signals to the nucleus, while Rho proteins are members of the extended Ras family that modulate gene expression, cell cycle progression, and cell proliferation and survival.

Categories of altered pathways are associated with growth and differentiation. Genes in the ECM (extracellular matrix) and integrin-mediated signaling pathway have been reported to be over-expressed in human HCC, though the mechanism is not fully understood [36]. In addition, an excess of TGF-beta is thought to overwhelm the cell in two ways. First, it promotes the overgrowth of blood vessels. Second, excess TGF-beta suppresses T cells and other components of the immune system that would normally attack aberrant cells.

The human-relevant liver cancer pathways identified from the SV40 transgenic rat liver model are confirmed by reports on human liver cancer models. Therefore, the pathway analysis using Fisher's Exact Test is novel and efficient.


We presented several bioinformatics approaches to extrapolate microarray data involving rat liver cancer to the human. Microarrays have been widely used in many fields of medical and biological research. The current challenge for microarray bioinformatics is no longer to identify a list of differentially expressed genes, but to develop effective bioinformatics processes and tools for data interpretation and knowledge discovery. In this study, we first developed a Chromosome Plot that provides a compact visual summary of gene expression data at the level of chromosomal location for identification of altered chromosomal regions. This tool facilitates cross-species comparison. The information available in ArrayTrack on gene ontology, gene orthologs and gene pathways was then used to interpret the microarray data. Finally, the CGMA bioinformatics tool was used to infer HCC chromosomal aberrations in the human based on microarray data from rat. The important lesson of this study is how to limit the information using bioinformatics resources and statistical means to present an unbiased (or statistical) view to interpret microarray results with respect to genes, pathways, chromosomes and functions. Based on a thorough bioinformatics analysis, we found that the albumin-SV40 transgenic rat is a useful animal model for prediction of human liver cancer.

Authors' contributions

HF did all calculations and analysis using bioinformatics tools, and wrote the first draft manuscript. WT guided the analysis and helped draft the manuscript. YD had the original idea for the liver cancer and liver toxicity study, provided original tissue, and designed the microarray analysis experiment; provided insightful information concerning liver cancer. HCP provided discussion and the transgenic rats. SHY provided the microarray data. JMY coordinated the collaboration and microarray experiment. WT, HH, QX, LS, XC and RP were involved in the discussion of the data analysis. RP and LS assisted with writing the manuscript. All authors read and approved the final manuscript.


We are grateful to Dr. K. A. Furge for help on CGMA analysis and to Brett Thorn for help on Fisher's Exact Test.

