Open Access Highly Accessed Research article

Transcription profiles of non-immortalized breast cancer cell lines

Mariana Fernandez-Cobo1, James F Holland1 and Beatriz GT Pogo1,2*

Author Affiliations

1 Department of Medicine, Mount Sinai School of Medicine, New York, New York, USA

2 Department of Microbiology, Mount Sinai School of Medicine, New York, New York, USA

BMC Cancer 2006, 6:99  doi:10.1186/1471-2407-6-99

The electronic version of this article is the complete one and can be found online at:

Received: 16 May 2005
Accepted: 20 April 2006
Published: 20 April 2006

© 2006 Fernandez-Cobo et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



Abstract

Background

Searches for differentially expressed genes in tumours have made extensive use of array technology. Most samples have been obtained from tumour biopsies or from established tumour-derived cell lines. Here we compare cultures of non-immortalized breast cancer cells, normal non-immortalized breast cells and immortalized normal and breast cancer cells to identify which elements of a defined set of well-known cancer-related genes are differentially expressed.


Methods

Cultures of cells from pleural effusions or ascitic fluids from breast cancer patients (MSSMs) were used in addition to commercially-available normal breast epithelial cells (HMECs), established breast cancer cell lines (T-est) and established normal breast cells (N-est). The Atlas Human Cancer 1.2 cDNA expression array was employed. The data obtained were analysed using widely-available statistical and clustering software and further validated through real-time PCR.


Results

According to Significance Analysis of Microarrays (SAM) and AtlasImage software, 48 genes differed at least 2-fold in adjusted intensities between HMECs and MSSMs (p < 0.01). Some of these genes have already been directly linked with breast cancer, metastasis and malignant progression, whilst others encode receptors linked to signal transduction pathways or are otherwise related to cell proliferation. Fifty genes showed at least a 2.5-fold difference between MSSMs and T-est cells according to AtlasImage and at least a 2-fold difference according to SAM. Most of these were classified as genes related to metabolism and cell communication.


Conclusion

The expression profiles of 1176 genes were determined in finite life-span cultures of metastatic breast cancer cells and of normal breast cells. Significant differences were detected between the finite life-span breast cancer cell cultures and the established breast cancer cell lines. These data suggest caution in extrapolating information from established lines for application to clinical cancer research.


Background

The search for differentially expressed genes in tumours has made extensive use of array technology. Most studies have involved tumour biopsy samples or established tumour-derived cell lines [1]. Differentially expressed genes may help to identify tumours with high metastatic potential and pathways that might be therapeutic targets.

As noted by Dairkee et al. [2], one drawback of using established cell lines is that the process of immortalization in vitro can bias the expression profile when compared to native tissues. Another problem is that a number of widely-used cell lines have sub-lines that differ in their karyotypes and RNA expression levels [3-5]. Biopsies from tumours, unless they are microdissected, contain heterogeneous cell types. A molecular signature (or expression profile) of the whole tumour can be misleading since it is a composite of all the cells, normal and malignant, present in it. Although tumours may consist of several populations of cells that deviate slightly from the clonal strain of origin, metastases may involve only the subline(s) with specific genetic changes that favour metastatic behaviour. Cultures from freshly-derived ascites or pleural effusions may therefore be relatively homogeneous before they acquire the further mutations that immortalize cell lines with unlimited growth potential.

In this communication we report which genes, within a defined set of well-known cancer-related genes, are differentially expressed in freshly-derived breast cancer cell cultures compared with primary normal, established normal and breast cancer cultures. We compared expression patterns in cultures of finite life-span cells from pleural effusions or ascites of breast cancer patients (Mount Sinai School of Medicine: MSSM) with commercially-available finite life-span normal breast epithelial cells (HMECs). To ensure comparability, cultures were used at similar passage numbers and the karyotypes were analyzed. We also compared the expression profiles of some of the most widely-used established breast cancer cell lines and three putatively normal established breast cell lines.


Methods

Cultures of finite life-span cells

a) HMECs: Different batches of normal human mammary epithelial cell cultures were obtained (Biowhittaker Molecular Applications Inc., NJ). HMEC specimens were numbered serially and maintained in MEGM (Clonetics, MD) supplemented with 2.5% FCS (Gibco Invitrogen, CA).

b) MSSMs: Breast cancer cells obtained from ascitic fluids or pleural effusions of patients with breast cancer were seeded and maintained in our laboratory. They were designated MSSM 3 through MSSM 14 and maintained in MEGM supplemented with 2.5% of the corresponding original fluid, when available, and 2.5% FCS (Gibco Invitrogen, CA). The full characterization of these cultures will be published elsewhere (manuscript in preparation).

HMECs were passaged only twice or thrice in our laboratory. MSSMs were harvested at the 5th or 6th passage after initial plating.

Cultures of established cell lines

a) Normal-established (N-est.): the cell lines used were derived from normal tissues: MTSV1-7 (a gift from J. Taylor-Papadimitriou), MCF10A (from R Mira y Lopez) and MCF10F (ATCC, MD).

b) Tumour-established (T-est.): cell lines established from breast cancer pleural effusions or tumour tissues: MDA-MB231, MDA-MB453, MDA-MB468, two strains of MCF 7 (designated MCF 7N and MCF 7P), T47D, BT20 and BT474 (all from ATCC, MD). They were maintained in medium and supplements as recommended by ATCC.

All cell cultures were harvested 48–72 h after plating at about 80% confluence. Sources and markers are given in Table 1.

Table 1. Cell origin and RNA expression of some frequently used markers


The Atlas Human Cancer 1.2 cDNA expression array (Clontech, CA) is a nylon membrane printed with 200–600 bp fragments of 1176 characterized genes involved in cancer, 9 housekeeping genes and 6 negative controls. RNAs were extracted and labelled with the Atlas pure total RNA labelling system and hybridized according to the manufacturer's instructions.

All the cell lines used for the arrays (9 HMECs, 10 MSSMs, 3 N-est and 7 T-est) were probed twice in separate assays. The reproducibility of the duplicates was assessed by Pearson's correlation coefficient, calculated from the adjusted intensities of all the genes spotted on the membrane; the coefficients ranged from 0.93 to 0.99.
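
The duplicate comparison can be illustrated with a minimal Python sketch of Pearson's correlation coefficient. The function name and the intensity values are hypothetical and serve only to show the calculation; they are not data from this study, and the original analysis was run in other software.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for constant input")
    return numerator / denominator

# Hypothetical adjusted intensities of the same five genes on two membranes.
replicate_1 = [1200.0, 540.0, 8300.0, 150.0, 2600.0]
replicate_2 = [1100.0, 600.0, 7900.0, 210.0, 2750.0]
r = pearson(replicate_1, replicate_2)
print(f"r = {r:.3f}")
```

In practice the two vectors would hold the adjusted intensities of every spotted gene, one vector per hybridization.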

Hybridizations with 30 μg of total RNA were performed according to the manufacturer's instructions. The hybridized membranes were exposed to a phosphorimager screen and were read at 100 μm resolution in a Storm Phosphorimaging system (Molecular Dynamics, CA). The scanned images were transformed to TIFF files with a PC bit order and then aligned and analyzed using AtlasImage 2.01 software (Clontech, CA). To average or compare the samples, the adjusted intensity signal was normalized using the global normalization mode featured in the software. We report only (a) those genes with significant (p < 0.01) differential expression of 2-fold or more, and (b) genes that were undefined for all the cell lines belonging to one type of sample, but were detected in other types with a difference at least equivalent to one background (540 units in intensity). (Undefined genes are those for which the intensity was below the signal threshold).
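
The two reporting rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the AtlasImage implementation: the function name is hypothetical, each class is summarized by its mean adjusted intensity, `None` stands for a gene that is undefined in every cell line of a class, and an undefined class is treated as zero intensity when the one-background difference is tested.

```python
BACKGROUND = 540.0   # one background, in adjusted-intensity units (from the text)
MIN_FOLD = 2.0       # minimum fold difference for rule (a)
MAX_P = 0.01         # significance cut-off for rule (a)

def is_reported(mean_a, mean_b, p_value=None):
    """Return True if a gene meets rule (a) or rule (b).

    mean_a, mean_b: class means of adjusted intensity, or None when the gene
    is undefined (below the signal threshold) in all lines of that class.
    """
    if mean_a is None and mean_b is None:
        return False
    # Rule (b): undefined in one class, detected in the other.
    # Assumption: the undefined class counts as zero intensity.
    if mean_a is None or mean_b is None:
        detected = mean_b if mean_a is None else mean_a
        return detected >= BACKGROUND
    # Rule (a): significant difference of 2-fold or more, in either direction.
    if p_value is None or p_value >= MAX_P:
        return False
    low, high = sorted((mean_a, mean_b))
    return low > 0 and high / low >= MIN_FOLD

print(is_reported(1000.0, 2500.0, p_value=0.001))
print(is_reported(None, 600.0))
```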

The AtlasImage software compares only two samples at a time. When we used it to determine the differences between cell classes, we first averaged the cell lines in the four classes (HMEC, MSSM, N-est and T-est) and then performed the comparisons as instructed by the manufacturer.

Statistical analyses (correlation and two-tailed p values) were performed using Microsoft Excel 2000. Further analyses were performed with Significance Analysis of Microarrays (SAM) [6], Prediction Analysis for Microarrays (PAM) (Stanford Univ., USA), FatiGO [7], Pomelo tool and SVM (Bioinformatics unit, CNIO, Spain [8]) and GoMiner [9].

Quantitative real-time PCR (Q-PCR)

To validate the results of the cDNA array experiments, some of the genes found to be differentially expressed were further examined by real-time PCR in 10 HMECs, 9 MSSMs, 3 N-est and 8 T-est cell lines. Five μg of total RNA (corresponding to about 100 ng of mRNA) were reverse-transcribed with oligo(dT) (SuperScript II system, Invitrogen, CA) in a 20 μl reaction volume, and after 125-fold dilution, 1.25 μl were used for a 40-cycle PCR on an ABI Prism 7900 thermal cycler. The reaction was carried out in a 384-well plate with a QuantiTect SYBR Green PCR kit (Qiagen Inc, CA) at an annealing temperature of 63°C and detection at 2–5°C below the Tm of the product as determined from its dissociation curve. Product size was confirmed by agarose gel electrophoresis. The amplification efficiency of each primer pair was determined and the expression of each gene relative to G3PDH was assessed with the program Qgene [10]. Primers were designed using the program PrimerQuest or Primer3, unless otherwise stated. Primer sequences, lengths and Tms of the products are given in Additional file 4.
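
The template arithmetic and an efficiency-corrected relative expression can be sketched in Python. This is an illustration only: the function names are hypothetical, the CT values are invented, and the expression formula (reference efficiency raised to the mean reference CT, divided by target efficiency raised to the mean target CT) is assumed to correspond to the mean normalized expression described for Qgene [10]; the settings actually used are not given in the text.

```python
def template_rna_equivalent_ng(total_rna_ug, rt_volume_ul, dilution_factor, volume_used_ul):
    """Total-RNA equivalent (ng) carried into one PCR reaction."""
    ng_per_ul = total_rna_ug * 1000.0 / rt_volume_ul   # concentration in the RT reaction
    return ng_per_ul / dilution_factor * volume_used_ul

def mean_normalized_expression(ct_target, ct_reference, eff_target=2.0, eff_reference=2.0):
    """Efficiency-corrected expression of a target gene relative to a reference gene.

    Efficiencies are per-cycle amplification factors (2.0 = perfect doubling).
    CT values of the replicate wells are averaged before exponentiation.
    """
    if not ct_target or not ct_reference:
        raise ValueError("need at least one CT value per gene")
    mean_ct_target = sum(ct_target) / len(ct_target)
    mean_ct_reference = sum(ct_reference) / len(ct_reference)
    return (eff_reference ** mean_ct_reference) / (eff_target ** mean_ct_target)

# Values from the protocol above: 5 ug RNA, 20 ul reaction, 125-fold dilution, 1.25 ul used.
print(template_rna_equivalent_ng(5.0, 20.0, 125.0, 1.25))

# Hypothetical triplicate CT values for a target gene and for G3PDH.
print(mean_normalized_expression([24.0, 24.0, 24.0], [20.0, 20.0, 20.0]))
```

With the protocol values, each reaction receives the equivalent of 2.5 ng of total RNA. A target that crosses the threshold four cycles after the reference, with perfect doubling for both, gives a relative expression of 1/16.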

Additional File 1. Fig S1.doc: Pearson's correlation coefficient between cell line expression profiles

Additional File 2. Fig S2.doc: Additional Plots and list of significant genes generated by PAM.

Additional File 3. Fig S3: SVG (Adobe scalable vector graphics). GoMiner DAG (directed acyclic graph) view of the changed genes within the GO hierarchy.

Additional File 4. Table S1.doc: Primer sequences and PCR-product details.

Samples were tested twice in triplicate. Pearson correlation coefficients for the duplicate Q-PCR results ranged between 0.89 and 0.99.


Results

Overall gene expression and class prediction

All the MSSM cultures displayed similar growth rates and had no or minimal chromosomal changes (data not shown). Table 1 shows the main characteristics of these cells.

Each cell line was probed twice. Averages of the duplicates after normalization of the adjusted intensities (as described in the Atlas manual and briefly in Materials and Methods) were used to obtain the expression values for further analysis. The overall gene expression profile, as determined by Pearson's correlation coefficient, discriminates between HMECs and cells derived from metastasis of breast cancer (MSSMs). As seen in Table 2 (and detailed in Additional file 1), the correlation within cell classes was > 0.90; the correlation between cell classes was < 0.82.

Table 2. Correlation between cell lines.

Hybridization signals of 928 genes (78.9%) were represented in at least three cell lines. (One class, N-est, comprised three cell lines; the other three classes, HMEC, MSSM and T-est, comprised more.) We used this subset of 928 genes, plus the 9 control genes, to construct a classification model using two prediction programs: PAM and SVM. Since these algorithms do not work well when the numbers of members differ among classes, we trained the SVM program using comparable numbers of lines from three classes (8 HMECs, 9 MSSMs and 7 T-est) and treated the three N-est cell lines and the remaining HMECs and MSSMs as unknowns to be classified. The model thus generated had a classification accuracy of 100% (24/24) by the leave-one-out cross-validation method. The prediction for the three N-est lines was T-est, suggesting that these cells have an expression profile resembling those of the other established cell lines rather than those of the finite life-span cultures. The remaining HMEC and MSSM cell cultures were correctly assigned to their respective classes (see Additional file 5).
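
Leave-one-out cross-validation can be illustrated with a short Python sketch. To keep it self-contained, the classifier here is a plain nearest-centroid rule with squared Euclidean distance, used as a stand-in; it is neither the SVM nor the PAM (nearest shrunken centroid) program applied in this study. All function names and expression values are hypothetical.

```python
def centroid(rows):
    """Per-gene mean of a list of expression vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def nearest_centroid_predict(train, sample):
    """Assign sample to the class whose centroid is closest.

    train: list of (label, vector) pairs; sample: one expression vector.
    """
    by_class = {}
    for label, vector in train:
        by_class.setdefault(label, []).append(vector)
    best_label, best_distance = None, float("inf")
    for label, rows in sorted(by_class.items()):
        c = centroid(rows)
        distance = sum((a - b) ** 2 for a, b in zip(sample, c))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

def leave_one_out_accuracy(data):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (label, vector) in enumerate(data):
        training_set = data[:i] + data[i + 1:]
        if nearest_centroid_predict(training_set, vector) == label:
            correct += 1
    return correct / len(data)

# Hypothetical three-gene profiles for three classes (illustration only).
toy = [
    ("HMEC", [9.0, 1.0, 1.2]), ("HMEC", [8.5, 1.3, 0.9]), ("HMEC", [9.2, 0.8, 1.1]),
    ("MSSM", [1.0, 9.1, 1.0]), ("MSSM", [1.4, 8.7, 1.3]), ("MSSM", [0.9, 9.4, 0.8]),
    ("T-est", [1.1, 1.0, 9.0]), ("T-est", [0.8, 1.2, 8.6]), ("T-est", [1.3, 0.9, 9.3]),
]
print(leave_one_out_accuracy(toy))
print(nearest_centroid_predict(toy, [1.0, 1.0, 8.0]))
```

An unknown line, such as one of the N-est cultures, would be classified by the final call after the model has been trained on all labelled samples.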

Additional File 5. Table S2.doc: Support Vector Machines: Validations and predictions.

The same analysis with PAM, applying a threshold of 3.5 (which gives the maximum number of significant genes yielding no misclassification error upon cross-validation), predicted MTSV1-7 and MCF10F as members of the "T-est" class and MCF10A as an HMEC-class member with a probability of 1. Again, HMEC 13 and MSSM 14 were properly assigned (Fig 1 and 2). To elucidate further the classification of MCF10A, which is the most widely-used putatively normal breast cell line in array analysis, we examined the expression values of KT14, KT8/18 and KT19 (Fig. 2A and 2B) and CD104. The MCF10A values were similar to those of the HMECs.

Figure 1. PAM (Prediction Analysis of Microarrays). This is a statistical technique for class prediction from gene expression data using nearest shrunken centroids. This method identifies the subsets of genes that best characterize each class. a) Plot of the shrunken centroids for the specified threshold with the ranked list of significant genes that are used for the predictions. b) Prediction and probabilities of the test cell lines (MTSV1-7, MCF10A, MCF10F, HMEC 13 and MSSM 14).

Figure 2. Expression of KT14, KT18 and KT19. a) Plot shows the correlation of the adjusted intensity (AI) signal of the pair KT8/KT18 for each cell line. b) AI of KT14, KT18 and KT19 of each cell line and its relative percentage.

Comparison of non-immortalized normal and cancer breast cells

Of the 1176 genes in the array, 862 (73.3%) gave hybridization signals in at least three of the finite life-span cell lines used. Of these, 123 (14.3%) showed differential expression when HMEC and MSSM cells were analyzed using the AtlasImage software (ratio >2 and difference in adjusted intensity > 540), and 101 were deemed significant genes with SAM (q-value: 0.7298, median #FDR: 0.73718) (Additional file 6).

Additional File 6. Table S3.doc: SAM output.

Of these 123 genes from AtlasImage and 101 from SAM, 75 were differentially expressed according to both programs, with at least a 2-fold change, a minimum difference in intensity of 540 units and a p-value < 0.01 (Excel). The two programs use different algorithms to calculate the ratio (AtlasImage) and fold (SAM) values, so the outputs are not exactly the same. Fig. 3 shows the relative expression of these 75 genes in all the cell lines analyzed by the arrays, as displayed by the Pomelo tool (FDR and p-values for each gene in Additional file 7).
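
The selection of genes supported by both programs amounts to an intersection of two lists followed by a threshold test. The Python sketch below is an illustration under stated assumptions: the function names, gene labels and values are hypothetical, down-regulation is expressed as a ratio below 1 and converted to a magnitude, and the additional 540-unit intensity criterion is omitted for brevity.

```python
def magnitude(ratio):
    """Fold change as a magnitude >= 1, so 0.25 (4-fold down) counts like 4.0."""
    if ratio <= 0:
        raise ValueError("ratios must be positive")
    return ratio if ratio >= 1 else 1.0 / ratio

def consensus_genes(atlas_ratio, sam_fold, p_values, min_change=2.0, max_p=0.01):
    """Genes present in both outputs that pass the fold and p-value cut-offs in each."""
    shared = set(atlas_ratio) & set(sam_fold) & set(p_values)
    return sorted(
        gene for gene in shared
        if magnitude(atlas_ratio[gene]) >= min_change
        and magnitude(sam_fold[gene]) >= min_change
        and p_values[gene] < max_p
    )

# Hypothetical outputs of the two programs (illustration only).
atlas = {"GENE_A": 3.5, "GENE_B": 0.3, "GENE_C": 2.4, "GENE_D": 1.6}
sam = {"GENE_A": 3.1, "GENE_B": 0.4, "GENE_C": 1.7, "GENE_E": 5.0}
pvals = {"GENE_A": 0.002, "GENE_B": 0.004, "GENE_C": 0.001, "GENE_D": 0.2, "GENE_E": 0.001}

print(consensus_genes(atlas, sam, pvals))
```

Here GENE_C is dropped because one program reports less than a 2-fold change, and GENE_D and GENE_E are dropped because each appears in only one output.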

Figure 3. Relative expression of all the 29 cell lines tested by Atlas arrays. Output of the Pomelo tool showing the log2 of the relative expression of the selected genes in each cell line to the average expression of all the cell lines. The 75 genes shown here were considered differentially expressed by SAM and by AtlasImage, with at least a 2-fold change for both programs, when we compared the HMEC and MSSM cell classes.

Additional File 7. Table S3.doc: False discovery rates and p-values of the Pomelo Tool applied to the comparison HMEC-MSSM.

Some genes showed a broad range of expression values among cell lines belonging to the same group: one or two individual cell lines over-expressed the gene (>5-fold the average of the cell class), while others in the same class gave no signal or were barely above background. Since we were looking for genes that could differentiate among types of cell lines, and hence be useful as markers for each class, we inspected the values of these genes manually. The aim was to exclude from the 75 genes in Fig 3 any that gave extreme values in two or more cell lines within a class; "extreme values" were those that lay in the range of the other class, or that biased the average of the class in question enough to generate the required 2-fold change. Twenty-seven up-regulated and 21 down-regulated genes remained, in addition to the cytokeratins, which are not included in Tables 3 and 4 but are shown in Fig 2. Their distribution according to GO (Gene Ontology) terms by FatiGO at level 3 of biological processes is shown. These changes in gene expression stress the importance in the malignant process of both a diminished capacity for cell-cell adhesion and the remodelling of the extracellular matrix (ECM). For example, genes involved in adhesion and downstream signalling (integrins α4/α6, P-cadherin (CDH3), γ catenin (JUP)) or inhibitors of remodelling (SPINT-2) were down-regulated, and ECM remodelling enzymes (MMP11, TIMP2 and the cascade TIMP1-SPARC-TGFBI) were up-regulated, as shown in Tables 3 and 4 (SVG file and p-values of GOMiner in Additional files 3 and 8 respectively).
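
The first part of the exclusion rule, values lying in the range of the other class, was applied by manual inspection in this study. A programmatic approximation might look like the Python sketch below. The function names and intensity values are hypothetical, and the second criterion (a few lines biasing the class average) is not modelled.

```python
def count_crossovers(values, other_class_values):
    """Number of values that fall inside the observed range of the other class."""
    low, high = min(other_class_values), max(other_class_values)
    return sum(1 for v in values if low <= v <= high)

def passes_curation(class_a, class_b, max_crossovers=1):
    """Keep a gene only if fewer than two lines per class cross into the other class."""
    return (count_crossovers(class_a, class_b) <= max_crossovers
            and count_crossovers(class_b, class_a) <= max_crossovers)

# Hypothetical adjusted intensities of one gene in two classes (illustration only).
well_separated = ([100.0, 120.0, 110.0], [400.0, 420.0, 390.0])
two_crossovers = ([100.0, 120.0, 110.0, 130.0], [400.0, 420.0, 115.0, 125.0])

print(passes_curation(*well_separated))
print(passes_curation(*two_crossovers))
```

The default of one tolerated crossover per class mirrors the "two or more" wording of the rule; a stricter screen would set `max_crossovers=0`.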

Table 3. Gene expression of MSSM cells in relation to HMEC cells: up-regulated genes in MSSM cells.

Table 4. Gene expression of MSSM cells in relation to HMEC cells: down-regulated genes in MSSM cells.

Additional File 8. Table S5.xls: p-values of GoMiner.

Comparison of immortal and non-immortal breast cancer cells

Fifty genes showed significant >2-fold (according to SAM software) and >2.5-fold (according to Atlas) differences between the MSSMs and the established breast cancer cell lines. Twenty-five were up-regulated and 25 down-regulated in the established breast cancer cell lines compared to the finite life-span cultures. Most of the differentially expressed genes can be classified under the GO terms "metabolism" and "cell communication". In particular: most of the down-regulated genes seem to be related to remodelling of the extracellular matrix, cell adhesion and receptor-linked signal transduction, while the up-regulated genes are related to general signal transduction pathways and cell proliferation (Tables 5 and 6).

Table 5. Gene expression of T-est cells in relation to MSSM cells: 25 significant up-regulated genes in T-est cells.

Table 6. Gene expression of T-est cells in relation to MSSM cells: 25 significant down-regulated genes in T-est cells.

Real-time PCR

To validate the differences seen in the arrays, some genes were tested by real time PCR. We selected several genes of which the expression was significantly increased or decreased, according to SAM and Atlas, for both the MSSM/HMEC and T-est/MSSM comparisons: 14-3-3σ (also called SFN), SPINT2, FES, SPARC, BIGH3 (or TGFBI), TIMP1, TIMP2, MMP11 and DAB2. Also, we analyzed some genes that were deemed significant for only one of the comparisons: NOTCH1, PLAU, CDA and SERPINB2 (or PAI2), and a few genes that were non-significant but somehow related to some of the aforementioned genes: tPA, PAI1, uPAR, DCK.

It has been reported that correlation between Q-PCR and array data is highly variable [11]. It depends, in part, on the sensitivity of arrays in detecting genes with low expression levels or saturation due to very high ones. The genes we have tested showed correlations that ranged from 0.563 to 0.959. For the comparison between the normal (HMEC) and the tumour (MSSM) finite life-span cultures, the Q-PCR results for the manually curated genes (as explained above) supported the findings of the arrays in 17/17 (100%) of cases. In the T-est/MSSM evaluation, agreement between the two techniques was found in 15/17 (88.2%) of the cases (Fig. 4 and data available upon request).

Figure 4. Relative expression of some genes validated by Q-PCR. a) Mean normalized expression (MNE) of SPARC. b) MNE of BIGH3 = TGFBI. c) MNE of PAI-1. d) MNE of NOTCH1 in HMEC and MSSM cells. e) Relative expression of MSSMs and established cell lines to MCF10A. Note the difference between the two MCF7s. f) MNE of CYR61. g) MNE of DAB-2. For the complete list of validated genes, see Table S1.

One of the genes excluded from the HMEC/MSSM comparison was CYR61, which was tested by Q-PCR. Even though the overall up-regulation value for this gene was consistent (SAM: 3.53, Atlas: 3.48, Q-PCR: 3.29), its expression levels crossed over to the values of the other cell class in 4/17 cell lines, thus invalidating CYR61 as a reliable marker by itself.


Discussion

Established breast cancer cell lines have been widely used to study signal transduction pathways, test new pharmaceuticals and determine expression profiles that might predict the metastatic capabilities of primary tumours. In many cases MCF10A has been chosen as the "normal" control, even though this cell line has been reported to possess markers for both myoepithelial and luminal phenotypes [12].

One of the controversies about the use of HMECs as controls is the probable myoepithelial origin of these cells [13] based on expression of KT14 and CD104 (ITGB4) [14,15]. The expression levels of these genes in the MCF10 cells were equivalent to those in the HMECs. Furthermore, the prediction of PAM, based on the 58 genes that discriminate between the different classes of cells used in this study, was that MCF10A was similar to HMECs. Hence, MCF10A cells would have the same limitation as controls as the HMECs. Nevertheless, HMEC cells have the advantage of not being immortalized and pooling them combines the genetic backgrounds of a large cohort.

Both classes of non-immortalized cell lines, HMEC and MSSM, are more homogeneous than the established ones (N-est and T-est), as seen in Table 2 and Additional file 1. This is probably due to the cumulative effects of the mutations accrued individually by the established cell lines during successive passages and immortalization.

As seen in Fig 3, the expression of some genes in a particular class is clearly different from the others, e.g. up-regulation of FES, MMP11, DAB2 and down-regulation of SPINT2, SFN, JUP for the MSSM cell lines. Others are more distinctive of a "state", e.g. distribution of cytokeratins and down-regulation of certain integrins (ITGA6/ITGB4, ITGA7) in all the tumour-derived cell lines. Therefore, the expression levels of these genes can be seen as specific attributes of certain classes. Their combined expression defines a profile that can be used to construct a model, similar to those built by SVM and PAM, for predicting the classification of an unknown cell line accurately, as shown in Fig 1b.

We tested only two sub-lines of MCF-7, well-known to be highly variable [3-5], and a single sub-line of each of the other established cell lines. We recognize that other sub-lines might be different.

When the MSSMs are compared with the HMECs, several genes display differential expression in a mode consistent with previous publications, where they have been shown to be significant for malignant progression or metastasis: down-regulation of the tumour suppressor and inhibitor of mitotic phase entry 14-3-3σ (SFN) [16,17], the serine protease inhibitor SPINT2 [18] and JUP (γ-catenin) [19]; and up-regulation of FES [20] and SPARC [21]. In addition, there is the "cadherin switching" (CDH3 = P-cadherin to CDH2 = N-cadherin) and its relationship to FGFR1 and MMP9 [22,23].

SPARC expression was detected in 17/17 human breast tumour biopsies and to a lesser extent in some established cell lines [21]. It has also been associated with malignant progression and invasive potential in breast cancer [24,25], and together with MMP11 in colorectal [26] and oesophageal cancer [27]. Its over-expression increases motility and invasion [28] and induces growth inhibition [29] in established breast cancer cell lines. In addition, it has been shown to induce expression of BIGH3 and PAI-1 [30]. MSSM cells showed up-regulation of SPARC, BIGH3, PAI1 and MMP11, while the T-est cells showed down-regulation of SPARC, BIGH3 and PAI1 (MDA-MB231 is an exception for the latter gene) (Fig 4A-B-C).

We also found genes of which the behaviour did not fully agree with previous descriptions, i.e. NOTCH1, CYR61 and DAB2. The many and varied functions of Notch signalling, achieved through activation or down-regulation, have been recently reviewed [31]. In the MSSM samples, NOTCH1 and its ligand JAG2 are down-regulated. In this case, this pathway is more likely to function as a tumour suppressor than an oncoprotein. This conclusion would be less clear if we had only compared established cell lines (mainly MCF10A with both MCF7s) (Fig 4D–E). CYR61 is a pro-angiogenic, secreted protein encoded by a growth factor-inducible immediate-early gene. It is over-expressed in some invasive established breast cancer cell lines and in 30–36% of primary tumours [32,33]. In this study we found only three cell lines with a truly high over-expression (MSSM6, MDA-MB231 and BT20) and five with a moderate over-expression compared to the mean expression in HMECs (between 2 and 4 fold). The same conclusion would have applied if we had considered MCF10A as a control (Fig 4F). DAB2 is considered a tumour suppressor since its expression is down-regulated in ovarian carcinomas and in some established breast cancer cell lines [34], and up-regulated during megakaryocyte differentiation [35]. Its continued expression in tumour cells led to growth inhibition or cell death [36] unless the cells were in contact with some type of basement membrane [37]. MSSM cells showed up-regulation of this gene (Fig 4G), perhaps because they grew as an attached cell line or because of their finite life-span phenotype in vitro.

Epithelial-mesenchymal transition (EMT) is considered a mechanism for carcinoma progression and metastasis, and the expression of vimentin (VIM) is its main marker. This view has now been extended to include whole pathways and a more complex relationship with the microenvironment of the cell [38,39]. Among the other genes regulated during EMT [40], the MSSM cells showed up-regulation of COL6A1, SPARC, CDH2 and DAB2 and down-regulation of JUP and BTG2 (Tables 3 and 4).


Conclusion

Using arrays, we have studied 10 finite lifespan breast cancer cell lines freshly isolated from metastatic pleural or peritoneal fluids, 9 finite lifespan normal breast cell lines, 7 established breast cancer cell lines, and 3 established normal breast cell lines. We tested 1176 genes considered to be related to cancer. Within each cell class there was significant homogeneity of gene expression. Two clusters of genes distinguished the MSSMs from the HMECs. These 48 genes, which were differentially expressed by at least 2-fold, concerned cell-cell interactions and remodelling of the extracellular matrix. Fifty genes that were differentially expressed at least 2-fold between MSSMs and established breast cancer cell lines are generally considered to be involved in cell communication and metabolism. Established breast cancer cell lines have been used to model biochemical and pharmacological responses in human breast cancer; the differences from freshly isolated breast cancer lines imply they are not wholly satisfactory models.


Abbreviations

ER = oestrogen receptor, FCS = foetal calf serum, FDR = false discovery rate, KT = keratin, MGB = mammaglobin, SVM = support vector machines, VIM = vimentin.

Competing interests

The author(s) declare that they have no competing interests.

Authors' contributions

MFC designed and performed all the experimental work and drafted the paper. JFH provided fundamental advice on the clinical aspects of the project and participated in the final writing of the paper. BGTP has directed the entire project and participated in the final writing of the paper.


Acknowledgements

Grant support: The T.J. Martell Foundation for Leukemia, Cancer and AIDS Research, The Jane Grinberg Memorial Fund, the Kash Family Foundation, and the Ellen Block Memorial Fund.

We thank Elizabeth Carpenter and Jan Eubig for technical assistance and Dr John Mandelli and Dr Linda Rolnitzky for their help with statistics and Julia Roboz for her excellent editorial work.


  1. Lacroix M, Leclercq G: Relevance of breast cancer cell lines as models for breast tumours: an update.

    Breast Cancer Res Treat 2004, 83(3):249-89. PubMed Abstract | Publisher Full Text OpenURL

  2. Dairkee SH, Ji Y, Ben Y, Moore DH, Meng Z, Jeffrey SS: A molecular "signature" of primary breast cancer cultures; patterns resembling tumor tissue.

    BMC Genomics 2004, 5:57. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  3. Hiorns LR, Bradshaw TD, Skelton LA, Yu Q, Kelland LR, Leyland-Jones B: Variation in RNA expression and genomic DNA content acquired during cell culture.

    Br J Cancer 2004, 90:476-482. PubMed Abstract | Publisher Full Text OpenURL

  4. Bahia H, Ashman JNE, Cawkwell L, Lind M, Monson JRT, Drew PJ, Greenman J: Karyotypic variation between independently cultured strains of the cell line MCF-7 identified by multicolour fluorescence in situ hybridization.

    Int J Oncol 2002, 20:489-494. PubMed Abstract OpenURL

  5. Nugoli M, Chuchana P, Vendrell J, Orsetti B, Ursule L, Nguyen C, Birnbaum D, Douzery E, Cohen P, Theillet C: Genetic variability in MCF-7 sublines: evidence of rapid genomic and RNA expression profile modifications.

    BMC Cancer 2003, 3:13. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  6. Tusher V, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response.

    Proc Natl Acad Sci USA 2001, 98:5116-5121. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  7. Al-Shahrour F, Díaz-Uriarte R, Dopazo J: FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes.

    Bioinformatics 2004, 20:578-580. PubMed Abstract | Publisher Full Text OpenURL

  8. Herrero J, Al-Shahrour F, Díaz-Uriarte R, Mateos Á, Vaquerizas JM, Santoyo J, Dopazo J: GEPAS, a web-based resource for microarray gene expression data analysis.

    Nucleic Acids Res 2003, 31:3461-3467. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  9. Zeeberg BR, Feng W, Wang G, Wang MD, Fojo AT, Sunshine M, Narasimhan S, Kane DW, Reinhold WC, Lababidi S, Bussey KJ, Riss J, Barrett JC, Weinstein JN: GoMiner: A Resource for Biological Interpretation of Genomic and Proteomic Data.

    Genome Biol 2003, 4:R28. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  10. Muller P, Janovjak H, Miserez A, Dobbie Z: Processing of gene expression data generated by quantitative real-time RT-PCR.

    Biotechniques 2002, 32:1372-4, 1376, 1378–9. PubMed Abstract OpenURL

  11. Etienne W, Meyer M, Peppers J, Meyer RJ: Comparison of mRNA gene expression by RT-PCR and DNA microarray.

    Biotechniques 2004, 36:618-20. PubMed Abstract OpenURL

  12. Gordon L, Mulligan K, Maxwell-Jones H, Adams M, Walker R, Jones J: Breast cell invasive potential relates to the myoepithelial phenotype.

    Int J Cancer 2003, 106:8-16. PubMed Abstract | Publisher Full Text OpenURL

  13. Flintoft L: Establishing normality.

    Nat Rev Cancer 2004, 4:418. Publisher Full Text OpenURL

  14. Dairkee S, Puett L, Hackett A: Expression of basal and luminal epithelium-specific keratins in normal, benign, and malignant breast tissue.

    J Natl Cancer Inst 1988, 80:691-5.

  15. Jones C, Mackay A, Grigoriadis A, Cossu A, Reis-Filho JS, Fulford L, Dexter T, Davies S, Bulmer K, Ford E, Parry S, Budroni M, Palmieri G, Neville AM, O'Hare MJ, Lakhani SR: Expression profiling of purified normal human luminal and myoepithelial breast cells: identification of novel prognostic markers for breast cancer.

    Cancer Res 2004, 64:3037-45.

  16. Ferguson AT, Evron E, Umbricht CB, Pandita TK, Chan TA, Hermeking H, Marks JR, Lambers AR, Futreal PA, Stampfer MR, Sukumar S: High frequency of hypermethylation at the 14-3-3 sigma locus leads to gene silencing in breast cancer.

    Proc Natl Acad Sci USA 2000, 97:6049-54.

  17. Vercoutter-Edouart A, Lemoine J, Le Bourhis X, Louis H, Boilly B, Nurcombe V, Revillion F, Peyrat J, Hondermarck H: Proteomic analysis reveals that 14-3-3sigma is down-regulated in human breast cancer cells.

    Cancer Res 2001, 61:76-80.

  18. Kobayashi H, Suzuki M, Hirashima Y, Terao T: The protease inhibitor bikunin, a novel anti-metastatic agent.

    Biol Chem 2003, 384:749-54.

  19. Bukholm I, Nesland J, Borresen-Dale A: Re-expression of E-cadherin, alpha-catenin and beta-catenin, but not of gamma-catenin, in metastatic tissue from breast cancer patients.

    J Pathol 2000, 190:15-9.

  20. Scheijen B, Griffin J: Tyrosine kinase oncogenes in normal hematopoiesis and hematological disease.

    Oncogene 2002, 21:3314-33.

  21. Graham J, Balleine R, Milliken J, Bilous A, Clarke C: Expression of osteonectin mRNA in human breast tumours is inversely correlated with oestrogen receptor content.

    Eur J Cancer 1997, 33:1654-60.

  22. Hazan RB, Qiao R, Keren R, Badano I, Suyama K: Cadherin Switch in Tumor Progression.

    Ann NY Acad Sci 2004, 1014:155-163.

  23. Suyama K, Shapiro I, Guttman M, Hazan RB: A signaling pathway leading to metastasis is controlled by N-cadherin and the FGF receptor.

    Cancer Cell 2002, 2:301-314.

  24. Porter P, Sage E, Lane T, Funk S, Gown A: Distribution of SPARC in normal and neoplastic human tissue.

    J Histochem Cytochem 1995, 43:791-800.

  25. Bellahcene A, Castronovo V: Expression of bone matrix proteins in human breast cancer: potential roles in microcalcification formation and in the genesis of bone metastases.

    Bull Cancer 1997, 84:17-24.

  26. Porte H, Chastre E, Prevot S, Nordlinger B, Empereur S, Basset P, Chambon P, Gespach C: Neoplastic progression of human colorectal cancer is associated with overexpression of the stromelysin-3 and BM-40/SPARC genes.

    Int J Cancer 1995, 64:70-5.

  27. Porte H, Triboulet JP, Kotelevets L, Carrat F, Prevot S, Nordlinger B, DiGioia Y, Wurtz A, Comoglio P, Gespach C, Chastre E: Overexpression of stromelysin-3, BM-40/SPARC, and MET genes in human esophageal carcinoma: implications for prognosis.

    Clin Cancer Res 1998, 4:1375-82.

  28. Briggs J, Chamboredon S, Castellazzi M, Kerry J, Bos T: Transcriptional upregulation of SPARC, in response to c-Jun overexpression, contributes to increased motility and invasion of MCF7 breast cancer cells.

    Oncogene 2002, 21:7077-91.

  29. Dhanesuan N, Sharp J, Blick T, Price J, Thompson E: Doxycycline-inducible expression of SPARC/Osteonectin/BM40 in MDA-MB-231 human breast cancer cells results in growth inhibition.

    Breast Cancer Res Treat 2002, 75:73-85.

  30. Golembieski W, Rempel S: cDNA array analysis of SPARC-modulated changes in glioma gene expression.

    J Neurooncol 2002, 60:213-26.

  31. Weng A, Aster J: Multiple niches for Notch in cancer: context is everything.

    Curr Opin Genet Dev 2004, 14:48-54.

  32. Tsai M, Hornby A, Lakins J, Lupu R: Expression and function of CYR61, an angiogenic factor, in breast cancer cell lines and tumor biopsies.

    Cancer Res 2000, 60:5603-7.

  33. Xie D, Miller C, O'Kelly J, Nakachi K, Sakashita A, Said J, Gornbein J, Koeffler H: Breast cancer. Cyr61 is overexpressed, estrogen-inducible, and associated with more advanced disease.

    J Biol Chem 2001, 276:14187-94.

  34. Fazili Z, Sun W, Mittelstaedt S, Cohen C, Xu X: Disabled-2 inactivation is an early step in ovarian tumorigenicity.

    Oncogene 1999, 18:3104-13.

  35. Tseng C, Huang C, Tseng C, Lin M, Hsieh J, Tseng C: Induction of disabled-2 gene during megakaryocyte differentiation of K562 cells.

    Biochem Biophys Res Commun 2001, 285:129-35.

  36. Wang S, Makino K, Xia W, Kim J, Im S, Peng H, Mok S, Singletary S, Hung M: DOC-2/hDab-2 inhibits ILK activity and induces anoikis in breast cancer cells through an Akt-independent pathway.

    Oncogene 2001, 20:6960-4.

  37. Sheng Z, Sun W, Smith E, Cohen C, Sheng Z, Xu X: Restoration of positioning control following Disabled-2 expression in ovarian and breast tumor cells.

    Oncogene 2000, 19:4847-54.

  38. Thiery J: Epithelial-mesenchymal transitions in development and pathologies.

    Curr Opin Cell Biol 2003, 15:740-6.

  39. Petersen O, Lind Nielsen H, Gudjonsson T, Villadsen R, Ronnov-Jessen L, Bissell M: The plasticity of human breast carcinoma cells is more than epithelial to mesenchymal conversion.

    Breast Cancer Res 2001, 3:213-7.

  40. Jechlinger M, Grunert S, Tamir I, Janda E, Ludemann S, Waerner T, Seither P, Weith A, Beug H, Kraut N: Expression profiling of epithelial plasticity in tumor progression.

    Oncogene 2003, 22:7155-69.
