<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2003-4-9-117</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Opinion</dochead>
      <bibl>
         <title>
            <p>Comparing protein abundance and mRNA expression levels on a genomic scale</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Greenbaum</snm>
               <fnm>Dov</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A2">
               <snm>Colangelo</snm>
               <fnm>Christopher</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
            </au>
            <au id="A3" ca="yes">
               <snm>Williams</snm>
               <fnm>Kenneth</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <email>Kenneth.Williams@yale.edu</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Gerstein</snm>
               <fnm>Mark</fnm>
               <insr iid="I2"/>
               <insr iid="I4"/>
               <email>Mark.Gerstein@yale.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Genetics, Yale University, New Haven, CT 06520-8114, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA</p>
            </ins>
            <ins id="I3">
               <p>HHMI Biopolymer Laboratory and W. M. Keck Foundation Biotechnology Resource Laboratory, Yale University, New Haven, CT 06520-8114, USA</p>
            </ins>
            <ins id="I4">
               <p>Department of Computer Science, Yale University, New Haven, CT 06520-8114, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2003</pubdate>
         <volume>4</volume>
         <issue>9</issue>
         <fpage>117</fpage>
         <url>http://genomebiology.com/2003/4/9/117</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/gb-2003-4-9-117</pubid>
               <pubid idtype="pmpid">12952525</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>29</day>
               <month>8</month>
               <year>2003</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2003</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <shorttitle>
         <p>Comparing protein abundance and mRNA expression levels on a genomic scale</p>
      </shorttitle>
      <shortabs>
         <p>We review the results of attempts to correlate protein abundance with mRNA expression levels, focusing on yeast.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Attempts to correlate protein abundance with mRNA expression levels have had variable success. We review the results of these comparisons, focusing on yeast. In the process, we survey experimental techniques for determining protein abundance, principally two-dimensional gel electrophoresis and mass spectrometry. We also merge many of the available yeast protein-abundance datasets, using the resulting larger 'meta-dataset' to find correlations between protein and mRNA expression, both globally and within smaller categories.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>Although some of the underlying technology for quantifying protein abundance was introduced almost thirty years ago <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>, there has recently been a significant increase in the development of new tools. Concurrently, tools for analyzing mRNA expression are becoming more mainstream. The quantification of both of these molecular populations is not an exercise in redundancy; measurements taken from mRNA and protein levels are complementary and both are necessary for a complete understanding of how the cell works <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Additionally, as mRNA is eventually translated into protein, one might assume that there should be some sort of correlation between the level of mRNA and that of protein. Alternatively, there may not be any significant correlation, which, in itself, is an informative conclusion.</p>
         <p>The two commonly used high-throughput methods for measuring mRNA expression, microarrays and Affymetrix chips, have both been extensively reviewed elsewhere <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. There are likewise two basic approaches to determining protein abundance: those based on two-dimensional electrophoresis and those based on mass spectrometry (Table <tblr tid="T1">1</tblr>). We provide a brief review of these technologies and of recent efforts to determine correlations between quantified protein abundances and mRNA expression.</p>
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Overview of selected protein profiling technologies</p>
            </caption>
            <tblbdy cols="6">
               <r>
                  <c ca="left">
                     <p>Technology</p>
                  </c>
                  <c ca="left">
                     <p>Type of labeling required</p>
                  </c>
                  <c ca="left">
                     <p>Ability to detect many post-translational modifications</p>
                  </c>
                  <c ca="left">
                     <p>Biomolecules that are optimally quantified</p>
                  </c>
                  <c ca="left">
                     <p>Approximate dynamic range (and reference)</p>
                  </c>
                  <c ca="left">
                     <p>Number of proteins/spots quantified (and reference)</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Two-dimensional gel electrophoresis</p>
                  </c>
                  <c ca="left">
                     <p>Silver staining</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Naturally occurring forms of proteins larger than 10 kDa</p>
                  </c>
                  <c ca="left">
                     <p>10 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                  </c>
                  <c ca="left">
                     <p>1,500 <abbrgrp><abbr bid="B8">8</abbr></abbrgrp></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Differential two-dimensional fluorescence gel electrophoresis (DIGE)</p>
                  </c>
                  <c ca="left">
                     <p><it>In vitro </it>with Cy2, Cy3 or Cy5 fluorophores at primary amines</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Naturally occurring forms of proteins larger than 10 kDa</p>
                  </c>
                  <c ca="left">
                     <p>10,000 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp></p>
                  </c>
                  <c ca="left">
                     <p>1,100 <abbrgrp><abbr bid="B51">51</abbr></abbrgrp></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SELDI- or MALDI-MS disease biomarker discovery</p>
                  </c>
                  <c ca="left">
                     <p>None</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Naturally occurring forms of proteins smaller than 10 kDa</p>
                  </c>
                  <c ca="left">
                     <p>25</p>
                  </c>
                  <c ca="left">
                     <p>Not applicable</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Isotope-coded affinity tag (ICAT) - LC/MS</p>
                  </c>
                  <c ca="left">
                     <p><it>In vitro </it>with <sup>1</sup>H/D or <sup>12</sup>C/<sup>13</sup>C ICAT reagent at cysteine</p>
                  </c>
                  <c ca="left">
                     <p>No</p>
                  </c>
                  <c ca="left">
                     <p>Cysteine-containing tryptic peptides from digests of protein extracts</p>
                  </c>
                  <c ca="left">
                     <p>10,000*</p>
                  </c>
                  <c ca="left">
                     <p>496 <abbrgrp><abbr bid="B18">18</abbr></abbrgrp></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><sup>14</sup>N/<sup>15</sup>N - LC/MS</p>
                  </c>
                  <c ca="left">
                     <p><it>In vivo </it>at nitrogens in amino acids</p>
                  </c>
                  <c ca="left">
                     <p>Yes</p>
                  </c>
                  <c ca="left">
                     <p>Tryptic peptides from digests of protein extracts</p>
                  </c>
                  <c ca="left">
                     <p>10,000 <abbrgrp><abbr bid="B19">19</abbr></abbrgrp></p>
                  </c>
                  <c ca="left">
                     <p>872 <abbrgrp><abbr bid="B20">20</abbr></abbrgrp></p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>*Assumed to be similar to that for multidimensional protein identification. Abbreviations: SELDI-MS, surface-enhanced laser desorption ionization mass spectrometry; MALDI-MS, matrix-assisted laser desorption ionization mass spectrometry; LC/MS, liquid chromatography and mass spectrometry.</p>
            </tblfn>
         </tbl>
      </sec>
      <sec>
         <st>
            <p>Methods for determining protein levels</p>
         </st>
         <sec>
            <st>
               <p>Two-dimensional electrophoresis</p>
            </st>
            <p>Determining relative protein expression levels by conventional two-dimensional electrophoresis requires isoelectric focusing, SDS-polyacrylamide gel electrophoresis, staining, fixing, densitometry, and careful matching of the same spots on two or more gels. Differentially expressed spots are then excised and enzymatically digested, and the resulting peptides are identified using mass spectrometry. An attractive aspect of this approach is the low capital equipment cost, but a high level of expertise is needed to obtain reproducible gels. Two-dimensional electrophoresis is also generally limited to proteins that are neither too acidic, too basic, nor too hydrophobic, and that are between 10 and 200 kDa in size, so that they are reliably separated on gels. Additionally, this approach detects only those proteins that are expressed at relatively high levels and that have long half-lives <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. In one study using 40 &#956;g of yeast lysate, the average abundance of the proteins detected was 51,200 copies per cell, and no protein with an abundance below 1,000 copies per cell was detected <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Because 1,500 spots were resolved on a gel spanning 1.0 pH unit <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, several gels covering different pH ranges would be needed to resolve a whole-cell lysate. For these reasons, conventional two-dimensional electrophoresis has limited potential for large-scale proteome analysis <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
            <p>Two-dimensional fluorescence-difference gel electrophoresis (DIGE) uses mass- and charge-matched, spectrally resolvable fluorescent dyes (such as Cy3 and Cy5) to label two different protein samples <it>in vitro </it>before two-dimensional electrophoresis. Its main advantage over conventional two-dimensional electrophoresis is that both the control and the experimental sample are run in a single polyacrylamide gel. The samples are then imaged separately, and the images can be overlaid perfectly without any 'warping' of the gels. This substantially raises the confidence with which protein changes between samples can be detected and quantified. Changes in the relative expression level of a protein as small as 1.2-fold can be detected for large-volume spots <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Because detection is based on fluorescence, DIGE has a large dynamic range of about 10,000, which permits differential expression analysis of proteins that are present at relatively low copy number <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. The limit of detection of DIGE for quantifying protein expression ratios is between 0.25 and 0.95 ng of protein, which is similar to that for silver staining <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. In a recent study <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, the relative levels of expression of approximately 1,050 protein spots were compared in 250,000 laser-dissected normal versus esophageal carcinoma cells. This analysis identified 58 spots that were up-regulated by more than three-fold and 107 that were down-regulated by more than three-fold in cancer cells.</p>
         </sec>
         <sec>
            <st>
               <p>Mass spectrometric approaches</p>
            </st>
            <sec>
               <st>
                  <p>Disease biomarker discovery</p>
               </st>
               <p>Current approaches to discovering protein or peptide markers of disease involve batch chromatography, matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS) and statistical analysis of large numbers of disease versus normal serum or other biological samples. Most recent studies have relied on surface-enhanced laser desorption ionization time-of-flight mass spectrometry (SELDI-TOF-MS) <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. The SELDI approach <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> involves using a gold-coated chip with eight or sixteen 2 mm spots that are modified with chromatographic surfaces (for example anionic, cationic, hydrophobic, and so on). After spotting a few microliters of serum, any contaminants and salt are removed by washing with water, and the target is dried by adding a MALDI matrix solution, such as &#945;-cyano-4-hydroxy-cinnamic acid. In a study by Petricoin <it>et al. </it><abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, SELDI-MS analysis of serum from 50 control and 50 case samples from patients with ovarian cancer identified five peptide biomarkers that ranged in size from 534 to 2,465 Da. The pattern formed by these markers was then used to correctly classify all 50 ovarian cancer samples in a masked set of serum samples from 116 patients, who included 50 patients with ovarian cancer and 66 unaffected women. Similar promising results have been reported in studies of serum samples from breast and prostate cancer patients <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B15">15</abbr></abbrgrp>. In a recent study <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, which compared the relative ability of several different statistical approaches to classify samples based on MS data, the disease biomarker approach was extended to a conventional MALDI-MS platform. Although powerful, the disease biomarker approach does not provide accurate relative amounts of the control versus experimental biomarker, only the relative intensity difference.</p>
            </sec>
            <sec>
               <st>
                  <p>Isotope-coded affinity-tag-based protein profiling</p>
               </st>
               <p>While both MALDI-MS-based disease biomarker discovery and DIGE comparatively profile the naturally occurring forms of peptides and proteins, isotope-coded affinity-tag (ICAT) analysis profiles the relative amounts of cysteine-containing peptides derived from tryptic digests of protein extracts. Because only a single tryptic peptide is needed to quantify the expression of the corresponding parent protein, the ICAT reagent utilizes a thiol protein-reactive group that attaches both a biotin tag and either nine <sup>12</sup>C (light) or nine <sup>13</sup>C (heavy) atoms to each cysteine residue. Following derivatization of the control protein extract with [<sup>12</sup>C]-ICAT reagent and the experimental extract with [<sup>13</sup>C]-ICAT reagent, the pooled samples are subjected to trypsin digestion followed by both cation and avidin chromatography. Liquid chromatography and tandem mass spectrometry (LC/MS/MS) is then used to identify ICAT peptide pairs and to quantify the relative <sup>12</sup>C/<sup>13</sup>C ratios. It is important to note that the ICAT approach provides the relative expression ratios of individual proteins under two conditions; it does not provide absolute protein concentrations, nor does it provide the ratio of the concentration of one protein relative to another in a single condition. A nice feature of this approach is that the <it>in vitro </it>incorporation of a stable isotope into one of the two samples being compared obviates the need to separately analyze the control and experimental samples by MS. 
A tryptic digest of a whole-cell human protein extract might produce more than 500,000 peptides, of which fewer than 100,000 might be expected to contain cysteine. Nevertheless, based on a search of the SwissProt database <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, fewer than 5% of human proteins lack cysteine and would therefore be missed (that is, more than 95% of proteins would include at least one cysteine-containing peptide).</p>
               <p>ICAT results are analogous to those obtained by the use of two different fluorescent dyes in DNA microarray analysis of mRNA levels or DIGE analysis of protein expression. The largest number of proteins profiled so far using this approach with a single sample is the 491 proteins contained in microsomal fractions of naive and <it>in vitro </it>differentiated human myeloid leukemia cells <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Multidimensional protein identification technology</p>
               </st>
               <p>Multidimensional protein identification technology (MudPit) is similar to ICAT in that it utilizes cation-exchange prefractionation followed by reverse-phase (RP) high-performance liquid chromatography (HPLC) separation and MS/MS analysis <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. In contrast to the ICAT approach, however, MudPit analyzes the entire mixture of tryptically digested proteins and utilizes tandemly coupled (cation-exchange followed by reverse-phase) columns. A specific subset of peptides is eluted from the cation-exchange column, using a step gradient of increasing salt concentration, onto the front of the RP column. Peptides are then eluted from the RP column and enter the mass spectrometer for analysis. After the RP gradient is complete, the next step of the salt gradient releases another subset of peptides from the cation-exchange column onto the RP column, and the process repeats itself. Using this approach on the yeast proteome, Wolters <it>et al. </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp> identified 5,540 unique peptides from 1,484 proteins and demonstrated a dynamic range of detection of 10,000-fold. This method has been extended to comparative protein profiling by using <it>in vivo </it><sup>14</sup>N/<sup>15</sup>N metabolic labeling <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Washburn <it>et al. </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp> used <it>Saccharomyces cerevisiae </it>grown in both <sup>14</sup>N- and <sup>15</sup>N-containing minimal media, and 2,264 peptides and 872 proteins were uniquely identified. Also, accurate <sup>14</sup>N/<sup>15</sup>N quantitation was determined for each peptide with an average standard deviation of 30%.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Comparison of mRNA and protein levels</p>
            </st>
            <p>Despite significant developments over the past couple of years in the technologies used to quantify protein abundance, protein identification and quantification still lag behind the high-throughput experimental techniques used to determine mRNA expression levels. Yet, while mRNA expression values have shown their usefulness in a broad range of applications, including the diagnosis and classification of cancers <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>, these results are almost certainly only correlative, rather than causative; in the end it is most probably the concentrations of proteins and their interactions that are the true causative forces in the cell, and it is the corresponding protein quantities that we ought to be studying.</p>
            <p>Primarily because of a limited ability to measure protein abundances, researchers have tried to find correlations between mRNA and the limited protein expression data, in the hope that they could determine protein abundance levels from the more copious and technically easier mRNA experiments. Alternatively, if there is definitively no correlation between mRNA and protein data, both quantities could be used as independent sources of information for use in machine-learning algorithms, for example, to predict protein interactions. To date, there have been only a handful of efforts to find correlations between mRNA and protein expression levels, most notably in human cancers and yeast cells; for the most part, they have reported only minimal and/or limited correlations.</p>
            <p>One of the earliest analyses of correlation looked at 19 proteins in the human liver. Anderson and Seilhamer <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> found a somewhat positive correlation of 0.48. Another limited analysis, of the three genes <it>MMP-2, MMP-9 </it>and <it>TIMP-1 </it>in human prostate cancers, showed no significant relationship <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. An additional cancer study <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> showed a significant correlation in only a small subset of the proteins studied. Conversely, Orntoft <it>et al. </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp> found highly significant correlations in human carcinomas when looking at changes in mRNA and protein expression levels.</p>
         </sec>
         <sec>
            <st>
               <p>Protein and mRNA correlations in yeast</p>
            </st>
            <p>Many of the present efforts at correlating mRNA and protein expression have been conducted in yeast using two-dimensional electrophoresis techniques. In particular, Gygi <it>et al. </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp> found that even similar mRNA expression levels could be accompanied by a wide range (up to 20-fold difference) of protein abundance levels, and <it>vice versa. </it>These results contrast with those of Futcher <it>et al. </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, who found relatively high correlations (<it>r </it>= 0.76) after transforming the data to normal distributions. In a previous analysis <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, we merged the data from both of these datasets (referred to as 2DE-1 <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and 2DE-2 <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>), comparing the resulting new larger protein abundance set ('merged data-set 1') with a comprehensive mRNA expression dataset. The mRNA expression reference set was constructed through iteratively combining, in a non-trivial fashion, three sets that used Affymetrix chips and a SAGE dataset <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Using these reference datasets, we were able to do an all-against-all comparison of mRNA and protein expression levels, in addition to a number of analyses comparing protein and mRNA expression using smaller, but broad categories <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>.</p>
            <p>Given the difficult, laborious, and limiting nature of two-dimensional electrophoresis analysis, many of the newer protein abundance determinations have been done using MudPit and derivative technologies. Washburn <it>et al. </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp> used MudPit to analyze and detect 1,484 arbitrary proteins: they were able to detect a somewhat random sampling of proteins independent of abundance, localization, size or hydrophobicity (we refer to this dataset as MudPit-1). In a further experiment the authors, comparing expression ratios for both proteins and mRNA levels, found that although they could not find correlations for individual loci, they could find overall correlations when looking at pathways and complexes of proteins that functioned together <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Peng <it>et al. </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp> analyzed 1,504 yeast proteins with a false-positive rate - misidentification of a protein - of less than 1% (we refer to this dataset as MudPit-2). In their analysis <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, they contrasted their methodology with that of Washburn <it>et al. </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp> with which there was significant overlap of proteins.</p>
            <sec>
               <st>
                  <p>A new merged dataset</p>
               </st>
               <p>Expanding upon our previous merged dataset, we constructed a new merged dataset ('merged data-set 2') using the two two-dimensional electrophoresis datasets and the two MudPit datasets described above. Briefly (more information is available on our website at <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>), we transformed each of the protein-abundance datasets into more quantitative data by fitting each protein dataset individually onto the reference mRNA expression dataset. The MudPit-1 dataset was also fitted onto the more finely grained MudPit-2 dataset. Each of the new, fitted datasets was then inversely transformed back into protein space. These derived protein datasets were then combined into a larger reference dataset; when we had more than one abundance value for an open reading frame (ORF), we chose the value from the dataset that ranked highest in a prescribed quality ranking (see Figure <figr fid="F1">1</figr>). The resulting set contained protein abundance information for approximately 2,000 ORFs. (One caveat with the MudPit data: while quantitative analysis can be subsequently done on the results of MudPit experiments, MudPit data alone are only semi-quantitative, in that the number of peptides determined is relative to the actual protein abundance within the cell <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Some may therefore argue that MudPit alone is not optimal for a comparison with mRNA data. Nevertheless, we feel that our methodical merging process creates a quantitative and representative dataset that can be compared with the mRNA expression data.) Using the resulting data we could compare mRNA expression and protein abundance globally (Figure <figr fid="F1">1a</figr>) as well as within smaller but still broad categories, such as function or localization (see Figure <figr fid="F1">1b,1c</figr>). In particular, we show that some localization categories - for example, the nucleolus - have significantly higher correlations than the global correlation. Other localizations - for example, the mitochondria - show a weaker correlation between mRNA and protein data, possibly reflecting the heterogeneous nature and function of the latter organelle. In terms of MIPS functional categories <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>, we show that although some categories, such as cell rescue, show a lower correlation than the whole merged set, other functional categories, such as cell cycle, show a significant increase in correlation. This increased correlation plausibly reflects the co-regulated nature of the proteins in this functional category.</p>
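               <p>The final combination step of the merging procedure can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the analysis: it assumes the quality ordering given in the legend to Figure 1 (2DE-1, 2DE-2, MudPit-2, MudPit-1), it omits the fitting and inverse-transformation steps entirely, and the function name, ORF labels and abundance values are all hypothetical.</p>

```python
# Quality ranking as listed in the Figure 1 legend (best first).
PRIORITY = ["2DE-1", "2DE-2", "MudPit-2", "MudPit-1"]


def merge_by_priority(datasets, priority=PRIORITY):
    """Combine per-dataset {orf: abundance} maps into {orf: (abundance, source)}.

    For an ORF present in several datasets, the value from the
    highest-ranked dataset is kept.
    """
    merged = {}
    for name in priority:
        for orf, abundance in datasets.get(name, {}).items():
            # setdefault stores a value only if the ORF is not already present,
            # so entries from better-ranked datasets are never overwritten.
            merged.setdefault(orf, (abundance, name))
    return merged


# Hypothetical ORF labels and abundances, for illustration only.
toy_datasets = {
    "2DE-1": {"ORF_A": 150.0},
    "2DE-2": {"ORF_A": 140.0, "ORF_B": 90.0},
    "MudPit-2": {"ORF_B": 80.0, "ORF_C": 60.0},
    "MudPit-1": {"ORF_C": 55.0, "ORF_D": 20.0},
}

merged = merge_by_priority(toy_datasets)
for orf in sorted(merged):
    print(orf, merged[orf])
```

               <p>Recording the source dataset next to each value is a design choice of this sketch; it makes it easy to check afterwards how many ORFs in the merged set came from each experiment.</p>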
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>Comparison of mRNA expression and protein abundance</p>
                  </caption>
                  <text>
                     <p>Comparison of mRNA expression and protein abundance. <b>(a) </b>A plot comparing our mRNA reference expression set <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> with our newly compiled protein abundance dataset. The mRNA axis is in copies per cell; the protein axis is in thousand copies per cell. The protein dataset is the result of iteratively fitting two MudPit datasets (MudPit-1 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> and MudPit-2 <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>) and two two-dimensional electrophoresis datasets (2DE-1 <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and 2DE-2 <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>). Given the semi-quantitative nature of the MudPit data <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, we transformed the data into a more quantitative set by fitting each set individually onto our reference mRNA expression dataset. In addition, we fitted the MudPit-1 dataset onto the more finely grained MudPit-2 dataset. Each of the datasets was then moved back into 'protein space' using an inverse transformation derived from the 2DE-1 set, as this set has the most precise values. These datasets were then combined into the new reference abundance dataset. In cases in which there were overlapping values for a given ORF, we took the value from the first dataset containing that ORF in the following order: 2DE-1, 2DE-2, MudPit-2, MudPit-1. The resulting reference protein abundance dataset (<it>N </it>= 2044) had a correlation of 0.66 with the mRNA reference dataset. <b>(b,c) </b>Additionally, we show that when looking at specific subsets (subcellular localization <abbrgrp><abbr bid="B52">52</abbr></abbrgrp> or functional groups <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>) we can find both higher and lower correlations amongst these groups. The lower correlations are generally reflective of a more heterogeneous category. 
This analysis indicates that while correlations may be weak when looking at the global data, we tend to find higher correlations when looking at smaller well-defined subsets of ORFs. Further analysis is available at <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
                  </text>
                  <graphic file="gb-2003-4-9-117-1"/>
               </fig>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Reasons for the absence of correlation</p>
            </st>
            <p>There are presumably at least three reasons for the poor correlations generally reported in the literature between the level of mRNA and the level of protein, and these may not be mutually exclusive. First, there are many complicated and varied post-transcriptional mechanisms involved in turning mRNA into protein that are not yet sufficiently well defined to be able to compute protein concentrations from mRNA; second, proteins may differ substantially in their <it>in vivo </it>half-lives; and/or third, there is a significant amount of error and noise in both protein and mRNA experiments that limits our ability to get a clear picture <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>.</p>
            <p>Examining the first option - that there are a number of complex steps between transcription and translation - we looked at correlations between mRNA and protein abundance for those ORFs that had varied or steady levels of mRNA expression over the course of the cell cycle <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. To normalize for the varied degrees of expression for different ORFs, we took the standard deviation divided by the average expression level as representative of the variation of each ORF over the course of the yeast cell cycle (Figure <figr fid="F2">2</figr>). Broadly speaking, the cell can control the levels of protein at the transcriptional level and/or at the translational level. Logically, we would assume that those ORFs that show a large degree of variation in their expression are controlled at the transcriptional level - the variability of the mRNA expression is indicative of the cell controlling mRNA expression at different points of the cell cycle to achieve the resulting and desired protein levels. Thus we would expect, and we found, a high degree of correlation (<it>r </it>= 0.89) between the reference mRNA and protein levels for these particular ORFs; the cell has already put significant energy into dictating the final level of protein through tightly controlling the mRNA expression, and we assume that there would then be minimal control at the protein level. In contrast, those genes that show minimal variation in their mRNA expression throughout the cell cycle are more likely to have little or no correlation with the final protein level; the cell would be controlling these ORFs at the translational and/or post-translational level, with the mRNA levels being somewhat independent of the final protein concentration. And indeed, we found only minimal correlation between protein and mRNA expression for these ORFs (<it>r </it>= 0.2).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The differences in correlation between mRNA and protein expression values using novel categories</p>
               </caption>
               <text>
                  <p>The differences in correlation between mRNA and protein expression values using novel categories. We see significant differences when looking at the highest and lowest ranking of groups of ORFs in the following categories: occupancy, CAI (codon adaptation index) value <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp> and variability. Occupancy refers to the percentage of transcripts associated with ribosomes; we compared the correlation between the top 100 ORFs and the bottom 100 in terms of occupancy (<it>r </it>= 0.78 versus 0.30). For the CAI, we compared the correlation between mRNA and protein for those ORFs with the highest CAI and those with the lowest (<it>r </it>= 0.48 versus 0.02). Variability refers to the normalized standard deviation (that is, the standard deviation divided by the average expression level) for all ORFs in the cell-cycle expression dataset of Cho <it>et al</it>. <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Here, we compared the correlations between protein abundance and mRNA expression for the most variable compared with the least variable proteins (<it>r </it>= 0.89 versus 0.20). We found significant differences between the correlations of mRNA and protein levels for the top and bottom ranking populations for each of the comparisons.</p>
               </text>
               <graphic file="gb-2003-4-9-117-2"/>
            </fig>
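            <p>The top-versus-bottom comparisons summarized in Figure 2 can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: it uses the Pearson correlation on untransformed values and the population standard deviation, neither of which is specified in the text, and all function and variable names are hypothetical. The score passed to the ranking function could be variability, occupancy or CAI.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return cov / math.sqrt(vx * vy)


def variability(profile):
    """Normalized standard deviation: standard deviation divided by the mean."""
    mean = sum(profile) / len(profile)
    # Assumption: population (not sample) variance.
    var = sum((v - mean) ** 2 for v in profile) / len(profile)
    return math.sqrt(var) / mean


def top_bottom_correlation(score, mrna, protein, k):
    """Rank ORFs by score; correlate mRNA with protein in the top k and bottom k."""
    # Only ORFs present in all three maps can be compared.
    shared = [orf for orf in score if orf in mrna and orf in protein]
    ranked = sorted(shared, key=lambda orf: score[orf], reverse=True)
    top, bottom = ranked[:k], ranked[-k:]

    def corr(orfs):
        return pearson([mrna[o] for o in orfs], [protein[o] for o in orfs])

    return corr(top), corr(bottom)
```

            <p>For the variability comparison, the score for each ORF would be <it>variability</it> applied to its cell-cycle expression profile; for the occupancy comparison described in the legend, <it>k </it>would be 100.</p>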
            <p>Furthermore, we found that those ORFs that have higher than average levels of ribosomal occupancy - that is, a large percentage of their cellular mRNA concentration is associated with ribosomes (being translated) - have well correlated mRNA and protein expression levels (Figure <figr fid="F2">2</figr>). These cases probably represent a situation in which the cell, having significantly controlled the mRNA expression to produce a specific level of protein, does not also employ mechanisms to control translation. Conversely, those ORFs that have very low occupancy rates have uncorrelated mRNA and protein expression; thus, given that the cell has not tightly controlled the mRNA expression for such an ORF, it will dictate the resulting protein levels through rigorous controls of its translation (that is, through tight limits on occupancy) <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>.</p>
            <p>A second option for a general lack of correlation between mRNA and protein abundance may be that proteins have very different half-lives as the result of varied protein synthesis and degradation. Protein turnover can vary significantly depending on a number of different conditions <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>; the cell can control the rates of degradation or synthesis for a given protein, and there is significant heterogeneity even within proteins that have similar functions <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. Recent efforts have been made to computationally measure these rates <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>.</p>
            <p>Simplistically, it can be presumed that the change in a protein's concentration over time will be equal to the rate of translation minus the rate of degradation. By analogy to concepts in chemical kinetics, we can approximate this relationship as: d<it>P</it>(<it>i,t</it>)/d<it>t </it>= <it>SE</it>(<it>i,t</it>) - <it>DP</it>(<it>i,t</it>), where <it>P</it>(<it>i,t</it>) is the abundance of protein <it>i </it>at time <it>t</it>, <it>E</it>(<it>i,t</it>) is the mRNA expression level for protein <it>i</it>, <it>S </it>is a general rate of protein synthesis per mRNA, and <it>D </it>is a general rate of protein degradation per protein <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. In addition, some experimental methods can be used to measure turnover and the translational control of protein levels <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>.</p>
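            <p>As a worked illustration of the rate equation above, suppose (as an additional simplifying assumption not made in the text) that the mRNA level of a given ORF is constant in time and that the rates are positive constants. The equation then has a closed-form solution:</p>

```latex
% Rate equation with a time-independent mRNA level E(i)
\frac{dP(i,t)}{dt} = S\,E(i) - D\,P(i,t)

% Steady state: set the derivative to zero
P_{ss}(i) = \frac{S}{D}\,E(i)

% Relaxation from an initial abundance P(i,0)
P(i,t) = P_{ss}(i) + \bigl(P(i,0) - P_{ss}(i)\bigr)\,e^{-D t}

% Half-life implied by first-order degradation
t_{1/2} = \frac{\ln 2}{D}
```

            <p>Under these assumptions the steady-state protein abundance is simply proportional to the mRNA level, with the same factor for every ORF. A weak observed correlation would therefore suggest that the effective synthesis or degradation rates differ between proteins, that the system is not at steady state, or that the measurements are noisy.</p>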
            <p>Given the degenerate nature of the genetic code, there are many synonymous codons (codons that translate into the same amino acid). As the cell is biased in its usage of synonymous codons - that is, the usage of a subset of codons results in a higher level of mRNA expression, possibly as a result of differing cellular tRNA levels <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> - the codon adaptation index (CAI), a measurement of codon usage, can be used to predict the expression of a gene <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> (we recently calculated new parameters for this model, with some improvement in predictive strength <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>). It is thought that the CAI will correlate differently with mRNA levels than with protein abundance levels due, in part, to protein turnover rates <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Ranking the ORFs in terms of their CAI value, we found that although those ORFs that ranked the highest in terms of CAI did not show a very strong correlation between mRNA and protein levels, they nevertheless showed a significantly higher correlation than ORFs that were ranked as having the lower CAI values (<it>r </it>= 0.48 versus 0.02). The low correlations reflect the fact that the CAI will correlate differently for protein and mRNA values because of the additional cellular controls on protein translation, namely the effect of protein turnover rates. Nevertheless, the sizable difference in correlations between the two groups of ORFs with high- and low-ranking CAI values (Figure <figr fid="F2">2</figr>) shows that there is some relationship between mRNA and protein values, possibly indicating that highly expressed genes tend to result in a more correlated level of protein abundance than lower expressed ones.</p>
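            <p>For readers unfamiliar with the CAI, a minimal sketch of the standard definition (the geometric mean of the relative adaptiveness of each codon, measured against a reference set of highly expressed genes) is given below. It is not the re-parameterized model mentioned above. The codon table covers only three amino acids, the sequences in any example are invented, codons absent from the reference set are simply skipped, and all names are hypothetical; a real calculation would use the full genetic code.</p>

```python
import math

# Assumption: a deliberately tiny table of synonymous-codon families.
# A full calculation would cover every amino acid except Met and Trp.
FAMILIES = {
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
    "Phe": ["TTT", "TTC"],
}


def split_codons(seq):
    """Split a coding sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs):
    """w(codon) = count of codon / count of the most used synonymous codon."""
    counts = {c: 0 for fam in FAMILIES.values() for c in fam}
    for seq in reference_seqs:
        for codon in split_codons(seq):
            if codon in counts:
                counts[codon] += 1
    weights = {}
    for fam in FAMILIES.values():
        best = max(counts[c] for c in fam)
        for c in fam:
            # Simplification: codons never seen in the reference get no weight.
            if best > 0 and counts[c] > 0:
                weights[c] = counts[c] / best
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in seq."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs))
```

            <p>A gene built entirely from the most frequently used codon of each family scores 1, and the score falls as rarer synonymous codons are used; ranking ORFs by this value gives the high- and low-CAI groups compared in the text.</p>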
            <p>Correlations have been found between the mRNA expression levels of different protein subunits within protein complexes <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. This implies that there should be, in general, a correlation between mRNA and protein abundance; these subunits are a special case because they have to be available in stoichiometric amounts for the complexes to function. Thus, we believe that a major limitation to finding correlations is the degree of natural and manufactured systematic noise in mRNA and protein expression experiments. There is a continued effort to both describe and reduce this noise <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. Meanwhile, in an attempt to get around the noise one could look at broad categories of proteins - for example, groups defined by function, structure, or localization - such that the background noise is cancelled out to some degree <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>.</p>
            <p>Although proteomics is still in its infancy, given the pace of technological advancement in protein quantification, mRNA expression analysis and noise reduction, more comprehensive correlation studies will soon be feasible. This will allow for more robust analyses of the relationship between mRNA expression and protein abundance values. Finally, to be fully able to understand the relationship between mRNA and protein abundances, the dynamic processes involved in protein synthesis and degradation have to be better understood; is the protein level changing because of a change in the rate of protein synthesis, or mRNA, or protein turnover? These questions need to be looked into further before we can appreciate in full the relationship between mRNA and protein abundance levels.</p>
         </sec>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This project was funded in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, under contract No. N01-HV-28186.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>High resolution two-dimensional electrophoresis of proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>O'Farrell</snm>
                  <fnm>PH</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1975</pubdate>
            <volume>250</volume>
            <fpage>4007</fpage>
            <lpage>4021</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">236308</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. A novel approach to testing for induced point mutations in mammals.</p>
            </title>
            <aug>
               <au>
                  <snm>Klose</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Humangenetik</source>
            <pubdate>1975</pubdate>
            <volume>26</volume>
            <fpage>231</fpage>
            <lpage>243</lpage>
            <xrefbib>
               <pubid idtype="pmpid">1093965</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Proteomics: theoretical and experimental considerations.</p>
            </title>
            <aug>
               <au>
                  <snm>Hatzimanikatis</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Choe</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>KH</fnm>
               </au>
            </aug>
            <source>Biotechnol Prog</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>312</fpage>
            <lpage>318</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bp990004b</pubid>
                  <pubid idtype="pmpid" link="fulltext">10356248</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Microarrays: biotechnology's discovery platform for functional genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Schena</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Heller</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Theriault</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Konrad</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lachenmeier</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
            </aug>
            <source>Trends Biotechnol</source>
            <pubdate>1998</pubdate>
            <volume>16</volume>
            <fpage>301</fpage>
            <lpage>306</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0167-7799(98)01219-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">9675914</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>High-density genechip oligonucleotide probe arrays.</p>
            </title>
            <aug>
               <au>
                  <snm>McGall</snm>
                  <fnm>GH</fnm>
               </au>
               <au>
                  <snm>Christians</snm>
                  <fnm>FC</fnm>
               </au>
            </aug>
            <source>Adv Biochem Eng Biotechnol</source>
            <pubdate>2002</pubdate>
            <volume>77</volume>
            <fpage>21</fpage>
            <lpage>42</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12227735</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Exploring the new world of the genome with DNA microarrays.</p>
            </title>
            <aug>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>21</volume>
            <fpage>33</fpage>
            <lpage>37</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/4462</pubid>
                  <pubid idtype="pmpid" link="fulltext">9915498</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Correlation between protein and mRNA abundance in yeast.</p>
            </title>
            <aug>
               <au>
                  <snm>Gygi</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Rochon</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Franza</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1999</pubdate>
            <volume>19</volume>
            <fpage>1720</fpage>
            <lpage>1730</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">83965</pubid>
                  <pubid idtype="pmpid" link="fulltext">10022859</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Evaluation of two-dimensional gel electrophoresis-based proteome analysis technology.</p>
            </title>
            <aug>
               <au>
                  <snm>Gygi</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Corthals</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Rochon</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>9390</fpage>
            <lpage>9395</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">16874</pubid>
                  <pubid idtype="pmpid" link="fulltext">10920198</pubid>
                  <pubid idtype="doi">10.1073/pnas.160270797</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Validation and development of fluorescence two-dimensional differential gel electrophoresis proteomics technology.</p>
            </title>
            <aug>
               <au>
                  <snm>Tonge</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Shaw</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Middleton</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rowlinson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rayner</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pognan</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Currie</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Davison</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2001</pubdate>
            <volume>1</volume>
            <fpage>377</fpage>
            <lpage>396</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1615-9861(200103)1:3&lt;377::AID-PROT377>3.3.CO;2-Y</pubid>
                  <pubid idtype="pmpid">11680884</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Evaluation of two-dimensional differential gel electrophoresis for proteomic expression analysis of a model breast cancer cell system.</p>
            </title>
            <aug>
               <au>
                  <snm>Gharbi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gaffney</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zvelebil</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Cramer</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Waterfield</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Timms</snm>
                  <fnm>JF</fnm>
               </au>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <fpage>91</fpage>
            <lpage>98</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.T100007-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12096126</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>2D differential in-gel electrophoresis for the identification of esophageal scans cell cancer-specific protein markers.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhou</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>DeCamp</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gong</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Flaig</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gillespie</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>PR</fnm>
               </au>
               <etal/>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <fpage>117</fpage>
            <lpage>124</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.M100015-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12096129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Proteomic approaches to biomarker discovery in prostate and bladder cancers.</p>
            </title>
            <aug>
               <au>
                  <snm>Adam</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Vlahou</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Semmes</snm>
                  <fnm>OJ</fnm>
               </au>
               <au>
                  <snm>Wright</snm>
                  <fnm>GL</fnm>
                  <suf>Jr</suf>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2001</pubdate>
            <volume>1</volume>
            <fpage>1264</fpage>
            <lpage>1270</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1615-9861(200110)1:10&lt;1264::AID-PROT1264>3.3.CO;2-I</pubid>
                  <pubid idtype="pmpid">11721637</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>The SELDI-TOF MS approach to proteomics: protein profiling and biomarker identification.</p>
            </title>
            <aug>
               <au>
                  <snm>Issaq</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Veenstra</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Conrads</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Felschow</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Biochem Biophys Res Commun</source>
            <pubdate>2002</pubdate>
            <volume>292</volume>
            <fpage>587</fpage>
            <lpage>592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/bbrc.2002.6678</pubid>
                  <pubid idtype="pmpid" link="fulltext">11922607</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Use of proteomic patterns in serum to identify ovarian cancer.</p>
            </title>
            <aug>
               <au>
                  <snm>Petricoin</snm>
                  <fnm>EF</fnm>
               </au>
               <au>
                  <snm>Ardekani</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Hitt</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Fusaro</snm>
                  <fnm>VA</fnm>
               </au>
               <au>
                  <snm>Steinberg</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Mills</snm>
                  <fnm>GB</fnm>
               </au>
               <au>
                  <snm>Simone</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fishman</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Kohn</snm>
                  <fnm>EC</fnm>
               </au>
               <etal/>
            </aug>
            <source>Lancet</source>
            <pubdate>2002</pubdate>
            <volume>359</volume>
            <fpage>572</fpage>
            <lpage>577</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0140-6736(02)07746-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">11867112</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer.</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Rosenzweig</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>YY</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>DW</fnm>
               </au>
            </aug>
            <source>Clin Chem</source>
            <pubdate>2002</pubdate>
            <volume>48</volume>
            <fpage>1296</fpage>
            <lpage>1304</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12142387</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data.</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Abbott</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>McMurray</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Mor</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Stone</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <inpress/>
         </bibl>
         <bibl id="B17">
            <title>
               <p>SwissProt</p>
            </title>
            <url>http://www.expasy.ch/sprot/</url>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry.</p>
            </title>
            <aug>
               <au>
                  <snm>Han</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Eng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2001</pubdate>
            <volume>19</volume>
            <fpage>946</fpage>
            <lpage>951</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1001-946</pubid>
                  <pubid idtype="pmpid" link="fulltext">11581660</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>An automated multidimensional protein identification technology for shotgun proteomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Wolters</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Washburn</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>JR</fnm>
                  <suf>3rd</suf>
               </au>
            </aug>
            <source>Anal Chem</source>
            <pubdate>2001</pubdate>
            <volume>73</volume>
            <fpage>5683</fpage>
            <lpage>5690</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/ac010617e</pubid>
                  <pubid idtype="pmpid">11774908</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Analysis of quantitative proteomic data generated via multidimensional protein identification technology.</p>
            </title>
            <aug>
               <au>
                  <snm>Washburn</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Ulaszek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Deciu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Schieltz</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>JR</fnm>
                  <suf>3rd</suf>
               </au>
            </aug>
            <source>Anal Chem</source>
            <pubdate>2002</pubdate>
            <volume>74</volume>
            <fpage>1650</fpage>
            <lpage>1657</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/ac015704l</pubid>
                  <pubid idtype="pmpid">12043600</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Protein pathway and complex clustering of correlated mRNA and protein expression analyses in <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Washburn</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Oshiro</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Ulaszek</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Plouffe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Deciu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Winzeler</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>JR</fnm>
                  <suf>3rd</suf>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>3107</fpage>
            <lpage>3112</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0634629100</pubid>
                  <pubid idtype="pmpid" link="fulltext">12626741</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.</p>
            </title>
            <aug>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gaasenbeek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Coller</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Loh</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Downing</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Caligiuri</snm>
                  <fnm>MA</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>531</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5439.531</pubid>
                  <pubid idtype="pmpid" link="fulltext">10521349</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Application of microarrays to the analysis of gene expression in cancer.</p>
            </title>
            <aug>
               <au>
                  <snm>Macgregor</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Squire</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Clin Chem</source>
            <pubdate>2002</pubdate>
            <volume>48</volume>
            <fpage>1170</fpage>
            <lpage>1177</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12142369</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A comparison of selected mRNA and protein abundances in human liver.</p>
            </title>
            <aug>
               <au>
                  <snm>Anderson</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Seilhamer</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Electrophoresis</source>
            <pubdate>1997</pubdate>
            <volume>18</volume>
            <fpage>533</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9150937</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Different mRNA and protein expression of matrix metalloproteinases 2 and 9 and tissue inhibitor of metalloproteinases 1 in benign and malignant prostate tissue.</p>
            </title>
            <aug>
               <au>
                  <snm>Lichtinghagen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Musholt</snm>
                  <fnm>PB</fnm>
               </au>
               <au>
                  <snm>Lein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Romer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rudolph</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kristiansen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hauptmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Schnorr</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Loening</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Jung</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Eur Urol</source>
            <pubdate>2002</pubdate>
            <volume>42</volume>
            <fpage>398</fpage>
            <lpage>406</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0302-2838(02)00324-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">12361907</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Discordant protein and mRNA expression in lung adenocarcinomas.</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gharib</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Misek</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Kardia</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Giordano</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Iannettoni</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Orringer</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Hanash</snm>
                  <fnm>SM</fnm>
               </au>
               <etal/>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <fpage>304</fpage>
            <lpage>313</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.M200008-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12096112</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Genome-wide study of gene copy numbers, transcripts, and protein levels in pairs of non-invasive and invasive human transitional cell carcinomas.</p>
            </title>
            <aug>
               <au>
                  <snm>Orntoft</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Thykjaer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Waldman</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Celis</snm>
                  <fnm>JE</fnm>
               </au>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <fpage>37</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.M100019-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12096139</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>A sampling of the yeast proteome.</p>
            </title>
            <aug>
               <au>
                  <snm>Futcher</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Latter</snm>
                  <fnm>GI</fnm>
               </au>
               <au>
                  <snm>Monardo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>McLaughlin</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Garrels</snm>
                  <fnm>JI</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1999</pubdate>
            <volume>19</volume>
            <fpage>7357</fpage>
            <lpage>7368</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">84729</pubid>
                  <pubid idtype="pmpid" link="fulltext">10523624</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Analysis of mRNA expression and protein abundance data: an approach for the comparison of the enrichment of features in the cellular population of proteins and transcripts.</p>
            </title>
            <aug>
               <au>
                  <snm>Greenbaum</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>585</fpage>
            <lpage>596</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.4.585</pubid>
                  <pubid idtype="pmpid" link="fulltext">12016056</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Interrelating different types of genomic data, from proteome to secretome: 'oming in on function.</p>
            </title>
            <aug>
               <au>
                  <snm>Greenbaum</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Luscombe</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Qian</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1463</fpage>
            <lpage>1468</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.207401</pubid>
                  <pubid idtype="pmpid" link="fulltext">11544189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Large-scale analysis of the yeast proteome by multidimensional protein identification technology.</p>
            </title>
            <aug>
               <au>
                  <snm>Washburn</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Wolters</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>JR</fnm>
                  <suf>3rd</suf>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2001</pubdate>
            <volume>19</volume>
            <fpage>242</fpage>
            <lpage>247</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/85686</pubid>
                  <pubid idtype="pmpid" link="fulltext">11231557</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome.</p>
            </title>
            <aug>
               <au>
                  <snm>Peng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Elias</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Thoreen</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Licklider</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Gygi</snm>
                  <fnm>SP</fnm>
               </au>
            </aug>
            <source>J Proteome Res</source>
            <pubdate>2003</pubdate>
            <volume>2</volume>
            <fpage>43</fpage>
            <lpage>50</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/pr025556v</pubid>
                  <pubid idtype="pmpid">12643542</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Gerstein Lab - Supplementary data tables</p>
            </title>
            <url>http://bioinfo.mbb.yale.edu/expression/prot-v-mrna/</url>
         </bibl>
         <bibl id="B34">
            <title>
               <p>MIPS database</p>
            </title>
            <url>http://mips.gsf.de/</url>
         </bibl>
         <bibl id="B35">
            <title>
               <p>MIPS: a database for genomes and protein sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Guldener</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Mannhaupt</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mokrejs</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Morgenstern</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Munsterkotter</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rudd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weil</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>31</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99165</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752246</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.31</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes.</p>
            </title>
            <aug>
               <au>
                  <snm>Baldi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>AD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>509</fpage>
            <lpage>519</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.6.509</pubid>
                  <pubid idtype="pmpid" link="fulltext">11395427</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Genetic network analysis in light of massively parallel biological data acquisition.</p>
            </title>
            <aug>
               <au>
                  <snm>Szallasi</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>1999</pubdate>
            <fpage>5</fpage>
            <lpage>16</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10380181</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>A genome-wide transcriptional analysis of the mitotic cell cycle.</p>
            </title>
            <aug>
               <au>
                  <snm>Cho</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Winzeler</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Steinmetz</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Conway</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wodicka</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Wolfsberg</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Gabrielian</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Landsman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lockhart</snm>
                  <fnm>DJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Mol Cell</source>
            <pubdate>1998</pubdate>
            <volume>2</volume>
            <fpage>65</fpage>
            <lpage>73</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9702192</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Genome-wide analysis of mRNA translation profiles in <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Arava</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Herschlag</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>3889</fpage>
            <lpage>3894</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0635171100</pubid>
                  <pubid idtype="pmpid" link="fulltext">12660367</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>The ubiquitin-proteasome proteolytic pathway: destruction for the sake of construction.</p>
            </title>
            <aug>
               <au>
                  <snm>Glickman</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Ciechanover</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Physiol Rev</source>
            <pubdate>2002</pubdate>
            <volume>82</volume>
            <fpage>373</fpage>
            <lpage>428</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11917093</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Dynamics of protein turnover, a missing dimension in proteomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Pratt</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Petty</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Riba-Garcia</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Robertson</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Gaskell</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Beynon</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <fpage>579</fpage>
            <lpage>591</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.M200046-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12376573</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Genomic and proteomic analysis of the myeloid differentiation program: global analysis of gene expression during induced differentiation in the MPRO cell line.</p>
            </title>
            <aug>
               <au>
                  <snm>Lian</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Kluger</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Greenbaum</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Tuck</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Berliner</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Newburger</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Blood</source>
            <pubdate>2002</pubdate>
            <volume>100</volume>
            <fpage>3209</fpage>
            <lpage>3220</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1182/blood-2002-03-0850</pubid>
                  <pubid idtype="pmpid" link="fulltext">12384419</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Concomitant determination of absolute values of cellular protein amounts, synthesis rates, and turnover rates by quantitative proteome profiling.</p>
            </title>
            <aug>
               <au>
                  <snm>Gerner</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Vejda</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gelbmann</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bayer</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gotzmann</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schulte-Hermann</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Mikulits</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>1</volume>
            <fpage>528</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.M200026-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12239281</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>The transcriptome and its translation during recovery from cell cycle arrest in <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Serikawa</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>XL</fnm>
               </au>
               <au>
                  <snm>MacKay</snm>
                  <fnm>VL</fnm>
               </au>
               <au>
                  <snm>Law</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Zong</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>LP</fnm>
               </au>
               <au>
                  <snm>Bumgarner</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Morris</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Mol Cell Proteomics</source>
            <pubdate>2003</pubdate>
            <volume>2</volume>
            <fpage>191</fpage>
            <lpage>204</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/mcp.D200002-MCP200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12684541</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Codon selection in yeast.</p>
            </title>
            <aug>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>BD</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1982</pubdate>
            <volume>257</volume>
            <fpage>3026</fpage>
            <lpage>3031</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7037777</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications.</p>
            </title>
            <aug>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1987</pubdate>
            <volume>15</volume>
            <fpage>1281</fpage>
            <lpage>1295</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3547335</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models.</p>
            </title>
            <aug>
               <au>
                  <snm>Jansen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>2242</fpage>
            <lpage>2251</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">153734</pubid>
                  <pubid idtype="pmpid" link="fulltext">12682375</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg306</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Relationship of codon bias to mRNA concentration and protein length in <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Coghlan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>KH</fnm>
               </au>
            </aug>
            <source>Yeast</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>1131</fpage>
            <lpage>1145</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1097-0061(20000915)16:12&lt;1131::AID-YEA609>3.0.CO;2-F</pubid>
                  <pubid idtype="pmpid" link="fulltext">10953085</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Relating whole-genome expression data with protein-protein interactions.</p>
            </title>
            <aug>
               <au>
                  <snm>Jansen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Greenbaum</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>37</fpage>
            <lpage>46</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155252</pubid>
                  <pubid idtype="pmpid" link="fulltext">11779829</pubid>
                  <pubid idtype="doi">10.1101/gr.205602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Identification and correction of spurious spatial correlations in microarray data.</p>
            </title>
            <aug>
               <au>
                  <snm>Qian</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kluger</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Biotechniques</source>
            <pubdate>2003</pubdate>
            <volume>35</volume>
            <fpage>42</fpage>
            <lpage>44</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12866403</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Fluorescence two-dimensional difference gel electrophoresis and mass spectrometry based proteomic analysis of <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Yan</snm>
                  <fnm>JX</fnm>
               </au>
               <au>
                  <snm>Devenish</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Wait</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Stone</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fowler</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2002</pubdate>
            <volume>2</volume>
            <fpage>1682</fpage>
            <lpage>1698</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1615-9861(200212)2:12&lt;1682::AID-PROT1682>3.0.CO;2-Y</pubid>
                  <pubid idtype="pmpid" link="fulltext">12469338</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Subcellular localization of the yeast proteome.</p>
            </title>
            <aug>
               <au>
                  <snm>Kumar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Heyman</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Matson</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Heidtman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Piccirillo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Umansky</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Drawid</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>Y</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <fpage>707</fpage>
            <lpage>719</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155358</pubid>
                  <pubid idtype="pmpid" link="fulltext">11914276</pubid>
                  <pubid idtype="doi">10.1101/gad.970902</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
