<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-6-176</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Quantitative analysis of EGR proteins binding to DNA: assessing additivity in both the binding site and the protein</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Liu</snm>
               <fnm>Jiajian</fnm>
               <insr iid="I1"/>
               <email>jjliu@ural.wustl.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Stormo</snm>
               <mi>D</mi>
               <fnm>Gary</fnm>
               <insr iid="I1"/>
               <email>stormo@genetics.wustl.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Genetics, Washington University School of Medicine, 660 S Euclid, Box 8232, St. Louis, MO 63110, U.S.A</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2005</pubdate>
         <volume>6</volume>
         <issue>1</issue>
         <fpage>176</fpage>
         <url>http://www.biomedcentral.com/1471-2105/6/176</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16014175</pubid>
               <pubid idtype="doi">10.1186/1471-2105-6-176</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>13</day>
               <month>4</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>13</day>
               <month>7</month>
               <year>2005</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>13</day>
               <month>7</month>
               <year>2005</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2005</year>
         <collab>Liu and Stormo; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Recognition codes for protein-DNA interactions typically assume that the interacting positions contribute additively to the binding energy. While this is known to not be precisely true, an additive model over the DNA positions can be a good approximation, at least for some proteins. Much less information is available about whether the protein positions contribute additively to the interaction.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Using EGR zinc finger proteins, we measure the binding affinity of six different variants of the protein to each of six different variants of the consensus binding site. Both the protein and binding site variants include single and double mutations that allow us to assess how well additive models can account for the data. For each protein and DNA alone we find that additive models are good approximations, but over the combined set of data there are context effects that limit their accuracy. However, a small modification to the purely additive model, with only three additional parameters, improves the fit significantly.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The additive model holds very well for every DNA site and every protein included in this study, but clear context dependence in the interactions was detected. A simple modification to the independent model provides a better fit to the complete data.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Zinc finger proteins are the largest family of transcription factors in the human genome. The EGR sub-family of C2H2 zinc finger proteins has been extensively studied to determine the basis of DNA-protein binding specificity. The structure of the DNA-protein complex has been determined for the wild-type EGR1 (zif268) protein bound to its consensus site <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp> and for several other variants of the interaction <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. From the structure, the interaction appears very modular with each protein containing several zinc finger domains and each finger interacting with adjacent 3 base-pair (or overlapping 4 base-pair) segments of the binding site. Analysis of binding sites for this family of proteins suggested there were simple rules that relate the sequence of the zinc finger protein to its preferred binding site sequence <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, and that those rules could be used to design proteins with desired specificities <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. Soon after, experimental techniques of <it>in vitro </it>randomization and selection were employed to greatly expand the collection of protein-DNA high affinity interactions <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. Several reviews <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp> have analyzed the protein-DNA crystal structures, summarized the results of the <it>in vitro </it>selection experiments, described rules for predicting high affinity protein-DNA interacting pairs and assessed the success of those rules for designing proteins to recognize particular sequences. 
Most of the recognition rules that have been developed are qualitative, specifying the amino acid and base-pair combinations that are preferred at each position in the binding sites <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Such rules can be used effectively to design proteins with desired preferred binding sites <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
         <p>Despite the success of the qualitative recognition codes for designing proteins with desired preferred binding sites, the utility of such codes is still quite limited. In the collection of known protein-DNA interacting pairs obtained in <it>in vitro </it>selection experiments, more than half of the fingers contain at least one amino acid/base-pair interaction that is not included in the code <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Furthermore, the code predicts only the preferred binding site for each protein sequence, or the preferred protein for each DNA binding site; by its qualitative nature, it does not attempt to predict differences in affinity for similar sequences. Because all of these proteins bind with limited specificity, sites that are very similar to the preferred binding site can often bind with only slightly reduced affinity. Predicting the quantitative binding specificities is therefore important for a comprehensive view of their functions.</p>
         <p>Several quantitative binding models have been developed, either specifically for the zinc finger proteins or for general protein-DNA interactions <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. In many cases such models predict the preferred binding sites as accurately as the qualitative codes, but the overall accuracy of the quantitative predictions is limited, undoubtedly for a combination of reasons. One reason is that there are limited data from which to infer the model parameters using statistical approaches. Another reason is that many of the models are overly simplified, for instance assuming that each amino acid/base-pair contact is independent of any of the surrounding structure. We know, for instance, that the interactions of the protein and DNA are not completely additive <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>, and it is also known that both intermolecular and intramolecular interactions contribute to protein-DNA recognition <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. But it has also been shown that models which are additive over the DNA positions can be reasonably good approximations, at least for some proteins <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. Most studies of additivity have focused on the DNA binding site, testing whether independent models for each base-pair fit the binding data well <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. But equally important to the recognition codes is whether additivity holds within the protein. In one example from the EGR family, the contributions within the protein were shown to be approximately additive (within 0.5 kcal) for one pair of mutated amino acids <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. But very few studies have addressed the issue. 
Even though many variants of EGR family proteins have been used in SELEX and phage-display selection studies (see <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> for a summary), very few of the affinities have been quantified. Bulyk et al <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> did measure the affinity to each of 64 different binding sites for five different proteins, but the proteins were different at too many positions to be useful for determining additivity. One needs to have a set of single mutations and their double mutant combinations in order to determine whether the contributions to binding are independent or not. Several structural studies have highlighted the substantial rearrangements that can occur at the protein-DNA interface and can cause single amino acid or base-pair substitutions to influence the interactions at neighboring positions <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B15">15</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. Such context effects may limit the predictive accuracy of simple recognition codes, although it is also possible that additivity can hold approximately even in the presence of such rearrangements. In the Mnt protein, a single amino acid change can alter the preferred binding site primarily at two adjacent positions, and more weakly over a longer distance <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. Nevertheless, a complete quantitative analysis of the adjacent positions that were primarily affected showed that the interaction was largely additive for a wide variety of amino acid substitutions <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>.</p>
         <p>In this study we analyze the additivity of the interaction in both the DNA binding sites and in the interacting positions of the protein. We measure binding affinities for each of six different proteins, with single and double mutations compared to the wild-type protein, to each of six different DNA sites, also with single and double mutations from the wild-type binding site. We show that for any specific protein or DNA an additive model fits the data quite well. However, there are clear context effects such that no single interaction model fits all of the protein-DNA combinations. But only a small modification to the additive model, with just three additional parameters, improves the fit significantly.</p>
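         <p>As a minimal sketch of what testing additivity with single and double mutants involves, the snippet below converts relative affinities into free-energy changes and compares a double mutant with the sum of its two single mutants. The three input values are the wild-type protein (RE) entries for sites CA, AG and AA in Table 2. The temperature of 298 K and the function names are assumptions made for illustration and are not taken from the article.</p>

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.0     # assumed temperature; not stated in the text

def ddg(rel_affinity, temp_k=TEMP_K):
    """Free-energy change (kcal/mol) relative to the reference site, from a relative Ka."""
    return -R_KCAL * temp_k * math.log(rel_affinity)

def additivity_deviation(single_a, single_b, double_ab, temp_k=TEMP_K):
    """Observed double-mutant ddG minus the sum of the two single-mutant ddGs."""
    return ddg(double_ab, temp_k) - (ddg(single_a, temp_k) + ddg(single_b, temp_k))

# Relative affinities of the wild-type protein (RE) from Table 2
rel_ca, rel_ag, rel_aa = 0.27, 0.15, 0.064

# Under strict additivity the relative affinities of the single mutants multiply
predicted_aa = rel_ca * rel_ag
deviation = additivity_deviation(rel_ca, rel_ag, rel_aa)

print(f"additive prediction for AA: {predicted_aa:.4f}")
print(f"deviation from additivity: {deviation:.2f} kcal/mol")
```

         <p>A deviation near zero indicates that the two substitutions contribute independently; how large a deviation counts as significant depends on the experimental error, which this sketch does not model.</p>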
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <p>Figure <figr fid="F1">1</figr> diagrams the direct interactions between the amino acids of finger 1 of the zif268 protein and the bases of the consensus binding site, as determined by X-ray crystallography <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. To study the additivity of the interaction on the protein side, we constructed wild-type zif268 (referred to as RE) and five mutants with mutations in finger 1. These comprise two single mutants at position -1, in which arginine (R18) was replaced by glutamine (Q) (referred to as QE) or by aspartic acid (D) (referred to as DE); one single mutant at position +3, in which glutamic acid (E21) was replaced by asparagine (N) (referred to as RN); and the two corresponding double mutants (referred to as QN and DN, respectively). The six DNA sites used for this study were chosen primarily based on the qualitative code that represents the correlations between amino acids located at different positions and the DNA bases that they specify <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B15">15</abbr><abbr bid="B34">34</abbr></abbrgrp>. Specifically, the anticipated base specificities for arginine, glutamine and aspartic acid at position -1 are G, A and C, respectively, at position 9 in the DNA sequence. The favorable bases for glutamic acid and asparagine at position +3 are C and A, respectively, at position 8. The oligos used to generate the six DNA sites are shown in Table <tblr tid="T1">1</tblr>. They share a common sequence except for the DNA bases recognized by the amino acids at positions +3 and -1 of finger 1, and are referred to as CG, CA, CC, AG, AA, and AC, respectively. We measured the affinity of each of the six proteins to each of the six DNA sites, and we use these data to analyze the additivity in both the protein and the DNA binding sites.</p>
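         <p>The qualitative code described above can be written as two small lookup tables. The sketch below is illustrative only: it returns the site that the code anticipates for each protein variant, not a measured preference, and the function and variable names are assumptions introduced here.</p>

```python
# Qualitative code from the text: amino acid at a finger-1 position mapped to
# the base it is anticipated to specify.
POS_MINUS1_TO_BASE9 = {"R": "G", "Q": "A", "D": "C"}  # position -1 reads base 9
POS_PLUS3_TO_BASE8 = {"E": "C", "N": "A"}             # position +3 reads base 8

def expected_site(protein):
    """Return the two-letter site name (base 8 then base 9) anticipated for a
    protein named by its residues at positions -1 and +3, e.g. "RE"."""
    aa_minus1, aa_plus3 = protein[0], protein[1]
    return POS_PLUS3_TO_BASE8[aa_plus3] + POS_MINUS1_TO_BASE9[aa_minus1]

proteins = ["RE", "QE", "DE", "RN", "QN", "DN"]
predictions = {p: expected_site(p) for p in proteins}
print(predictions)
# {'RE': 'CG', 'QE': 'CA', 'DE': 'CC', 'RN': 'AG', 'QN': 'AA', 'DN': 'AC'}
```

         <p>Each of the six proteins maps to a distinct one of the six sites, which is why this set of sites pairs naturally with this set of proteins.</p>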
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Amino acid-base contacts observed in co-crystal structures</p>
            </caption>
            <text>
               <p>Amino acid-base contacts observed in co-crystal structures. The amino acid residues at -1, +2, +3, and +6 for zif268 are R, D, E and R, while the DNA bases at positions 7, 8, 9 and 10 for the wild-type operator of zif268 are G, C, G and T.</p>
            </text>
            <graphic file="1471-2105-6-176-1"/>
         </fig>
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Oligos used in this study. I: Synthesized DNA templates bearing either the wild-type binding site for zif268 (Zif_1) or one of its variants (Zif_2 to Zif_6), used for generating DNA binding sites by PCR amplification, where KS-1 and SK-1 are the two primers (lower case). II: Oligos employed to construct five zif268 variants with the QuickChange&#8482; XL site-directed mutagenesis Kit (Stratagene) using pzif268 as a template.</p>
            </caption>
            <tblbdy cols="2">
               <r>
                  <c ca="center">
                     <p>I</p>
                  </c>
                  <c ca="left">
                     <p>Zif_1 tcgaggtcgacggtatcGCGTGGGCGCtccactagttctagagcggccgccac</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Zif_2 tcgaggtcgacggtatcGCGTGGGCACtccactagttctagagcggccgccac</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Zif_3 tcgaggtcgacggtatcGCGTGGGCCCtccactagttctagagcggccgccac</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Zif_4 tcgaggtcgacggtatcGCGTGGGAGCtccactagttctagagcggccgccac</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Zif_5 tcgaggtcgacggtatcGCGTGGGAACtccactagttctagagcggccgccac</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>Zif_6 tcgaggtcgacggtatcGCGTGGGACCtccactagttctagagcggccgccac</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>KS-1 tcgaggtcgacggtatc</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SK*-1 gtggcggccgctctagaact (SK-1 was fluorescent labeled with either FAM, HEX, TAMRA, ROX, or CY5)</p>
                  </c>
               </r>
               <r>
                  <c cspan="2">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>II</p>
                  </c>
                  <c ca="left">
                     <p>18Q_plus 5' CGCCGCTTTTCTcagTCGGATGAGCTTACCCGCC</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18Q_minus 5' GGCGGGTAAGCTCATCCGActgAGAAAAGCGGCG</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18D_plus 5' CGCCGCTTTTCTgatTCGGATGAGCTTACCCGCC</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18D_minus 5' GGCGGGTAAGCTCATCCGAatcAGAAAAGCGGCG</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>21N_plus 5' CGCCGCTTTTCTCGCTCGGATaacCTTACCCGCC</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>21N_minus 5' GGCGGGTAAGgttATCCGAGCGAGAAAAGCGGCG</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18Q_21N_plus 5' CGCCGCTTTTCTcagTCGGATaacCTTACCCGCC</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18Q_21N_minus 5' GGCGGGTAAGgttATCCGActgAGAAAAGCGGCG</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18D_21N_plus 5' CGCCGCTTTTCTgatTCGGATaacCTTACCCGCC</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>18D_21N_minus 5' GGCGGGTAAGgttATCCGAatcAGAAAAGCGGCG</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <p>For each protein we determined the relative affinity of each different binding site compared to the wild-type site (CG) using the QuMFRA assay (Table <tblr tid="T2">2</tblr>). For the wild-type protein, the relative affinities of CA, CC, and AG to the reference site CG in this study are 0.27, 0.082 and 0.15, respectively. These data are in good agreement with the relative affinities previously determined by Miller and Pabo (0.21, 0.11 and 0.20, respectively <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>). Table <tblr tid="T2">2</tblr> shows that only the wild-type protein (RE) binds preferentially to the wild-type binding site (CG), whereas all of the other proteins prefer a different binding site sequence. The range of affinities varies considerably between the different proteins. RE has about a 25-fold difference between the highest and lowest sites, while QE varies by only about 2-fold between the highest and lowest. We also measured the absolute binding affinity of each protein to one of the DNA binding sites with a Scatchard analysis (Table <tblr tid="T3">3</tblr>). The K<sub>d </sub>for wild-type zif268 binding to the DNA site CC is 3.0 &#215; 10<sup>-8 </sup>M, which, given the relative affinity of CC to CG (0.082), converts to a K<sub>d </sub>for the wild-type binding site CG of 2.5 &#215; 10<sup>-9 </sup>M. This value is almost the same as that determined by Hamilton et al (2.2 &#215; 10<sup>-9 </sup>M) <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> (previously reported values for this K<sub>d </sub>range from 0.04 to 6.5 nM, depending on the binding condition used <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>). No similar data exist for the other proteins in our collection. Combining the data from Tables <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr>, we derive the association constants of each protein for each different DNA sequence, which differ by over 300-fold between the highest and lowest affinities (Table <tblr tid="T4">4</tblr>).</p>
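         <p>The conversion from one Scatchard measurement per protein to a full set of absolute constants is simple arithmetic, sketched below using the published values from Tables 2 and 3. The function and variable names are assumptions made for illustration. Because the published inputs are rounded, the results can differ slightly from Table 4 (for example, about 402 rather than 406 for RE on CG), presumably because the original calculation used unrounded values.</p>

```python
# Scatchard-derived Ka (in units of 1e6 per molar, Table 3) for one site per
# protein, and that site's affinity relative to CG (Table 2).
ANCHORS = {
    "RE": ("CC", 33.0, 0.082),
    "QE": ("CC", 6.4, 2.17),
    "DE": ("CC", 4.7, 1.91),
    "RN": ("CC", 33.0, 0.41),
    "QN": ("AG", 33.0, 4.45),
    "DN": ("AG", 17.0, 14.5),
}

def ka_reference(ka_site, rel_site):
    """Ka of the reference site CG from the Ka and relative affinity of another site."""
    return ka_site / rel_site

def kd_molar(ka_in_1e6_per_molar):
    """Dissociation constant in molar from a Ka given in units of 1e6 per molar."""
    return 1.0 / (ka_in_1e6_per_molar * 1e6)

# Ka of each protein for the reference site CG
ka_cg = {prot: ka_reference(ka, rel) for prot, (_site, ka, rel) in ANCHORS.items()}
kd_cg_re = kd_molar(ka_cg["RE"])

# Any other entry follows by scaling, e.g. RE on site CA (0.27 in Table 2)
ka_re_ca = ka_cg["RE"] * 0.27

for prot, ka in ka_cg.items():
    print(f"{prot}: Ka(CG) = {ka:.1f} x 1e6 per molar")
print(f"RE: Kd(CG) = {kd_cg_re:.2e} M")
```

         <p>Dividing by the relative affinity recovers the reference-site constant, and multiplying that constant by any other relative affinity from Table 2 gives the corresponding absolute value.</p>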
         <tbl id="T2">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>Relative binding constants of six DNA binding sites for wild-type zif268 and its five derivatives, with the wild-type operator of zif268 used as the reference. Each value was obtained from 5 or more independent determinations; standard deviations are given in parentheses.</p>
            </caption>
            <tblbdy cols="7">
               <r>
                  <c ca="center">
                     <p>DNA\Prot</p>
                  </c>
                  <c ca="center">
                     <p>RE(wt)</p>
                  </c>
                  <c ca="center">
                     <p>QE</p>
                  </c>
                  <c ca="center">
                     <p>DE</p>
                  </c>
                  <c ca="center">
                     <p>RN</p>
                  </c>
                  <c ca="center">
                     <p>QN</p>
                  </c>
                  <c ca="center">
                     <p>DN</p>
                  </c>
               </r>
               <r>
                  <c cspan="7">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CG(wt)</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CA</p>
                  </c>
                  <c ca="center">
                     <p>0.27(0.06)</p>
                  </c>
                  <c ca="center">
                     <p>1.50(0.54)</p>
                  </c>
                  <c ca="center">
                     <p>1.16(0.49)</p>
                  </c>
                  <c ca="center">
                     <p>0.36(0.14)</p>
                  </c>
                  <c ca="center">
                     <p>0.49(0.19)</p>
                  </c>
                  <c ca="center">
                     <p>1.21(0.33)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CC</p>
                  </c>
                  <c ca="center">
                     <p>0.082(0.076)</p>
                  </c>
                  <c ca="center">
                     <p>2.17(0.91)</p>
                  </c>
                  <c ca="center">
                     <p>1.91(0.83)</p>
                  </c>
                  <c ca="center">
                     <p>0.41(0.23)</p>
                  </c>
                  <c ca="center">
                     <p>0.53(0.36)</p>
                  </c>
                  <c ca="center">
                     <p>2.61(0.59)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AG</p>
                  </c>
                  <c ca="center">
                     <p>0.15(0.10)</p>
                  </c>
                  <c ca="center">
                     <p>1.30(0.34)</p>
                  </c>
                  <c ca="center">
                     <p>1.48(0.56)</p>
                  </c>
                  <c ca="center">
                     <p>1.29(0.28)</p>
                  </c>
                  <c ca="center">
                     <p>4.45(2.64)</p>
                  </c>
                  <c ca="center">
                     <p>14.5(5.18)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AA</p>
                  </c>
                  <c ca="center">
                     <p>0.064(0.017)</p>
                  </c>
                  <c ca="center">
                     <p>1.36(0.48)</p>
                  </c>
                  <c ca="center">
                     <p>2.25(1.30)</p>
                  </c>
                  <c ca="center">
                     <p>0.68(0.28)</p>
                  </c>
                  <c ca="center">
                     <p>2.47(1.34)</p>
                  </c>
                  <c ca="center">
                     <p>4.02(1.56)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AC</p>
                  </c>
                  <c ca="center">
                     <p>0.041(0.045)</p>
                  </c>
                  <c ca="center">
                     <p>1.93(1.01)</p>
                  </c>
                  <c ca="center">
                     <p>3.08(0.45)</p>
                  </c>
                  <c ca="center">
                     <p>0.94(0.26)</p>
                  </c>
                  <c ca="center">
                     <p>2.78(0.80)</p>
                  </c>
                  <c ca="center">
                     <p>11.8(4.44)</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T3">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Experimentally determined association constants (10<sup>6</sup>M<sup>-1</sup>) for the indicated DNA binding site binding to its corresponding protein. Each value is the mean of 5 or more independent determinations; standard deviations are shown in parentheses.</p>
            </caption>
            <tblbdy cols="7">
               <r>
                  <c ca="center">
                     <p>DNA\Prot</p>
                  </c>
                  <c ca="center">
                     <p>RE(wt)</p>
                  </c>
                  <c ca="center">
                     <p>QE</p>
                  </c>
                  <c ca="center">
                     <p>DE</p>
                  </c>
                  <c ca="center">
                     <p>RN</p>
                  </c>
                  <c ca="center">
                     <p>QN</p>
                  </c>
                  <c ca="center">
                     <p>DN</p>
                  </c>
               </r>
               <r>
                  <c cspan="7">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CC</p>
                  </c>
                  <c ca="center">
                     <p>33(7)</p>
                  </c>
                  <c ca="center">
                     <p>6.4(1.7)</p>
                  </c>
                  <c ca="center">
                     <p>4.7(2.6)</p>
                  </c>
                  <c ca="center">
                     <p>33(14)</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AG</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="center">
                     <p>33(18)</p>
                  </c>
                  <c ca="center">
                     <p>17(6)</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T4">
            <title>
               <p>Table 4</p>
            </title>
            <caption>
               <p>Absolute <it>K</it><sub><it>a</it></sub>(10<sup>6</sup>M<sup>-1</sup>) for six DNA binding sites and six variants of zif268, derived from the combination of Table 2 and Table 3.</p>
            </caption>
            <tblbdy cols="7">
               <r>
                  <c ca="center">
                     <p>DNA\Prot</p>
                  </c>
                  <c ca="center">
                     <p>RE</p>
                  </c>
                  <c ca="center">
                     <p>QE</p>
                  </c>
                  <c ca="center">
                     <p>DE</p>
                  </c>
                  <c ca="center">
                     <p>RN</p>
                  </c>
                  <c ca="center">
                     <p>QN</p>
                  </c>
                  <c ca="center">
                     <p>DN</p>
                  </c>
               </r>
               <r>
                  <c cspan="7">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CG</p>
                  </c>
                  <c ca="center">
                     <p>406</p>
                  </c>
                  <c ca="center">
                     <p>3.0</p>
                  </c>
                  <c ca="center">
                     <p>2.5</p>
                  </c>
                  <c ca="center">
                     <p>81</p>
                  </c>
                  <c ca="center">
                     <p>7.4</p>
                  </c>
                  <c ca="center">
                     <p>1.2</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CA</p>
                  </c>
                  <c ca="center">
                     <p>109</p>
                  </c>
                  <c ca="center">
                     <p>4.5</p>
                  </c>
                  <c ca="center">
                     <p>2.8</p>
                  </c>
                  <c ca="center">
                     <p>30</p>
                  </c>
                  <c ca="center">
                     <p>3.5</p>
                  </c>
                  <c ca="center">
                     <p>1.4</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>CC</p>
                  </c>
                  <c ca="center">
                     <p>33</p>
                  </c>
                  <c ca="center">
                     <p>6.4</p>
                  </c>
                  <c ca="center">
                     <p>4.7</p>
                  </c>
                  <c ca="center">
                     <p>33</p>
                  </c>
                  <c ca="center">
                     <p>3.9</p>
                  </c>
                  <c ca="center">
                     <p>3.1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AG</p>
                  </c>
                  <c ca="center">
                     <p>63</p>
                  </c>
                  <c ca="center">
                     <p>3.9</p>
                  </c>
                  <c ca="center">
                     <p>3.6</p>
                  </c>
                  <c ca="center">
                     <p>105</p>
                  </c>
                  <c ca="center">
                     <p>33</p>
                  </c>
                  <c ca="center">
                     <p>17</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AA</p>
                  </c>
                  <c ca="center">
                     <p>26</p>
                  </c>
                  <c ca="center">
                     <p>4.0</p>
                  </c>
                  <c ca="center">
                     <p>3.7</p>
                  </c>
                  <c ca="center">
                     <p>56</p>
                  </c>
                  <c ca="center">
                     <p>18</p>
                  </c>
                  <c ca="center">
                     <p>4.8</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>AC</p>
                  </c>
                  <c ca="center">
                     <p>16</p>
                  </c>
                  <c ca="center">
                     <p>5.7</p>
                  </c>
                  <c ca="center">
                     <p>5.5</p>
                  </c>
                  <c ca="center">
                     <p>77</p>
                  </c>
                  <c ca="center">
                     <p>21</p>
                  </c>
                  <c ca="center">
                     <p>14</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <p>From the binding data we can assess the additivity of the interaction for both the protein and the DNA. In a perfectly additive interaction the binding energy for each sequence would be the sum of the independent contributions at each position. For example, for any protein <it>j</it>, the binding energy to any DNA sequence <it>XY </it>would be the sum of the interactions with base <it>X </it>and base <it>Y</it>:</p>
         <p>&#916;<it>G</it><sub><it>j</it></sub>(<it>X</it><sub>8</sub><it>Y</it><sub>9</sub>) = &#916;<it>G</it><sub><it>j</it></sub>(<it>X</it><sub>8</sub>) + &#916;<it>G</it><sub><it>j</it></sub>(<it>Y</it><sub>9</sub>). &#160;&#160;&#160; (1)</p>
         <p>The important assumption of the additive model is that the interaction energy at position 8, for example, does not depend on which base occurs at position 9. We do not expect additivity to hold precisely <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>, but it can be a very good approximation, at least for some proteins <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B29">29</abbr></abbrgrp>. Previously, studies of additivity have focused on whether the positions in the DNA binding site contribute independently to the binding of a particular protein. Using the data of Table <tblr tid="T4">4</tblr> we can also determine whether the positions in the protein contribute additively to the binding of a particular DNA site. That is, we can reverse the symbols of equation 1 to refer to the binding of a particular DNA sequence, <it>i</it>, to a protein sequence <it>UV</it>:</p>
         <p>&#916;<it>G</it><sub><it>i</it></sub>(<it>U</it><sub>-1</sub><it>V</it><sub>3</sub>) = &#916;<it>G</it><sub><it>i</it></sub>(<it>U</it><sub>-1</sub>) + &#916;<it>G</it><sub><it>i</it></sub>(<it>V</it><sub>3</sub>). &#160;&#160;&#160; (2)</p>
         <p>Of course, we have not measured affinities to all possible DNA sequences or for all possible protein sequences, but because we have both single and double mutants in both the protein and the DNA, and have measured the binding affinities of all combinations, we can determine how well additivity holds on both sides, the DNA and the protein, at least for this limited set of variants.</p>
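         <p>To make equations 1 and 2 concrete, the following minimal sketch computes a double-mutant-cycle log-ratio from the association constants in Table <tblr tid="T4">4</tblr>. If two positions were perfectly additive, changing one of them would have the same effect on ln <it>K</it><sub><it>a </it></sub>whatever residue occupied the other, so the log-ratio would be zero. The function names, the default residue choices and the use of RT units are assumptions made for illustration only; this is not the best fit procedure used for the analysis.</p>
```python
import math

# Association constants K_a (10^6 M^-1) copied from Table 4.
# Outer key: protein (residues at positions -1 and +3).
# Inner key: DNA site (bases at positions 8 and 9).
KA = {
    "RE": {"CG": 406, "CA": 109, "CC": 33, "AG": 63, "AA": 26, "AC": 16},
    "QE": {"CG": 3.0, "CA": 4.5, "CC": 6.4, "AG": 3.9, "AA": 4.0, "AC": 5.7},
    "DE": {"CG": 2.5, "CA": 2.8, "CC": 4.7, "AG": 3.6, "AA": 3.7, "AC": 5.5},
    "RN": {"CG": 81, "CA": 30, "CC": 33, "AG": 105, "AA": 56, "AC": 77},
    "QN": {"CG": 7.4, "CA": 3.5, "CC": 3.9, "AG": 33, "AA": 18, "AC": 21},
    "DN": {"CG": 1.2, "CA": 1.4, "CC": 3.1, "AG": 17, "AA": 4.8, "AC": 14},
}


def dna_coupling(protein, y1="G", y2="A"):
    """Log-ratio (units of RT) coupling DNA positions 8 and 9 for one protein.

    Equation 1 implies ln K(C,y1) - ln K(C,y2) = ln K(A,y1) - ln K(A,y2),
    so the returned value is zero for a perfectly additive interaction.
    """
    k = KA[protein]
    return math.log(k["C" + y1] * k["A" + y2] / (k["C" + y2] * k["A" + y1]))


def protein_coupling(site, u1="R", u2="Q"):
    """Log-ratio (units of RT) coupling protein positions -1 and +3 for one DNA site.

    Equation 2 implies the same cycle closure on the protein side.
    """
    return math.log(KA[u1 + "E"][site] * KA[u2 + "N"][site]
                    / (KA[u1 + "N"][site] * KA[u2 + "E"][site]))


for protein in KA:
    print(protein, round(dna_coupling(protein), 2))
```
         <p>Each such cycle treats all four complexes equally. It is therefore not directly comparable to a measure that weights each complex by its contribution to the bound population, and a large value for one cycle does not by itself imply a poor overall additive fit.</p>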
         <p>We cannot actually measure the binding affinities to single positions because they always occur in some context. But we can find the "best fit" values for the independent interactions, and then determine how well the total data fit the additive model using those values. One method to obtain the best fit independent parameters is to apply multiple linear regression to the total data <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. However, we have argued previously <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> that a better criterion is to minimize the difference in total free energy between the observed data and the model.</p>
         <p>
            <graphic file="1471-2105-6-176-i1.gif"/>
         </p>
         <p>The <graphic file="1471-2105-6-176-i2.gif"/> and <graphic file="1471-2105-6-176-i3.gif"/> values are those obtained as the best fit parameters (those which minimize <it>M</it>) for each position assuming independence. The <it>&#969; </it>refers to either the protein or the DNA, and <it>&#945;,&#946; </it>refer to the residues at the two interacting positions. The first term inside the sum represents the probability that each particular residue sequence will be bound, and so weights the energy differences by their contribution to the total free energy of the system. As can be seen in the last form of the equation, <it>M </it>is the "mutual information" between the positions, the amount of total information content in the data that cannot be explained by the best independent model. We use log<sub>2 </sub>so that the mutual information is measured in bits.</p>
         <p>Given the best fit independent parameters we can calculate the specificity information, <it>I</it><sub><it>spec</it></sub>, of each position independently <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. For example, the specificity information for the protein or DNA <it>&#969; </it>at the first interacting position is</p>
         <p>
            <graphic file="1471-2105-6-176-i4.gif"/>
         </p>
         <p><it>I</it><sub><it>spec </it></sub>measures the amount of specificity in the interaction in bits; any non-specific protein or DNA would have <it>I</it><sub><it>spec </it></sub>= 0. Figure <figr fid="F2">2</figr> shows sequence logos <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> for each of the six proteins and the six DNA sequences for which we have measured the affinity. To each logo we have added the symbol "M", which shows the amount of mutual information in that interaction <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B27">27</abbr><abbr bid="B30">30</abbr></abbrgrp>, that is, the amount of total free energy, or specificity information, that is not captured by the best fit additive model. Half of the total mutual information is displayed above each position.</p>
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Sequence logos for each of six zinc finger proteins and the six DNA sites for which we have measured the affinity</p>
            </caption>
            <text>
               <p>Sequence logos for each of six zinc finger proteins and the six DNA sites for which we have measured the affinity. M in each logo is the mutual information content in each interaction. The label at the top of each logo represents the DNA site (for the top two rows) or the protein (bottom two rows). The amino acid order is reversed so that they are lined up with the bases they contact. For example, the logo labeled "ER" shows the specificity for the RE (wild type) protein. In the lower six panels the maximum value on the y-axis is 0.5 bits.</p>
            </text>
            <graphic file="1471-2105-6-176-2"/>
         </fig>
         <p>Several interesting results are evident in Figure <figr fid="F2">2</figr>. As stated above, the proteins vary considerably in their specificity, with RE (shown as "ER" in the figure) showing large discrimination between the different DNA sites, whereas QE and DE are fairly non-specific. The same holds for the different DNA sites, where CG is much more specific than CC or AC. It is interesting that every DNA site prefers R at position -1 of the protein, showing that R contributes to the total affinity of each protein as well as to the specificity of some proteins. The small degree of mutual information, the "M" in each logo, means that every interaction fits well with an additive model. Not only do the DNA positions contribute very additively, as has been shown previously for this family of proteins <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, but the contributions of the amino acids in the protein are also largely additive. The conclusion that additive models are good approximations to the true data holds for every DNA site and every protein included in the analysis. However, it is also true that there is not a single set of additive parameters that fits well for every case. This is consistent with the context effects previously noted for this family <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B34">34</abbr></abbrgrp>. For example, R prefers to bind to G over A or C, but the magnitude of that preference is much larger if position +3 is an E instead of an N. And an N at position +3 always prefers an A over C in the binding site, but that preference is much weaker with an R at position -1 than with a Q or D. Similarly, E at position +3 prefers a C very strongly in the context of an R, but is quite non-specific with either a Q or D at position -1. Similar effects, but of smaller magnitude, can be seen in the context effects of the DNA sites. 
These results show that additive models can be good approximations not only for the DNA sites in binding to any particular protein as has been seen before <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, but also for the proteins in binding to any particular DNA site. But the results also show that additivity for specific proteins and DNA sites is not sufficient to generate a general recognition code because context effects can still be important when both the DNA and protein can be variable. The small amounts of mutual information observed for any specific protein or DNA can be reinforced to give much larger amounts when measured over combinations of both components.</p>
         <p>To get a more detailed view of the dependencies in the data, it is useful to reformat them as in Figure <figr fid="F3">3A</figr>. Those data are the same as in Table <tblr tid="T4">4</tblr> except that they have been normalized to a sum of 1000. In an experiment where every protein and DNA were equally available for binding, the elements in the table would be 1000 times the probability of picking that particular combination from all of those in the bound state. The data are arranged in a four-dimensional (4D) table, with one dimension for each of the two positions in the protein and the two positions in the DNA. For example, the 335 at the RE-CG element of the table corresponds to the wild-type association constant of 406 from Table <tblr tid="T4">4</tblr> after normalization. From the data in Figure <figr fid="F3">3A</figr> it is easy to obtain different lower dimensional views by summing over the other dimensions. For example, Figure <figr fid="F3">3B</figr> shows the 2D view of the interaction of the amino acid at position -1 with the base-pair at position 9 obtained by summing over all of the combinations of E, N at protein position +3 and C, A at binding site position 8 (inside the bold lines of Figure <figr fid="F3">3A</figr>). Similarly, Figure <figr fid="F3">3C</figr> shows a 2D view of the interaction between the amino acid at position +3 and the binding site position 8. Those two 2D views are orthogonal and together cover the 4D space of Figure <figr fid="F3">3A</figr>. We also show the remaining 2D views in Figures <figr fid="F3">3D&#8211;G</figr>. The pairs in Figure <figr fid="F3">3D,E</figr> and <figr fid="F3">3F,G</figr> are also orthogonal and together cover the 4D space of the data. If the binding interaction were completely additive, the true data of 3A could be calculated as the (renormalized) outer product of any pair of orthogonal matrices. Such predictions are not too bad, but demonstrate limitations of the additive model (see below).</p>
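         <p>The normalization and the lower dimensional views described above can be sketched as follows. This is a minimal illustration that starts from the rounded values of Table <tblr tid="T4">4</tblr>; the names <it>NORM</it>, <it>marginal</it>, <it>view_m1_9 </it>and <it>view_p3_8 </it>are hypothetical, and entries may differ slightly from the figure because of rounding.</p>
```python
from collections import defaultdict

# Association constants K_a (10^6 M^-1) copied from Table 4.
KA = {
    "RE": {"CG": 406, "CA": 109, "CC": 33, "AG": 63, "AA": 26, "AC": 16},
    "QE": {"CG": 3.0, "CA": 4.5, "CC": 6.4, "AG": 3.9, "AA": 4.0, "AC": 5.7},
    "DE": {"CG": 2.5, "CA": 2.8, "CC": 4.7, "AG": 3.6, "AA": 3.7, "AC": 5.5},
    "RN": {"CG": 81, "CA": 30, "CC": 33, "AG": 105, "AA": 56, "AC": 77},
    "QN": {"CG": 7.4, "CA": 3.5, "CC": 3.9, "AG": 33, "AA": 18, "AC": 21},
    "DN": {"CG": 1.2, "CA": 1.4, "CC": 3.1, "AG": 17, "AA": 4.8, "AC": 14},
}

total = sum(k for sites in KA.values() for k in sites.values())

# 4D table scaled to a sum of 1000.
# Key order: (residue at -1, residue at +3, base at 8, base at 9).
NORM = {
    (prot[0], prot[1], site[0], site[1]): 1000.0 * k / total
    for prot, sites in KA.items()
    for site, k in sites.items()
}


def marginal(table, axes):
    """Sum a 4D table over every axis that is not listed in axes."""
    out = defaultdict(float)
    for key, value in table.items():
        out[tuple(key[a] for a in axes)] += value
    return dict(out)


view_m1_9 = marginal(NORM, (0, 3))  # position -1 with base 9 (cf. Figure 3B)
view_p3_8 = marginal(NORM, (1, 2))  # position +3 with base 8 (cf. Figure 3C)

print(round(NORM[("R", "E", "C", "G")]))  # 335, the RE-CG element quoted in the text
for key in sorted(view_p3_8):
    print(key, round(view_p3_8[key]))
```
         <p>Any other 2D view is obtained by passing a different pair of axis indices to <it>marginal</it>.</p>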
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>DNA binding specificities for six DNA sites for zif268 and its five derivatives</p>
            </caption>
            <text>
               <p>DNA binding specificities for six DNA sites for zif268 and its five derivatives. <b>A: </b>four-dimensional table representing binding specificities for all DNA sites and zinc finger proteins in this study. It is converted from Table 4 by normalization to a sum of 1000; <b>B: </b>2D table of combinations for the interaction of the amino acid at position -1 with the base-pair at position 9; <b>C: </b>2D table of combinations for the interaction of the amino acid at position +3 with the binding site position 8; <b>D: </b>2D table of combinations between amino acids at position -1 and +3; <b>E: </b>2D table of combinations between DNA bases at position 8 and 9; <b>F: </b>2D table of combinations between amino acid position 3 and base position 9; <b>G: </b>2D table of combinations between amino acid position -1 and base position 8.</p>
            </text>
            <graphic file="1471-2105-6-176-3"/>
         </fig>
         <p>Because the data in Figure <figr fid="F3">3</figr> are in probabilities (if divided by 1000), the specificity information can be calculated more easily than in equation (4):</p>
         <p><it>I</it><sub><it>spec</it></sub>(<it>&#945;</it>) = log<sub>2</sub><it>N</it><sub><it>&#945; </it></sub>- <it>H</it><sub><it>&#945; </it></sub>&#160;&#160;&#160; (5)</p>
         <p>where <it>&#945; </it>is any of the positions or combination of positions, <it>H</it><sub><it>&#945; </it></sub>is the Shannon entropy of the data at those positions and <it>N</it><sub><it>&#945; </it></sub>is the number of entries in the data. For example, position -1 of the protein has three entries, R, Q and D, with overall probabilities of 0.852, 0.093 and 0.054, respectively, which gives <it>I</it><sub><it>spec</it></sub>(- 1) = 0.84 bits. The upper half of Table <tblr tid="T5">5</tblr> shows the specificity information for each of the positions (along the diagonal) as well as the specificity information for each of the pairs of positions (from the data shown in Figure <figr fid="F3">3</figr>). If the two positions contribute independently to the total specificity then the information for the paired positions is just the sum of the information at each position. More generally, the mutual information between the positions is the amount of information in the pair that exceeds the sum of the individual positions:</p>
         <tbl id="T5">
            <title>
               <p>Table 5</p>
            </title>
            <caption>
               <p>Information for the position dependence. The diagonal is the specificity information for each of positions -1, 3, 8, and 9. The upper half of the matrix is the specificity information for each of the pairs of positions, and the lower half is the mutual information between pairs of positions.</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="center">
                     <p>
                        <b>Position</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>-1</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>3</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>8</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>9</b>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <b>-1</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>0.84</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.91</p>
                  </c>
                  <c ca="center">
                     <p>0.94</p>
                  </c>
                  <c ca="center">
                     <p>1.09</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <b>3</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.05</p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>0.02</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.24</p>
                  </c>
                  <c ca="center">
                     <p>0.28</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <b>8</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.06</p>
                  </c>
                  <c ca="center">
                     <p>0.19</p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>0.03</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.29</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <b>9</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.02</p>
                  </c>
                  <c ca="center">
                     <p>0.04</p>
                  </c>
                  <c ca="center">
                     <p>0.04</p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>0.22</b>
                     </p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <p><it>M</it>(<it>&#945;,&#946;</it>) = <it>I</it><sub><it>spec</it></sub>(<it>&#945;,&#946;</it>) - (<it>I</it><sub><it>spec</it></sub>(<it>&#945;</it>) + <it>I</it><sub><it>spec</it></sub>(<it>&#946;</it>)) &#160;&#160;&#160; (6)</p>
         <p>Those values are shown in the lower half of Table <tblr tid="T5">5</tblr>. From the standard model of interaction between the DNA and protein we would expect there to be very little mutual information for any of the 2D datasets of Figure <figr fid="F3">3D&#8211;G</figr>, and that expectation is met. But we do expect high mutual information for the datasets in Figure <figr fid="F3">3B</figr> and <figr fid="F3">3C</figr> because those are the interacting positions. Just as we get high mutual information for positions that interact in RNA structures <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>, we expect to see compensating changes between the amino acids and base-pairs that interact. That expectation is met for the combination of protein position +3 and base-pair position 8 (Figure <figr fid="F3">3C</figr>) where there is a clear preference for E binding to C and for N binding to A. In that case the mutual information is 0.19 bits, which is the main contribution to the total information of that pair, 0.24 bits. However, protein position -1 and base-pair position 9 also interact but show little mutual information because R is the preferred amino acid for each different DNA sequence and G is the preferred base-pair for each different protein. That pair has high specificity information, 1.09 bits, but it is very additive with only 0.02 bits of mutual information.</p>
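         <p>Equations 5 and 6 can be applied directly to the normalized binding data. The sketch below is a minimal illustration that recomputes the probabilities from the rounded association constants of Table <tblr tid="T4">4</tblr>; the names <it>i_spec </it>and <it>mutual_info </it>are hypothetical. With these inputs it gives about 0.84 bits for position -1, and the other values agree with Table <tblr tid="T5">5</tblr> to within roughly 0.01 to 0.02 bits, the differences presumably arising from rounding of the input values.</p>
```python
import math
from collections import defaultdict

# Association constants K_a (10^6 M^-1) copied from Table 4.
KA = {
    "RE": {"CG": 406, "CA": 109, "CC": 33, "AG": 63, "AA": 26, "AC": 16},
    "QE": {"CG": 3.0, "CA": 4.5, "CC": 6.4, "AG": 3.9, "AA": 4.0, "AC": 5.7},
    "DE": {"CG": 2.5, "CA": 2.8, "CC": 4.7, "AG": 3.6, "AA": 3.7, "AC": 5.5},
    "RN": {"CG": 81, "CA": 30, "CC": 33, "AG": 105, "AA": 56, "AC": 77},
    "QN": {"CG": 7.4, "CA": 3.5, "CC": 3.9, "AG": 33, "AA": 18, "AC": 21},
    "DN": {"CG": 1.2, "CA": 1.4, "CC": 3.1, "AG": 17, "AA": 4.8, "AC": 14},
}

total = sum(k for sites in KA.values() for k in sites.values())

# Probabilities keyed by (residue at -1, residue at +3, base at 8, base at 9).
P = {
    (prot[0], prot[1], site[0], site[1]): k / total
    for prot, sites in KA.items()
    for site, k in sites.items()
}


def marginal(axes):
    """Probability table for the chosen axes, summed over all other axes."""
    out = defaultdict(float)
    for key, value in P.items():
        out[tuple(key[a] for a in axes)] += value
    return dict(out)


def i_spec(axes):
    """Equation 5: log2 of the number of entries minus the Shannon entropy (bits)."""
    probs = list(marginal(axes).values())
    entropy = -sum(p * math.log2(p) for p in probs)
    return math.log2(len(probs)) - entropy


def mutual_info(a, b):
    """Equation 6: pair information minus the sum of the single-position values."""
    return i_spec((a, b)) - (i_spec((a,)) + i_spec((b,)))


NAMES = {0: "-1", 1: "+3", 2: "8", 3: "9"}
for axis, name in NAMES.items():
    print("I_spec(" + name + ") =", round(i_spec((axis,)), 2))
print("M(+3,8) =", round(mutual_info(1, 2), 2))
print("M(-1,9) =", round(mutual_info(0, 3), 2))
```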
         <p>The total specificity information in the complete data of Figure <figr fid="F3">3A</figr> is 1.46 bits. The sum of the information for the interacting pairs, -1,9 and 3,8, is 1.33 bits, which shows that the complete specificity is reasonably well fit by assuming independent contributions from those interacting positions, as in most recognition code models <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. If one predicts the complete data of Figure <figr fid="F3">3A</figr> as the outer-product of the matrices of Figure <figr fid="F3">3B</figr> and <figr fid="F3">3C</figr> (not shown), the correlation coefficient between the observed and predicted binding energies is 0.87 (Model 1 of Figure <figr fid="F5">5</figr>), similar to what had been observed previously for data in which only the DNA site had been varied <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. While that result is reasonably good overall, examination of the complete data in Figure <figr fid="F3">3A</figr> identifies one clear source of context dependence between the interacting positions. When protein position -1 is R and the base-pair at position 9 is either G or A, there is a clear preference for the specific combination of E with C and a weak preference for N with A. But for all other combinations of positions -1 and 9, there is a strong preference for N with A, but very little preference for E. That is, the preference of E for C depends on the R with G or A combination being adjacent. In the structure of zif268 with the wild-type DNA there is no hydrogen bond between the position +3 E and the C base-pair, but rather it interacts with the backbone and with the neighboring R amino acid <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B1">1</abbr></abbrgrp>. Various qualitative codes for the interactions of this protein family do not include E as an acceptable amino acid at position +3 <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B15">15</abbr></abbrgrp>. 
But in the compilation of SELEX and phage-display results used by Benos <it>et al </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, the combination of RE-CG was much more frequent than expected from the individual or pair occurrences (p-value less than 0.001). That is consistent with our result that in general E contributes little to the specificity of the binding site at position 8 except in the case where the adjacent interaction is R with G or A. Such context dependencies are not included in the simple recognition code models, but we can easily add them to the basic model. In Figure <figr fid="F4">4</figr> we show two different specificity tables for the interaction of positions +3 and 8. Figure <figr fid="F4">4A</figr> represents the general case, and Figure <figr fid="F4">4B</figr> is for the special case of R with G or A at positions -1 and 9. If we now predict the complete data using these models, combined with the general model for positions -1 and 9 in Figure <figr fid="F3">3B</figr>, we obtain the values shown in Figure <figr fid="F4">4C</figr>. The specificity information of these predicted data is 1.44 bits, showing that the model reproduces the complete data quite accurately. The correlation coefficient for those predicted binding energies with the measured energies is 0.96, a significant improvement over the model without the context dependent parameters (Model 2 of Figure <figr fid="F5">5</figr>). This improvement is at the cost of only three additional parameters due to the separation into two distinct classes depending on whether or not position -1 is an R that interacts with G or A. The completely additive model has 8 free parameters for the interaction of positions -1 and 9 (the 9 values in Figure <figr fid="F3">3B</figr> minus 1 for the total fixed sum) and 3 free parameters for the interaction of positions +3 and 8 (from the 4 values in Figure <figr fid="F3">3C</figr>). 
By separating the matrix of Figure <figr fid="F3">3C</figr> into two separate cases, shown in Figure <figr fid="F4">4A,B</figr>, we need 3 additional parameters in the model, for a total of 14. The model is used to predict data with 35 free values (the 36 elements of Figure <figr fid="F3">3A</figr> minus 1 for the fixed sum), so the additional parameters are only a small reduction in the degrees of freedom remaining to assess the fit of the model.</p>
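         <p>The single component prediction (Model 1) and the parameter bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions: the probabilities are recomputed from the rounded values of Table <tblr tid="T4">4</tblr>, binding energies are taken to be proportional to the negative logarithm of the probabilities, and the names <it>PRED </it>and <it>pearson </it>are hypothetical. The sketch is not expected to reproduce the reported correlation coefficient exactly.</p>
```python
import math
from collections import defaultdict

# Association constants K_a (10^6 M^-1) copied from Table 4.
KA = {
    "RE": {"CG": 406, "CA": 109, "CC": 33, "AG": 63, "AA": 26, "AC": 16},
    "QE": {"CG": 3.0, "CA": 4.5, "CC": 6.4, "AG": 3.9, "AA": 4.0, "AC": 5.7},
    "DE": {"CG": 2.5, "CA": 2.8, "CC": 4.7, "AG": 3.6, "AA": 3.7, "AC": 5.5},
    "RN": {"CG": 81, "CA": 30, "CC": 33, "AG": 105, "AA": 56, "AC": 77},
    "QN": {"CG": 7.4, "CA": 3.5, "CC": 3.9, "AG": 33, "AA": 18, "AC": 21},
    "DN": {"CG": 1.2, "CA": 1.4, "CC": 3.1, "AG": 17, "AA": 4.8, "AC": 14},
}

total = sum(k for sites in KA.values() for k in sites.values())

# Probabilities keyed by (residue at -1, residue at +3, base at 8, base at 9).
P = {
    (prot[0], prot[1], site[0], site[1]): k / total
    for prot, sites in KA.items()
    for site, k in sites.items()
}


def marginal(axes):
    """Probability table for the chosen axes, summed over all other axes."""
    out = defaultdict(float)
    for key, value in P.items():
        out[tuple(key[a] for a in axes)] += value
    return dict(out)


pair_m1_9 = marginal((0, 3))  # position -1 with base 9 (cf. Figure 3B)
pair_p3_8 = marginal((1, 2))  # position +3 with base 8 (cf. Figure 3C)

# Outer product of the two 2D views for the contacting pairs.
PRED = {
    key: pair_m1_9[(key[0], key[3])] * pair_p3_8[(key[1], key[2])]
    for key in P
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


keys = sorted(P)
# Assumption: energies are proportional to -ln(probability).
observed = [-math.log(P[k]) for k in keys]
predicted = [-math.log(PRED[k]) for k in keys]
r = pearson(observed, predicted)
print("correlation:", round(r, 2))

# Parameter bookkeeping quoted in the text.
params_additive = (9 - 1) + (4 - 1)     # 8 + 3 free parameters
params_two_class = params_additive + 3  # second +3/8 table adds 3, total 14
free_values = 36 - 1                    # 35 free values in the complete data
print(params_two_class, free_values)
```
         <p>Because each 2D view sums to one and every combination of residues and bases is present in the data, the outer product already sums to one and needs no further renormalization in this case.</p>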
         <fig id="F5">
            <title>
               <p>Figure 5</p>
            </title>
            <caption>
               <p>Scatter plot of the observed (Figure 3A) and predicted binding probabilities</p>
            </caption>
            <text>
               <p>Scatter plot of the observed (Figure 3A) and predicted binding probabilities. Model2 is the two component model, so those points show the fit between Figure 3A and Figure 4C. Model1 is for the single component model obtained from the outer product of Figure 3B and Figure 3C (table of predicted probabilities not shown).</p>
            </text>
            <graphic file="1471-2105-6-176-5"/>
         </fig>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>DNA binding specificities with the two component model</p>
            </caption>
            <text>
               <p>DNA binding specificities with the two component model. <b>A: </b>The 2D table of interactions for amino acid position 3 with base position 8 obtained from the data in Figure 3A for all cases except R with G or A (and normalized to a sum of 1000). <b>B: </b>The 2D table of interactions for amino acid position 3 with base position 8 for the cases with R and G or A (normalized to 1000). <b>C: </b>The predicted binding probabilities for the entire dataset using the two component model. The elements for the cases of R with G or A are obtained by the outer product of the matrix from <b>B </b>with the R/G,A elements of the matrix in Figure 3B. The rest of the elements are obtained from the outer product of <b>A </b>with the remaining elements of the matrix from Figure 3B.</p>
            </text>
            <graphic file="1471-2105-6-176-4"/>
         </fig>
         <p>The EGR family of proteins is an ideal case to study the effectiveness of a recognition code for protein-DNA interactions. The collection of crystal structures along with a large number of examples from selection experiments provides a wealth of information for determining the relationship between the protein sequence and the affinity for different DNA sequences. Simple qualitative models that predict the preferred interactions can be very effective and useful for designing new TFs <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B19">19</abbr></abbrgrp>. Quantitative models that predict relative binding affinities to multiple DNA sites are more challenging, but some success has been achieved by statistical approaches as well as by structure based approaches <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. Most current models of this type assume independence of the contributions to binding between the positions in the interactions. In this work we show that additive models can be a good approximation for any particular EGR protein and also for binding to any particular DNA site; additivity holds well for both the DNA and protein side of the interaction. But we also show that there is no universal set of parameters that works for all proteins or all DNA sites; rather, there is context dependence in the interactions. However, at least in the cases studied here, a simple addition to the independent model that divides sites into two classes provides a much better fit. This holds promise that, even though additivity does not hold precisely, it may still be possible to determine an additive recognition code by identifying a small set of classes that cover the entire set of interactions. How many classes will be needed is unknown at this time. 
The 36 combinations in our study required only two classes to give a very good fit but this is still far from a comprehensive analysis. The total number of adjacent amino acid pairs is 400 and the number of di-nucleotide combinations is 16, so there are 6400 possible combinations of the two. Quantitative analyses that cover all possible combinations of even a single zinc finger are impossible at this time. But more thorough sampling of the space of high affinity interactions, followed by quantitative binding assays, will provide much valuable information regarding the nature of recognition codes. While a completely additive model for the interaction of the protein and DNA is not correct, it may be that only relatively minor modifications are needed to make significantly better predictions.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>By determining the binding affinities of single and double mutants in both the DNA binding site and in the protein we were able to assess the degree of additivity in both halves of the interaction. Although only a limited number of combinations were tested, we find that for every DNA sequence and for every protein sequence an additive model is a good approximation to the real binding data. However, when all of the data are considered together there are clear context effects that are not well fit by a single additive model. A slightly more complex model does provide a good fit to the observed data, suggesting that quite simple models may still be employed to predict quantitative binding interactions of proteins with DNA. Further data are needed to determine how well these findings generalize to more variations and to other protein families.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Construction of wild-type zif268 DNA binding domain (DBD) and its variants</p>
            </st>
            <p>A plasmid containing the DNA binding domain of wild-type zif268 was obtained from Gendaq Limited <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. The portion of zif268 cDNA encoding the three zinc-finger DBD (cDNA nucleotides 996&#8211;1262, amino acids 331&#8211;420) was amplified by PCR and subcloned into expression vector pET-28a-c(+) (Novagen) to create a His-tagged fusion protein. The resulting construct, denoted pzif268, was verified by DNA sequencing. Five zif268 mutants with alterations in the base-contacting residues in finger one of the zif268 DBD were constructed with the QuikChange&#8482; XL Site-Directed Mutagenesis Kit (Stratagene) using pzif268 as a template: three single-substitution mutants (R18Q, R18D and E21N) and two double-substitution mutants (R18Q/E21N and R18D/E21N). The mutagenic primers used to create the five mutants are shown in Table <tblr tid="T1">1</tblr>. The resulting plasmids p18Q, p18D, p21N, p18Q21N and p18D21N were verified by DNA sequencing. Hereafter, the proteins are referred to by their amino acids at positions -1 and +3: RE (wild-type), QE, DE, RN, QN and DN.</p>
         </sec>
         <sec>
            <st>
               <p>Expression and purification of His-tagged-zif268 fusion protein and its variants</p>
            </st>
            <p><it>E. coli </it>BL21 cells bearing pzif268 or one of its derivatives were grown in 2xYT medium at 37&#176;C with constant shaking at 250 rpm. IPTG was added to a final concentration of 1 mM when OD<sub>600 </sub>reached 0.6&#8211;1.0. Cells were harvested 3 hrs after IPTG induction by centrifugation at 4000 rpm for 20 min. The pellets were then resuspended in 15 ml of lysis buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 10 mM DTT and 1 protease inhibitor cocktail tablet (Roche)) and lysed by sonication. The lysate was then centrifuged at 6000 rpm for 20 min to remove insoluble material. The His-tagged fusion protein was purified by Ni-resin chromatography using a procedure similar to that described previously <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. The eluate was collected in 2 ml fractions. Fractions were analyzed on a 12% SDS-PAGE gel, followed by silver staining. Finally, the fractions were pooled and dialysed against dialysis buffer (30 mM Tris-HCl pH 8.0, 50 mM NaCl, 3 mM DTT) at 4&#176;C, concentrated with a Centricon filter (Amicon) and kept at -80&#176;C until use. The protein concentration was determined with a BioRad assay kit.</p>
         </sec>
         <sec>
            <st>
               <p>Quantitative multiple fluorescence relative affinity (QuMFRA) assay to determine the relative binding constants</p>
            </st>
            <p>The relative binding constants of each protein to different binding sites were determined by the QuMFRA assay <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> with some modifications. Double-stranded oligonucleotide binding sites used in this study were generated by PCR. In each PCR reaction, a synthesized oligo containing either the wild-type binding site (zif1) of zif268 or one of its variants (Table <tblr tid="T1">1</tblr>) was used as the template, and the two primers were KS and SK (Table <tblr tid="T1">1</tblr>). The SK primer was labeled with one of the following four fluorophores: FAM, HEX, TAMRA, or ROX <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. After purification and precipitation with 1/10 vol of 3 M NaAc and an equal volume of isopropanol, the PCR products were dissolved in TS buffer (10 mM Tris-HCl pH 8.0, 50 mM NaCl). The concentration of DNA was determined using a method similar to that described previously <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
            <p>The competitive binding assay <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> was performed by mixing 4 different fluorophore-labeled DNA binding sites with a certain amount of His-tagged zinc finger protein in 1x reaction buffer (30 mM Tris-HCl pH 8.0, 50 mM NaCl, 0.1 mg/ml BSA, 3 mM DTT, 20 &#181;M ZnSO<sub>4</sub>, 5 &#181;g/ml poly(dI-dC)); the fluorophore-labeled zif1 served as an internal reference in each reaction. The reaction was equilibrated for 1 hr on ice before being electrophoresed on a 10% polyacrylamide gel. Each of the 4 fluorophore-labeled PCR products was also loaded individually onto the same gel. After electrophoresis, the gels were scanned by a Typhoon Variable Scanner (Molecular Dynamics, Sunnyvale, CA) to obtain the fluorescence intensities of the separated bands (bound and unbound) at 4 different emission wavelengths using the same machine settings as employed by Man and Stormo <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. For each separated band, the resultant fluorescence intensities at the four emission wavelengths make up the output vector <graphic file="1471-2105-6-176-i5.gif"/>. Using the fluorescence intensities of the 4 individual fluorophore-labeled DNAs at each emission wavelength we obtain the emission matrix <it>E </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. The input mixture of the 4 DNAs in each band, represented as the vector <graphic file="1471-2105-6-176-i6.gif"/>, was computed by a program developed for this study, which applies Gaussian elimination to the following relationship:</p>
            <p>
               <graphic file="1471-2105-6-176-i7.gif"/>
            </p>
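            <p>As an illustration of this unmixing step, the following minimal Python sketch assumes that the relationship above is linear, with the output vector equal to the emission matrix <it>E </it>multiplied by the input vector, and solves it by Gaussian elimination with partial pivoting. It is not the program used in this study; the function names solve_linear_system and unmix_band and the numerical values in EMISSION are hypothetical.</p>

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Build an augmented matrix so row operations act on both sides at once.
    aug = [list(map(float, matrix[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring up the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) == 0.0:
            raise ValueError("emission matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every row below the pivot row.
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution, from the last unknown to the first.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


# Hypothetical emission matrix: entry [channel][dye] is the signal that one
# unit of each dye (FAM, HEX, TAMRA, ROX) gives in each detection channel.
# Real values would come from the lanes loaded with a single labeled DNA.
EMISSION = [
    [1.00, 0.20, 0.02, 0.00],
    [0.15, 1.00, 0.25, 0.03],
    [0.01, 0.10, 1.00, 0.30],
    [0.00, 0.02, 0.12, 1.00],
]


def unmix_band(output_vector, emission=EMISSION):
    """Return the amount of each labeled DNA in one band (assumed linear model)."""
    return solve_linear_system(emission, output_vector)
```

            <p>In this sketch, unmix_band would be called once with the four-channel intensities of the bound band and once with those of the unbound band of each lane.</p>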
            <p>From the amount of each DNA in the bound and unbound bands of each lane, the relative binding affinity can be calculated by the following formula, where the wild-type binding site of zif268 (zif1) serves as the reference:</p>
            <p><it>K</it><sub><it>b test</it></sub>/<it>K</it><sub><it>b ref </it></sub>= (<it>[P&#183;D]</it><sub><it>test</it></sub><it>[D]</it><sub><it>ref</it></sub>)/(<it>[D]</it><sub><it>test</it></sub><it>[P&#183;D]</it><sub><it>ref</it></sub>)</p>
            <p><it>K</it><sub><it>b test</it></sub>/<it>K</it><sub><it>b ref </it></sub>= (<it>I</it><sub><it>P</it>-<it>Dtest</it></sub><it>I</it><sub><it>Dref</it></sub>)/(<it>I</it><sub><it>Dtest</it></sub><it>I</it><sub><it>P</it>-<it>Dref</it></sub>)</p>
            <p>where <it>I</it><sub><it>P</it>-<it>D </it></sub>and <it>I</it><sub><it>D </it></sub>are the intensities of the specified DNAs in the bound and unbound bands, respectively.</p>
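            <p>The ratio above can be evaluated directly from the band intensities. A minimal sketch follows; the function name relative_affinity and the intensities (in arbitrary units) are hypothetical and serve only to illustrate the arithmetic.</p>

```python
def relative_affinity(bound_test, unbound_test, bound_ref, unbound_ref):
    """Return K_b(test) / K_b(ref) from the band intensities of one lane.

    bound_* are intensities in the protein-DNA (shifted) band and
    unbound_* are intensities in the free-DNA band.
    """
    return (bound_test * unbound_ref) / (unbound_test * bound_ref)


# Made-up intensities for a test site and the zif1 reference in one lane.
ratio = relative_affinity(bound_test=120.0, unbound_test=480.0,
                          bound_ref=300.0, unbound_ref=200.0)
print(round(ratio, 4))
```

            <p>With these illustrative numbers the ratio is 1/6 (about 0.167), which would correspond to a test site bound roughly six-fold more weakly than zif1.</p>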
         </sec>
         <sec>
            <st>
               <p>Determination of the absolute binding constant of a zinc finger protein to a binding site by Scatchard analysis</p>
            </st>
            <p>Scatchard analysis <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> was applied to determine the absolute association constant, K<sub>a</sub>, of a zinc finger protein to a binding site. Specifically, a fixed amount of purified His-tagged zinc finger protein, [P]<sub>total</sub>, was mixed with increasing amounts of Cy5-labeled DNA generated by PCR and incubated in 1x reaction buffer for 1 hr on ice. The bound and unbound DNA were separated by electrophoresis on a 10% polyacrylamide gel, as above, and the gels were scanned by a Typhoon Variable Scanner using an excitation wavelength of 633 nm and an emission wavelength of 670 nm. From the following relationship</p>
            <p>
               <graphic file="1471-2105-6-176-i8.gif"/>
            </p>
            <p>it can be seen that the association constant for the particular combination of protein and DNA, K<sub><it>a</it></sub>(<it>P</it>,<it>D</it>), can be obtained from a plot of <graphic file="1471-2105-6-176-i9.gif"/> at multiple DNA concentrations. At least five independent determinations were made for each protein.</p>
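            <p>As a sketch of how such a plot yields the association constant, the following Python example assumes that the relationship above takes the standard Scatchard form, <it>[P&#183;D]</it>/<it>[D]</it> = K<sub>a</sub>([P]<sub>total</sub> - <it>[P&#183;D]</it>), so that a straight-line fit of bound/free against bound has slope -K<sub>a</sub>. The function name scatchard_fit and the use of ordinary least squares are assumptions made for illustration, not a description of the analysis performed here.</p>

```python
def scatchard_fit(bound, free):
    """Fit bound/free = Ka * (P_total - bound) by ordinary least squares.

    bound and free are concentrations of bound and unbound DNA measured at
    several total DNA concentrations. Returns (Ka, P_total), with Ka in
    reciprocal units of the input concentrations.
    """
    if len(bound) != len(free) or len(bound) in (0, 1):
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope                 # the slope of a Scatchard plot is -Ka
    p_total = intercept / ka    # the x-intercept gives the active protein
    return ka, p_total
```

            <p>Under this assumed form, the fitted x-intercept also provides an estimate of the concentration of active protein, which can be compared with the amount added.</p>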
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>JL performed all of the experiments, which GS helped to design. Both authors contributed to the analysis of the data and the writing of the paper.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Gendaq for giving us DNA phage coding for zif268. We thank Takis Benos for help with subcloning and David Granas for some statistical analyses of the SELEX and phage-display data. This work was supported by NIH grant GM28755.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 &#197;</p>
            </title>
            <aug>
               <au>
                  <snm>Pavletich</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1991</pubdate>
            <volume>252</volume>
            <fpage>809</fpage>
            <lpage>17</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2028256</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Zif268 protein-DNA complex refined at 1.6 &#197;: a model system for understanding zinc finger-DNA interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Elrod-Erickson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rould</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Nekludova</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>1996</pubdate>
            <volume>4</volume>
            <fpage>1171</fpage>
            <lpage>80</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(96)00125-6</pubid>
                  <pubid idtype="pmpid">8939742</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>High-resolution structures of variant Zif268-DNA complexes: implications for understanding zinc finger-DNA recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Elrod-Erickson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>TE</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>1998</pubdate>
            <volume>6</volume>
            <fpage>451</fpage>
            <lpage>64</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(98)00047-1</pubid>
                  <pubid idtype="pmpid">9562555</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Physical basis of a protein-DNA recognition code</p>
            </title>
            <aug>
               <au>
                  <snm>Choo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Klug</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>117</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(97)80015-2</pubid>
                  <pubid idtype="pmpid">9032060</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>DNA recognition by Cys2His2 zinc finger proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Nekludova</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Annu Rev Biophys Biomol Struct</source>
            <pubdate>2000</pubdate>
            <volume>29</volume>
            <fpage>183</fpage>
            <lpage>212</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.biophys.29.1.183</pubid>
                  <pubid idtype="pmpid" link="fulltext">10940247</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Toward rules relating zinc finger protein sequences and DNA binding site preferences</p>
            </title>
            <aug>
               <au>
                  <snm>Desjarlais</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1992</pubdate>
            <volume>89</volume>
            <fpage>7345</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">49706</pubid>
                  <pubid idtype="pmpid" link="fulltext">1502144</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Redesigning the DNA-binding specificity of a zinc finger protein: a data base-guided approach</p>
            </title>
            <aug>
               <au>
                  <snm>Desjarlais</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1992</pubdate>
            <volume>12</volume>
            <fpage>101</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340120202</pubid>
                  <pubid idtype="pmpid">1603798</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Use of a zinc-finger consensus sequence framework and specificity rules to design specific DNA binding proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Desjarlais</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1993</pubdate>
            <volume>90</volume>
            <fpage>2256</fpage>
            <lpage>60</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">46065</pubid>
                  <pubid idtype="pmpid" link="fulltext">8460130</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Selection of DNA binding sites for zinc fingers using rationally randomized DNA reveals coded interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Choo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Klug</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1994</pubdate>
            <volume>91</volume>
            <fpage>11168</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">45188</pubid>
                  <pubid idtype="pmpid" link="fulltext">7972028</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Toward a code for the interactions of zinc fingers with DNA: selection of randomized fingers displayed on phage</p>
            </title>
            <aug>
               <au>
                  <snm>Choo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Klug</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1994</pubdate>
            <volume>91</volume>
            <fpage>11163</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">45187</pubid>
                  <pubid idtype="pmpid" link="fulltext">7972027</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Length-encoded multiplex binding site determination: application to zinc finger proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Desjarlais</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Berg</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1994</pubdate>
            <volume>91</volume>
            <fpage>11099</fpage>
            <lpage>103</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">45174</pubid>
                  <pubid idtype="pmpid" link="fulltext">7972017</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Zinc finger phage: affinity selection of fingers with new DNA-binding specificities</p>
            </title>
            <aug>
               <au>
                  <snm>Rebar</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1994</pubdate>
            <volume>263</volume>
            <fpage>671</fpage>
            <lpage>3</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8303274</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Artificial zinc finger peptides: creation, DNA recognition, and gene regulation</p>
            </title>
            <aug>
               <au>
                  <snm>Nagaoka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sugiura</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Inorg Biochem</source>
            <pubdate>2000</pubdate>
            <volume>82</volume>
            <fpage>57</fpage>
            <lpage>63</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0162-0134(00)00154-9</pubid>
                  <pubid idtype="pmpid">11132639</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Design and selection of novel Cys2His2 zinc finger proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Peisach</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Grant</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>2001</pubdate>
            <volume>70</volume>
            <fpage>313</fpage>
            <lpage>40</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.biochem.70.1.313</pubid>
                  <pubid idtype="pmpid" link="fulltext">11395410</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Analysis of zinc fingers optimized via phage display: evaluating the utility of a recognition code</p>
            </title>
            <aug>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Greisman</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Ramm</snm>
                  <fnm>EI</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>285</volume>
            <fpage>1917</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.2421</pubid>
                  <pubid idtype="pmpid" link="fulltext">9925775</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Stereochemical basis of DNA recognition by Zn fingers</p>
            </title>
            <aug>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yagi</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>3397</fpage>
            <lpage>405</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">523735</pubid>
                  <pubid idtype="pmpid">8078776</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Geometric analysis and comparison of protein-DNA interfaces: why is there no simple code for recognition?</p>
            </title>
            <aug>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Nekludova</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>301</volume>
            <fpage>597</fpage>
            <lpage>624</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.3918</pubid>
                  <pubid idtype="pmpid" link="fulltext">10966773</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Is there a code for protein-DNA recognition? Probab(ilistical)ly...</p>
            </title>
            <aug>
               <au>
                  <snm>Benos</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Lapedes</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioessays</source>
            <pubdate>2002</pubdate>
            <volume>24</volume>
            <fpage>466</fpage>
            <lpage>75</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/bies.10073</pubid>
                  <pubid idtype="pmpid" link="fulltext">12001270</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Validated zinc finger protein designs for all 16 GNN DNA triplet targets</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Xia</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Zhong</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Case</snm>
                  <fnm>CC</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <fpage>3850</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M110669200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11726671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Probabilistic code for DNA recognition by proteins of the EGR family</p>
            </title>
            <aug>
               <au>
                  <snm>Benos</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Lapedes</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>323</volume>
            <fpage>701</fpage>
            <lpage>27</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0022-2836(02)00917-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">12419259</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Analyzing protein-DNA recognition mechanisms</p>
            </title>
            <aug>
               <au>
                  <snm>Paillard</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lavery</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Structure (Camb)</source>
            <pubdate>2004</pubdate>
            <volume>12</volume>
            <fpage>113</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.str.2003.11.022</pubid>
                  <pubid idtype="pmpid" link="fulltext">14725771</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>DNA recognition code of transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yagi</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1995</pubdate>
            <volume>8</volume>
            <fpage>319</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7567917</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>Mandel-Gutfreund</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Margalit</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <fpage>2306</fpage>
            <lpage>12</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147552</pubid>
                  <pubid idtype="pmpid" link="fulltext">9580679</pubid>
                  <pubid idtype="doi">10.1093/nar/26.10.2306</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Intermolecular and intramolecular readout mechanisms in protein-DNA recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Gromiha</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Siebers</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Selvaraj</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kono</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sarai</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>337</volume>
            <fpage>285</fpage>
            <lpage>94</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2004.01.033</pubid>
                  <pubid idtype="pmpid" link="fulltext">15003447</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Structure-based prediction of DNA target sites by regulatory proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Kono</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sarai</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1999</pubdate>
            <volume>35</volume>
            <fpage>114</fpage>
            <lpage>31</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1097-0134(19990401)35:1&lt;114::AID-PROT11>3.0.CO;2-T</pubid>
                  <pubid idtype="pmpid" link="fulltext">10090291</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Evaluation of free energy landscape for base-amino acid interactions using ab initio force field and extensive sampling</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshida</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nishimura</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Aida</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pichierri</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Gromiha</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Sarai</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biopolymers</source>
            <pubdate>2002</pubdate>
            <volume>61</volume>
            <fpage>84</fpage>
            <lpage>95</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11891631</pubid>
                  <pubid idtype="doi">10.1002/1097-0282(2001)61:1&lt;84::AID-BIP10045&gt;3.0.CO;2-X</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Non-independence of Mnt repressor-operator interaction determined by a new quantitative multiple fluorescence relative affinity (QuMFRA) assay</p>
            </title>
            <aug>
               <au>
                  <snm>Man</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>2471</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">55749</pubid>
                  <pubid idtype="pmpid" link="fulltext">11410653</pubid>
                  <pubid idtype="doi">10.1093/nar/29.12.2471</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>PL</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>1255</fpage>
            <lpage>61</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">101241</pubid>
                  <pubid idtype="pmpid" link="fulltext">11861919</pubid>
                  <pubid idtype="doi">10.1093/nar/30.5.1255</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Additivity in protein-DNA interactions: how good an approximation is it?</p>
            </title>
            <aug>
               <au>
                  <snm>Benos</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>4442</fpage>
            <lpage>51</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">137142</pubid>
                  <pubid idtype="pmpid" link="fulltext">12384591</pubid>
                  <pubid idtype="doi">10.1093/nar/gkf578</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Quantitative modeling of DNA-protein interactions: effects of amino acid substitutions on binding specificity of the Mnt repressor</p>
            </title>
            <aug>
               <au>
                  <snm>Man</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>4026</fpage>
            <lpage>32</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">506813</pubid>
                  <pubid idtype="pmpid" link="fulltext">15289576</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh729</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Bulyk</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Whitmore</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Biometrics</source>
            <pubdate>2002</pubdate>
            <volume>58</volume>
            <fpage>981</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.0006-341X.2002.00981.x</pubid>
                  <pubid idtype="pmpid">12495153</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Quantitative analysis of the relationship between nucleotide sequence and functional activity</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Gold</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1986</pubdate>
            <volume>14</volume>
            <fpage>6661</fpage>
            <lpage>79</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">311672</pubid>
                  <pubid idtype="pmpid">3092188</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Binding studies with mutants of Zif268. Contribution of individual side chains to binding affinity and specificity in the Zif268 zinc finger-DNA complex</p>
            </title>
            <aug>
               <au>
                  <snm>Elrod-Erickson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1999</pubdate>
            <volume>274</volume>
            <fpage>19281</fpage>
            <lpage>5</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.274.27.19281</pubid>
                  <pubid idtype="pmpid" link="fulltext">10383437</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Rearrangement of side-chains in a Zif268 mutant highlights the complexities of zinc finger-DNA recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Miller</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2001</pubdate>
            <volume>313</volume>
            <fpage>309</fpage>
            <lpage>15</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2001.4975</pubid>
                  <pubid idtype="pmpid" link="fulltext">11800559</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Beyond the "recognition code": structures of two Cys2His2 zinc finger/TATA box complexes</p>
            </title>
            <aug>
               <au>
                  <snm>Wolfe</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Grant</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Elrod-Erickson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Structure (Camb)</source>
            <pubdate>2001</pubdate>
            <volume>9</volume>
            <fpage>717</fpage>
            <lpage>23</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(01)00632-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">11587646</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Dramatic changes in DNA-binding specificity caused by single residue substitutions in an Arc/Mnt hybrid repressor</p>
            </title>
            <aug>
               <au>
                  <snm>Raumann</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Knight</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Sauer</snm>
                  <fnm>RT</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1995</pubdate>
            <volume>2</volume>
            <fpage>1115</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nsb1295-1115</pubid>
                  <pubid idtype="pmpid">8846224</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Specificity of Mnt 'master residue' obtained from in vivo and in vitro selections</p>
            </title>
            <aug>
               <au>
                  <snm>Silbaq</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Ruttenberg</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>5539</fpage>
            <lpage>48</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">140065</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490722</pubid>
                  <pubid idtype="doi">10.1093/nar/gkf684</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Rapid, high-throughput engineering of sequence-specific zinc finger DNA-binding proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Isalan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Choo</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>2001</pubdate>
            <volume>340</volume>
            <fpage>593</fpage>
            <lpage>609</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11494872</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The ClpX protein of <it>Bacillus subtilis</it> indirectly influences RNA polymerase holoenzyme composition and directly stimulates sigma-dependent transcription</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zuber</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2000</pubdate>
            <volume>37</volume>
            <fpage>885</fpage>
            <lpage>97</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.2000.02053.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10972809</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Measurement of nucleic acid concentrations using the DyNA Quant and the GeneQuant</p>
            </title>
            <aug>
               <au>
                  <snm>Teare</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Islam</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Flanagan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gallagher</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Grabau</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Biotechniques</source>
            <pubdate>1997</pubdate>
            <volume>22</volume>
            <fpage>1170</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9187773</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Comparison of the DNA binding characteristics of the related zinc finger proteins WT1 and EGR1</p>
            </title>
            <aug>
               <au>
                  <snm>Hamilton</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Borel</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Romaniuk</snm>
                  <fnm>PJ</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1998</pubdate>
            <volume>37</volume>
            <fpage>2051</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi9717993</pubid>
                  <pubid idtype="pmpid" link="fulltext">9485332</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Specificity, free energy and information content in protein-DNA interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Fields</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1998</pubdate>
            <volume>23</volume>
            <fpage>109</fpage>
            <lpage>13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0968-0004(98)01187-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">9581503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Sequence logos: a new way to display consensus sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Schneider</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1990</pubdate>
            <volume>18</volume>
            <fpage>6097</fpage>
            <lpage>100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">332411</pubid>
                  <pubid idtype="pmpid">2172928</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Displaying the information contents of structural RNA alignments: the structure logos</p>
            </title>
            <aug>
               <au>
                  <snm>Gorodkin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Heyer</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <fpage>583</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9475985</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
