<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-8-407</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Bioinformatic identification of novel putative photoreceptor specific <it>cis</it>-elements</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Danko</snm>
               <mi>G</mi>
               <fnm>Charles</fnm>
               <insr iid="I1"/>
               <email>dankoc@gmail.com</email>
            </au>
            <au id="A2">
               <snm>McIlvain</snm>
               <mi>A</mi>
               <fnm>Vera</fnm>
               <insr iid="I2"/>
               <email>mcilvaiv@upstate.edu</email>
            </au>
            <au id="A3">
               <snm>Qin</snm>
               <fnm>Maochun</fnm>
               <insr iid="I1"/>
               <email>qinm@upstate.edu</email>
            </au>
            <au id="A4">
               <snm>Knox</snm>
               <mi>E</mi>
               <fnm>Barry</fnm>
               <insr iid="I2"/>
               <email>knoxb@upstate.edu</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Pertsov</snm>
               <mi>M</mi>
               <fnm>Arkady</fnm>
               <insr iid="I1"/>
               <email>pertsova@upstate.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Pharmacology, SUNY Upstate Medical University, Syracuse, NY, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Biochemistry &amp; Molecular Biology and Ophthalmology, SUNY Upstate Medical University, Syracuse, NY, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>407</fpage>
         <url>http://www.biomedcentral.com/1471-2105/8/407</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17953763</pubid>
               <pubid idtype="doi">10.1186/1471-2105-8-407</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>08</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>22</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>22</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Danko et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Cell specific gene expression is largely regulated by different combinations of transcription factors that bind <it>cis</it>-elements in the upstream promoter sequence. However, experimental detection of <it>cis</it>-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify <it>cis</it>-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding sites involved in regulating the differences between murine rod and cone photoreceptor populations.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>To identify highly conserved motifs enriched in promoters that drive expression in either rod or cone photoreceptors, we assembled a set of murine rod-specific, cone-specific, and non-photoreceptor background promoter sequences. These sets were used as input to a newly devised motif discovery algorithm called Iterative Alignment/Modular Motif Selection (IAMMS). Using IAMMS, we predicted 34 motifs that may contribute to rod-specific (19 motifs) or cone-specific (15 motifs) expression patterns. Of these, 16 rod- and 12 cone-specific motifs were found in clusters near the transcription start site. New findings include the observation that cone promoters tend to contain TATA boxes, while rod promoters tend to be TATA-less (with the exception of <it>Rho </it>and <it>Cnga1</it>). Additionally, we identify putative sites for IL-6 effectors (in rods) and RXR family members (in cones) that can explain experimental data showing changes to cell fate upon activation of these signaling pathways during rod/cone development. Two of the predicted motifs (NRE and ROP2) have been confirmed experimentally to be involved in cell-specific expression patterns. We provide a full database of predictions as additional data that may contain further valuable information. IAMMS predictions are compared with existing motif discovery algorithms, DME and BioProspector. We find that over 60% of IAMMS predictions are confirmed by at least one other motif discovery algorithm.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We predict novel, putative <it>cis-</it>elements enriched in the promoter of rod-specific or cone-specific genes. These are candidate binding sites for transcription factors involved in maintaining functional differences between rod and cone photoreceptor populations.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Experimental identification of DNA sequence motifs that bind specific transcription factors (<it>cis</it>-elements) and regulate gene expression is expensive, time-consuming, and difficult. This makes bioinformatic methods for identifying <it>cis</it>-elements an important tool for prioritizing future experimental studies of transcriptional regulation. Rod and cone photoreceptors each specialize in a unique function by the expression of distinct genes that perform analogous roles in each cell's light transduction pathway. Bioinformatic motif identification techniques have been used to successfully identify potential targets of 3 photoreceptor-specific transcription factors (NRL, CRX, NR2E3) using their known binding specificity <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Experimental evidence suggests that at least 9 additional transcription factors are involved in regulation of either rod- or cone-specific expression <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. However, binding motifs for many of these transcription factors are presently unknown. In this study, we use <it>de novo </it>motif discovery methods to identify motifs that may be important for gene expression differences between rod and cone photoreceptors.</p>
         <p>The most commonly used <it>de novo </it>method is phylogenetic footprinting, based on the assumption that functional sequence changes more slowly through evolution compared to the surrounding sequence. The advantage of phylogenetic footprinting is its specificity: significant conservation across many species strongly suggests that a sequence is functional. However, phylogenetic footprinting suffers from a high incidence of false negative errors <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. Alternative approaches seek to identify motifs that are over-represented compared to a set of unrelated background sequences <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. To increase the accuracy of predictions, recent over-representation motif discovery implementations incorporate additional biological information <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, such as the position of motifs relative to the transcription start site (for reviews see: <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>). Here, we use a combination of over-representation, position-based filtering, and phylogenetic analysis to select and analyze motifs that may be involved in rod and cone-specific expression patterns.</p>
         <p>Our motif discovery implementation, called <ul>i</ul>terative <ul>a</ul>lignment/<ul>m</ul>odular <ul>m</ul>otif <ul>s</ul>election (IAMMS), selects motifs based on three biological assumptions. First, we assume that promoters of functionally linked genes will share similar regulatory motifs. The second assumption is that functional motifs are concentrated near the transcription start site <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Third, we assume that occurrences of a given motif cluster near a characteristic distance from the transcription start site <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. To implement the last two assumptions, we applied a hierarchical clustering algorithm because the algorithm chooses the mode and variance of a distribution based on the underlying data. This approach advances position-based filtering over previous implementations that model motif position dependence by a static distribution given by the empirical frequency of all motifs relative to the transcription start site in bacteria <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. We implement this approach on a set of murine rod-specific, cone-specific, and background promoter sequences derived from biochemical <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp> and microarray <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B22">22</abbr></abbrgrp> studies.</p>
         <p>IAMMS identified 34 motifs enriched in the promoter of either rod or cone photoreceptors, most of which are not similar to any previously known motifs. To increase our confidence in these predictions, results obtained using IAMMS were compared to those of existing motif discovery algorithms, DME and BioProspector. We chose BioProspector because it improves on the well-studied Gibbs sampling algorithm by representing background sequences as a third-order Markov model <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B13">13</abbr></abbrgrp>. DME was chosen because it is based on the well-regarded maximum likelihood algorithm <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. This comparison revealed that over 60% of our predictions were also confirmed by at least one additional algorithm. We provide extensive discussion of these predictions in the context of the biochemical literature.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Application of IAMMS to Rod and Cone-specific Promoters</p>
            </st>
            <p>Input to IAMMS consisted of the upstream region of 11 rod-specific, 12 cone-specific, and 84 non-photoreceptor genes (see table <tblr tid="T1">1</tblr> for a list of rod/cone-specific genes, and additional file <supplr sid="S1">1</supplr> online for background genes). The flowchart of the IAMMS algorithm is shown in figure <figr fid="F1">1</figr> (see methods for details). The first step involved an iterative alignment procedure conducted on all rod, cone, and non-photoreceptor promoters. This step resulted in a dataset of 71,195 conserved motifs between 8 and 150 bp in length. Each entry of the dataset contains nucleotide sequences, the location of motif occurrences with respect to the transcription start site, strand, and promoter from which each occurrence originated. To illustrate the composition of the dataset, we plotted motif length against the number of occurrences of each motif in photoreceptor promoters (figure <figr fid="F2">2</figr>; background occurrences are not shown). The color map represents the number of motifs with each length/frequency combination. As may be expected, motif size has an inverse relationship with the number of occurrences.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Explanation of Supplementary Data</b>. Detailed information on reading HTML formatted supplementary data.</p>
               </text>
               <file name="1471-2105-8-407-S1.ZIP">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Rod-specific and cone-specific genes</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Rod:</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Rho<sup>22</sup>; Sag<sup>17</sup>; Pde6a<sup>22</sup>; Pde6g<sup>22</sup>; Pde6d<sup>20</sup>; Pde6b<sup>22</sup>; Nrl<sup>18</sup>; Nr2e3<sup>19</sup>; Gnat1<sup>22</sup>; Cnga1<sup>22</sup>; Gnb1<sup>15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Cone:</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Opn1mw<sup>15</sup>; Opn1sw<sup>15</sup>; Pde6c<sup>15</sup>; Pde6h<sup>15</sup>; Arr3<sup>15</sup>; Cngb3<sup>15</sup>; Cnga3<sup>21</sup>; Smug1<sup>15</sup>; Gnat2<sup>15</sup>; Gnb3<sup>15</sup>; Elovl2<sup>15</sup>; Gngt2<sup>15</sup></p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Rod-specific and cone-specific genes whose promoters were used in this study are listed by MGI symbol. References to the articles establishing cell-specificity are given as superscripts after each gene.</p>
               </tblfn>
            </tbl>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>A block diagram of the iterative alignment/modular motif selection (IAMMS) algorithm used to identify putative functional sites in photoreceptor promoter regions</p>
               </caption>
               <text>
                  <p>A block diagram of the iterative alignment/modular motif selection (IAMMS) algorithm used to identify putative functional sites in photoreceptor promoter regions. Boxes represent the input/output of each successive step. Arrows show flow. Circles show the application of a given filter.</p>
               </text>
               <graphic file="1471-2105-8-407-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>3D histogram representing features of potential motifs after the iterative alignment</p>
               </caption>
               <text>
                  <p>3D histogram representing features of potential motifs after the iterative alignment. The vertical and horizontal axes plot the number of non-overlapping occurrences of a motif, and the motif length in nucleotides (nt), respectively. Color shows the number (on a log-10 scale) of motifs with the given parameters. The box shows the approximate area that is likely to contain functional motifs. The circle shows the region containing the motif sample in Fig. 3A. Longer motifs (> 20 bp) are long simple or interspersed repeats.</p>
               </text>
               <graphic file="1471-2105-8-407-2"/>
            </fig>
            <p>Analysis showed that the majority of motifs identified after the first step were repeat sequences. The motifs occurring most frequently (> 25 occurrences) were primarily simple repeats. All longer motifs (> 19 bp) were highly similar to microsatellites and interspersed repeats, as revealed by comparison to a database of known repeats (RepBase). Repeat sequences were filtered out at step 2.</p>
            <p>After repeat filtering, the remaining motifs, those inside and immediately above the marked box in figure <figr fid="F2">2</figr>, were evaluated for potential enrichment in rod or cone photoreceptors (step 3). Since we are interested in motifs that occur in the promoters of only one photoreceptor cell type, motifs that have occurrences in both rod and cone promoters were classified as ambiguous and were excluded from consideration during this step. To evaluate enrichment of a motif compared to background, we assume a binomial distribution of <it>k</it><sub><it>r </it></sub>rod specific (or <it>k</it><sub><it>c </it></sub>cone-specific) promoters drawn from the total number of promoters that contain occurrences. A Bonferroni correction for multiple hypothesis testing (E-value) is applied to the resulting p-value, as described in the <it>Statistical annotation </it>section of methods. The top scoring motifs identified during this step were subjected to phylogenetic analysis (step 6) and compared to known motifs using the <ul>T</ul>ranscriptional <ul>E</ul>lement <ul>S</ul>earch <ul>S</ul>ystem (TESS; step 7).</p>
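            <p>The enrichment test described above can be sketched as follows. This is a minimal illustration rather than the IAMMS implementation: it assumes that the null probability equals the fraction of cell-specific promoters among the promoters eligible to contain the motif, that the p-value is the upper binomial tail, and that the E-value is the p-value multiplied by the number of motifs tested. The function names, the counts, and the value of n_tests are hypothetical.</p>

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of observing k or more successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enrichment_e_value(k_specific, n_containing, p_background, n_tests):
    """Bonferroni-style E-value: tail p-value times the number of motifs tested."""
    return binomial_tail(k_specific, n_containing, p_background) * n_tests


# Illustrative motif: present in 7 promoters, 5 of which are cone-specific.
# Assumed null probability: 12 cone promoters out of 12 + 84 eligible promoters
# (rod promoters are left out because motifs found in both cell types are set aside).
p_cone = 12 / (12 + 84)
p_value = binomial_tail(5, 7, p_cone)
e_value = enrichment_e_value(5, 7, p_cone, n_tests=1000)  # n_tests is a made-up count
print(f"p = {p_value:.3g}, E = {e_value:.3g}")
```

            <p>Under these assumptions, a motif is retained when its E-value falls below the chosen cutoff; the same function applies to rod enrichment with the rod promoter count in place of the cone count.</p>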
            <p>Figure <figr fid="F3">3</figr> shows representative examples of top-scoring cone- and rod-enriched motifs identified during step 3, after being subjected to phylogenetic analysis, and compared to TESS. The cone-enriched motif is 13 bp in length, contains 5 occurrences in cone-specific promoters and none in rods (non-photoreceptor occurrences are not shown). The <ul>c</ul>ross <ul>s</ul>pecies <ul>c</ul>onservation <ul>s</ul>core (CSCS) for each occurrence is shown in the last column. Four occurrences have a negative CSCS. A negative CSCS means that the predicted occurrence is more conserved than surrounding sequences of the same length (see Methods for details). Comparison with known photoreceptor-specific motifs indicated that this sequence is similar to the preferred binding site for the Retinoid X Receptor (RXR). Involvement of RXR in cone-specific expression is well established <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, but binding sites for this transcription factor in cone photoreceptor promoters have not yet been identified, making this prediction valuable for planning experimental studies.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Example of cone (A) and rod (B) enriched DNA motifs after statistical annotation</p>
               </caption>
               <text>
                  <p>Example of cone (A) and rod (B) enriched DNA motifs after statistical annotation. Columns from left to right give gene MGI Symbol, cell-specific expression patterns (C, cone; R, rod; background matches are removed for this figure), start position of motif occurrence relative to the transcription start site, strand relative to the transcription start site (+1), consensus sequence (shown on the top), and cross-species conservation score (see methods). Occurrences are sorted by distance from transcription start site. The cone motif (A) is similar to a known binding site (RXR). The rod motif (B) is similar to the c-Myb binding site. For both motifs, non-photoreceptor occurrences (n = 2, 9 for A and B, respectively) have been removed for simplicity.</p>
               </text>
               <graphic file="1471-2105-8-407-3"/>
            </fig>
            <p>The rod-enriched motif (figure <figr fid="F3">3B</figr>) is 12 bp in length and contains 6 occurrences in rod promoters. Cross-species conservation shows that <it>Pde6a</it>, <it>Gnb1</it>, and <it>Nr2e3 </it>occurrences are phylogenetically conserved (a cross-species alignment is not available for the region containing the <it>Pde6g </it>occurrence, and thus no score is reported). According to TESS, this motif is similar to a c-Myb binding site. The prediction that c-Myb may have a function unique to one type of photoreceptor is consistent with publicly available microarray data (see Methods). We found that c-Myb is between 2.6- and 7.6-fold enriched in cones compared to rod photoreceptors.</p>
            <p>After step 3, IAMMS identified a total of 6 motifs (3 rod- and 3 cone-enriched) with E &lt; 2.5. Since no position filtering was applied to identify these motifs, we refer to them as position independent. All position independent rod- and cone-enriched motifs, sorted based on E-value, are shown at the top of figure <figr fid="F4">4</figr>. The highest scoring rod prediction at the top of figure <figr fid="F4">4</figr> contains two 5 bp invariant core regions separated by two ambiguous positions (CCTTTNNGCCCT; rod-enriched position independent, row 1). The position variance of this prediction is remarkably small (&#177; 45) considering that no position-based selection was applied to identify this sequence. The top scoring cone motif contains a core region 5 bp in width (aGGGTTca). It occurs in 8/12 cone promoter sequences with no discernible bias in position. Detailed information on the position and phylogenetic conservation of each occurrence is available as additional data (files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr>) online.</p>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Extended table of information on predictions</b>. Contains cross-promoter alignments and phylogenetic alignments for each prediction, as well as the entire list of ENSEMBL IDs for genes used in the study. Please refer to "Data Supplement Instructions.doc" for detailed information.</p>
               </text>
               <file name="1471-2105-8-407-S2.DOC">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Highest scoring rod (left) and cone (right) enriched motifs returned after statistical annotation in IAMMS step 3 (position independent) and IAMMS step 5 (position dependent)</p>
               </caption>
               <text>
                  <p>Highest scoring rod (left) and cone (right) enriched motifs returned after statistical annotation in IAMMS step 3 (position independent) and IAMMS step 5 (position dependent). From the left, columns give the motif logo, the fraction of rod/cone specific occurrences, cell-specificity E-value, mean location relative to the transcription start site (bp), mean phylogenetic conservation score, and similarity to known motifs. The table is sorted based on the fraction of cell-specific sequence (E-value). Note that predictions with similar core sequences are represented by the prediction with the highest E-value in figure 4. All predictions are presented individually in figure 7.</p>
               </text>
               <graphic file="1471-2105-8-407-4"/>
            </fig>
            <p>Those motifs classified as ambiguous during step 3 were subjected to position-based clustering (step 4). As described previously, we acted under the hypothesis that occurrences of a motif near the transcription start site, and those occurring in clusters around a preferred position, are more likely to be functional. One example of clusters selected by the hierarchical clustering algorithm is shown in figure <figr fid="F5">5A</figr>. This particular motif contains 55 occurrences, plotted as triangles based on their 1-dimensional position relative to the transcription start site. These occurrences are broken into clusters by the algorithm, denoted by blue ovals. A cone-enriched cluster just upstream of the transcription start site is shown in pink. This cluster contains 5/12 occurrences from cone-specific promoters, and only 4/84 occurrences in non-photoreceptor promoter regions.</p>
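            <p>The position clustering step can be illustrated with a small sketch. The linkage criterion and stopping rule used by IAMMS are not given in this section, so the example below assumes single-linkage agglomerative clustering cut at a fixed gap; the functions cluster_positions and proximal_clusters, the max_gap value, and the example positions are all hypothetical. The 400 bp window follows the description in the legend of figure 5.</p>

```python
def cluster_positions(positions, max_gap):
    """Single-linkage clustering of 1-D positions.

    Cutting a single-linkage dendrogram at height max_gap is equivalent to
    sorting the positions and starting a new cluster wherever two
    neighbouring occurrences are more than max_gap apart.
    """
    clusters = []
    for pos in sorted(positions):
        if not clusters or pos - clusters[-1][-1] > max_gap:
            clusters.append([pos])
        else:
            clusters[-1].append(pos)
    return clusters


def proximal_clusters(clusters, window):
    """Keep clusters with an occurrence no further than `window` bp upstream of the TSS."""
    return [c for c in clusters if max(c) >= -window]


# Made-up occurrence positions (bp relative to the transcription start site).
occurrences = [-1900, -1850, -1200, -480, -450, -300, -95, -90, -60, -40]
clusters = cluster_positions(occurrences, max_gap=100)
for cluster in proximal_clusters(clusters, window=400):
    print(cluster)
```

            <p>Each retained cluster would then be scored for rod or cone enrichment using the counts of cell-specific and background promoters that contribute occurrences to it.</p>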
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>(A) Occurrences of a sample ambiguous motif (triangles) analyzed using position cluster discovery</p>
               </caption>
               <text>
                  <p>(A) Occurrences of a sample ambiguous motif (triangles) analyzed using position cluster discovery. The horizontal axis represents position relative to the putative transcription start site. The vertical position of occurrences was offset to ease viewing. Position clusters (ovals) were identified using agglomerative hierarchical clustering for all occurrences of each motif in the 2 kbp upstream region identified. Clusters with occurrences in the first 400 bp relative to the transcription start site were evaluated for cell-specificity. In this case, the cluster of occurrences nearest the transcription start site is cone-enriched. A second cluster between -250 and -500 is entirely ambiguous. The numbers k<sub>r</sub>, k<sub>c</sub>, and k<sub>n </sub>reflect the number of rod, cone, and background promoters that contain the motif. (B-C) Identification of cell-specific motifs among position-enriched clusters by statistical annotation. The vertical and horizontal axes plot the fraction of rod (B) or cone (C) promoters against the total number of promoters that contain at least one occurrence of a putative motif. Colors are assigned by the number of motifs with a given fraction (log-10 scale). The shaded region represents groups chosen using a p &lt; 0.005 cutoff threshold.</p>
               </text>
               <graphic file="1471-2105-8-407-5"/>
            </fig>
            <p>After motifs were broken into position-dependent clusters, we used the same statistical procedure described above to select those clusters enriched in rod or cone promoters (IAMMS, step 5). Figure <figr fid="F5">5B&#8211;C</figr> plots the ratio between cell-specific and total occurrences (vertical axis) against the total number of promoters with at least one occurrence (horizontal axis). Points are colored based on the number of motifs with a given combination, in a similar manner to figure <figr fid="F2">2</figr>. The cone-enriched cluster cAGAAG shown in figure <figr fid="F5">5A</figr> is one of the motifs represented by the point marked in figure <figr fid="F5">5C</figr>. This point lies just inside the gray region representing a statistical threshold of p = 0.005 that was used to classify motifs as enriched in rod (or cone) specific promoter sequences. A motif corresponding to the known cone-specific <it>cis</it>-element ROP2 is also represented by a point in the gray region of figure <figr fid="F5">5C</figr>. Figure <figr fid="F5">5B</figr> shows the same representation as figure <figr fid="F5">5C</figr> for rod-specific motifs. A previously characterized rod-specific motif, NRE, is represented by a point that lies just inside the gray region (marked in figure <figr fid="F5">5A</figr>), indicating the biological relevance of motifs represented in this region.</p>
            <p>A detailed view of the NRE-like motif identified after step 5 is shown in the left panel of figure <figr fid="F6">6</figr>. The predicted motif contains a core region (aTGCTGa). The occurrence in the <it>Rho </it>promoter at -88 bp (occurrences are enumerated below the logo in figure <figr fid="F6">6</figr>) has already been validated experimentally <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. Two sample cross-species phylogenetic alignments are shown below the functional alignment in figure <figr fid="F6">6</figr> (<it>Pde6b </it>and <it>Rho</it>). In this case, these occurrences are very highly conserved relative to the surrounding sequence.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Predicted rod (left) and cone (right) enriched motifs</p>
               </caption>
               <text>
                  <p>Predicted rod (left) and cone (right) enriched motifs. Notations are the same as figure 3. Cross-species alignments for <it>Pde6b</it>, <it>Rho </it>(left), <it>Cnga3</it>, and <it>Opn1mw </it>(right) occurrences are shown on the bottom. All occurrences are highly conserved across species (CSCS -1.66, -1.87, -0.93, and -1.79). The rod-specific prediction is similar to the known rod-motif NRE. The cone-specific motif contains a previously known binding site (ROP2) for which it predicts additional occurrences. Non-photoreceptor occurrences have been removed for simplicity (See additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>).</p>
               </text>
               <graphic file="1471-2105-8-407-6"/>
            </fig>
            <p>Another known transcription factor binding site detected in this study corresponds to the recently discovered cone-specific sequence ROP2, shown in the right panel of figure <figr fid="F6">6</figr>. This prediction contains an occurrence in the <it>Opn1mw </it>promoter that was recently discovered to be required for cone-specific expression <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Previously unknown occurrences of ROP2 were predicted in the promoters of <it>Opn1sw</it>, <it>Smug1</it>, and <it>Cnga3</it>. The newly-discovered occurrence in the <it>Opn1sw </it>promoter shows remarkable position-conservation relative to the transcription start site when compared with the known <it>Opn1mw </it>occurrence: -94 and -97 bp, respectively, a difference of only 3 bp. Selected phylogenetic alignments (figure <figr fid="F6">6</figr>, right panel, bottom) show that the occurrences in the <it>Cnga3 </it>and <it>Opn1mw </it>promoters are highly conserved through evolution. In addition to increasing confidence in predictions, the ROP2 detection also provides exciting new targets for a <it>cis</it>-element that is pertinent for cone-specificity.</p>
            <p>The 12 highest scoring (E-value) rod- and cone-enriched position dependent predictions are shown on the bottom of figure <figr fid="F4">4</figr>. The example given in figure <figr fid="F5">5A</figr> (cAGAAG) can be found among cone-enriched motifs in row 7. Among the high scoring motifs, 6 rod and 3 cone predictions are similar to known motifs whose specific binding positions (with the exception of NRE) are not known, including four putative initiator (INR-like) elements, NRE, an IL-6 effector, an RXR binding site, ROP2, a putative TATA-like motif, and an Engrailed homeodomain binding site. Phylogenetic conservation is relatively high for several of the elements, including two conservation scores less than -1 for cone-enriched predictions (TATA-like: -1.08 and En2: -1.35). As we show in the next section, many of these motifs are corroborated by motifs predicted by DME and/or BioProspector.</p>
         </sec>
         <sec>
            <st>
               <p>Comparison with DME and BioProspector</p>
            </st>
            <p>To increase confidence in our predictions, we compared motifs discovered using IAMMS to those discovered using existing <it>de novo </it>motif discovery algorithms, DME and BioProspector. For both of these algorithms, a smaller section of the upstream region was employed (500 bp of upstream sequence and 100 bp of UTR) for a more similar comparison to IAMMS position clustering implementation. In order to return useful results, promoter regions needed to be repeat masked prior to analysis. Since the rod- and cone-specific sets are too small to be compared directly against each other, cone promoters were compared against the combined set of background and rod promoters to evaluate cone-enrichment. The same approach was used to identify rod-enriched predictions.</p>
            <p>The top 10 motifs for each motif length between 6 and 10 bp (DME) or 6 and 12 bp (BioProspector) were compared with the top IAMMS predictions. This comparison is shown in Figure <figr fid="F7">7</figr>. Predictions made by IAMMS and confirmed by DME or BioProspector are highlighted in yellow (DME), blue (BioProspector), or orange (both DME and BioProspector). Interestingly, rod predictions agreed with DME or BioProspector in a much higher proportion of cases (nearly 80%) than cone predictions (just under 50%). This difference results from a much lower rate of agreement between IAMMS and BioProspector in cone sequences. By contrast, the rate of agreement between IAMMS and DME is similar in rods and cones (57% in rods, 47% in cones). We conclude that, although they use different underlying algorithms, DME produces results more similar to those of IAMMS than BioProspector does.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Comparison of rod (top) and cone (bottom) specific predictions made by IAMMS to those made by either DME (yellow), BioProspector (blue), or both DME and BioProspector (orange)</p>
               </caption>
               <text>
                  <p>Comparison of rod (top) and cone (bottom) specific predictions made by IAMMS to those made by either DME (yellow), BioProspector (blue), or both DME and BioProspector (orange). For each prediction, the consensus sequence is given using IUPAC ambiguity codes (given in the legend). The following columns represent the ratio of rod to cone occurrences, the E-value of cell-specificity, the mean occurrence position in the promoter relative to the transcription start site, the mean cross-species conservation score, and similarity to any well-known transcription factor binding sites.</p>
               </text>
               <graphic file="1471-2105-8-407-7"/>
            </fig>
            <p>Overall, of 40 rod- and cone-specific predictions, 25 (over 60%) are confirmed by either DME or BioProspector and 11 (nearly 30%) were confirmed by both. Major predictions, including the ROP2 binding site, Initiator, TATA-like, and IL-6 (discussed in detail below) were corroborated by at least one motif discovery algorithm. The initiator-like and TATA-like predictions were identified by all 3 algorithms, increasing our confidence in these predictions.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>In this article, we use a combination of motif discovery algorithms to identify putative <it>cis</it>-elements that may be responsible for differences in gene expression between rod and cone photoreceptors. We identified 34 conserved motifs highly enriched in either rod or cone photoreceptor genes. Our predictions can be divided into three distinct groups:</p>
         <p>1. Completely new motifs that bear no resemblance to known transcription factor binding sites. This first group contains 20 motifs, most of which are confirmed by at least two discovery algorithms, or have a high degree of phylogenetic conservation.</p>
         <p>2. Motifs similar to <it>cis</it>-elements with known photoreceptor function. This second group contains 5 motifs, including motifs that have been characterized by previous experimental studies (NRE, ROP2) as well as motifs whose putative binding sites are unknown (RXR, En2, and IL-6 effectors). It is notable that all these motifs were derived without using any specific <it>a priori </it>knowledge.</p>
         <p>3. Motifs similar to known <it>cis</it>-elements whose involvement in photoreceptor function has not yet been established. This final group includes the TATA-like and Initiator-like sequences enriched in cone and rod promoters, respectively (see below for more details).</p>
         <sec>
            <st>
               <p>RXR and En2 binding motifs in promoters of cone-specific genes</p>
            </st>
            <p>Previous microarray studies suggest that at least 4 transcription factors (RXR&#947;, En2, Sall3, and Prdm1) are more active in cone photoreceptors than rods <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The role of RXR is supported by additional biochemical studies which demonstrate that RXR&#947; plays a vital role in patterning cone photoreceptors in response to signaling by thyroid hormone receptor &#946;2 <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. The RXR prediction is shown in figure <figr fid="F4">4</figr>, position independent, row 3. Functional RXR <it>cis-</it>elements often contain a degenerate repeat of the invariant core in close proximity <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Therefore, we examined the promoter sequences surrounding predicted RXR sites for degenerate variations of the putative core selected by IAMMS. Out of 5 sites, 4 contain an additional occurrence of G(N [0&#8211;2])TCA within 4 bp of the recognized site (see the image in additional file <supplr sid="S3">3</supplr> online). This is very unlikely to occur by chance (p~3.7 &#215; 10<sup>-4</sup>), further increasing our confidence that the predicted motif binds RXR-family transcription factors.</p>
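            <p>The neighborhood search described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names, the 0-based half-open coordinates, and the reading of "within 4 bp" as a gap of at most 4 bases between the predicted site and a non-overlapping copy of the degenerate core (G, 0 to 2 arbitrary bases, then TCA) are all assumptions.</p>
```python
def degenerate_core_hits(seq):
    """Return (start, end) for every match of G + 0-2 arbitrary bases + TCA."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        if seq[start] != "G":
            continue
        for spacer in range(3):  # 0, 1 or 2 bases between G and TCA
            end = start + 1 + spacer + 3
            if seq[start + 1 + spacer:end] == "TCA":
                hits.append((start, end))
    return hits


def nearby_hits(seq, site_start, site_end, max_gap=4):
    """Keep core hits that do not overlap the predicted site and lie
    at most max_gap bases away from it (hypothetical distance rule)."""
    nearby = []
    for start, end in degenerate_core_hits(seq):
        if site_start >= end:
            gap = site_start - end      # hit lies upstream of the site
        elif start >= site_end:
            gap = start - site_end      # hit lies downstream of the site
        else:
            continue                    # overlaps the predicted site itself
        if max_gap >= gap:
            nearby.append((start, end))
    return nearby
```
            <p>Here the predicted site is passed as a pair of indices into the promoter string; a promoter with a second core copy two bases downstream of the site returns that copy, whereas one eight bases away returns nothing.</p>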
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>Additional figure 1. Region surrounding predicted IL-6 sites in 5 rod promoters. Sequences identified by IAMMS are shaded in gray; copies of the core (including the degenerate copy CTGGA) are outlined in black.</p>
               </text>
               <file name="1471-2105-8-407-S3.png">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The En2-like motif is shown in figure <figr fid="F4">4</figr> cone-enriched position dependent row 9. The prediction includes the central portion of the optimal En2 homeodomain transcription factor consensus (T<ul>AATT</ul>A) detected by <it>in vitro </it>selection experiments <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. While a corresponding motif was detected only by IAMMS (figure <figr fid="F7">7</figr>, last cone-enriched row), occurrences of the Engrailed-like prediction are highly conserved through evolution (-1.35) suggesting that the motif is functional. A similar prediction (cone position dependent, row 10) contains many of the same occurrences, but shifts the core by ~2 bp and adds an additional A to the 3' end. Like the first prediction, it is also highly conserved through evolution, centered in the same region, and cone-specific. Moreover, this second prediction was also detected by DME (figure <figr fid="F7">7</figr>, cone-enriched, 3<sup>rd </sup>row from bottom). If validated experimentally, occurrences of this AATT motif will be the first reported promoter binding sites for En2.</p>
            <p>We were unable to find binding sites for Sall3 or Prdm1 in the experimental literature. Some of the unidentified motifs predicted in this study (group 1) may correspond to binding sites for these transcription factors. Future experimental studies will be required to discover any correspondence between these transcription factors and motifs predicted in this study.</p>
         </sec>
         <sec>
            <st>
               <p>IL-6 Binding Motif in Promoters of Rod-Specific Genes</p>
            </st>
            <p>One of the rod-specific predictions, detected by both IAMMS and BioProspector, is similar to an IL-6 effector (figure <figr fid="F7">7</figr>, row marked IL-6). This is interesting in the context of recent findings that in rodents, signaling by IL-6 family members CNTF and LIF can block the formation of rod photoreceptors during development <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. According to the literature, peak IL-6 effector activity is obtained by the invariant core (CTGGGAA) and another degenerate occurrence (CTGGAA) appearing nearby <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Our prediction corresponds to the first invariant core (CTGGGA). To determine if a degenerate occurrence appeared nearby, we took rod-enriched IL-6-like predictions and searched the nearby promoter sequence for elements similar to the core. We found that 4 rod-specific promoter sequences, including <it>Pde6b</it>, <it>Gnat1</it>, <it>Pde6d</it>, and <it>Rho </it>contain an exact copy of either the degenerate sequence or the high-affinity core binding sequence within 50 bp of a predicted occurrence (see the image in additional file <supplr sid="S4">4</supplr>, as well as additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr> for more IL-6 like predictions). It is interesting that in chick, where artificial IL-6 stimulation increases the number of rod photoreceptors <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, only one orthologous promoter (of all those in table <tblr tid="T1">1</tblr>) contains both the invariant core and a degenerate consensus within 50 bp of one another. The correspondence between empirical evidence and occurrences of IL-6-like motifs lends support for the biological relevance of the IL-6 prediction.</p>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Additional figure 2. The location of predicted RXR core binding sites (gray) and the adjacent degenerate core region (outline) in 4 cone promoters.</p>
               </text>
               <file name="1471-2105-8-407-S4.png">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The high-affinity core and a degenerate occurrence missing only the final A (i.e. CTGGA, also within 50 bp) were found in the <it>Nrl </it>promoter. This predicted site is likely to be significant considering the important role <it>Nrl </it>plays in rod photoreceptor differentiation <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. This observation suggests that one possible mechanism for IL-6 regulation of rod-differentiation involves suppression of the <it>Nrl </it>gene product.</p>
         </sec>
         <sec>
            <st>
               <p>Differences in the Core Promoter of Rod and Cone-Specific Genes</p>
            </st>
            <p>One of the most striking findings of this study is the difference in the core promoter region of rod- and cone-specific genes. We predicted several rod-enriched motifs centered on the transcription start site that are similar to characterized initiator consensus sequences, but found no cone-specific enrichment of degenerate initiator-like sequences. Conversely, a TATA-like motif was detected in almost all cone promoters near the appropriate position upstream of the transcription start site, whereas it was absent from rod promoters.</p>
            <p>Figure <figr fid="F4">4</figr> depicts 4 unique, rod-enriched motifs whose mean position lies near the transcription start with relatively low position variance. Three of these motifs are similar to portions of experimentally-validated initiators (INR-like), including the motifs in row 4 (aGGTCC) <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, row 6 (TCTGAG) <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, and row 11 (GCACAG) <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The fourth motif (ACAGTGa), in row 2, is also attributed to the initiator-like group because its antisense mismatches the accepted initiator consensus (YYANWYY) at only one position. More details of the TCTGAG motif are shown in the left panel of figure <figr fid="F8">8</figr>. We detected occurrences of this motif near the transcription start site in 5 rod promoters. The <it>Pde6a</it>, <it>Cnga1</it>, and <it>Sag </it>occurrences are on the -1 strand, and are consequently highly similar to the pyrimidine-rich initiator consensus (YYANWYY). The proximity of the 4 motifs to the annotated transcription start site, their phylogenetic conservation, as well as similarity to portions of experimentally characterized initiators suggests that these motifs may function as degenerate initiator sequences in rod-specific promoters.</p>
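            <p>The comparison of a motif's antisense strand with the initiator consensus can be checked with a short Python sketch. The helper names below are hypothetical; only the IUPAC ambiguity codes and the sequences ACAGTGa and YYANWYY come from the text and the standard nomenclature.</p>
```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(seq, consensus):
    """Count positions where seq is not allowed by the IUPAC consensus."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    pairs = zip(seq.upper(), consensus.upper())
    return sum(base not in IUPAC[code] for base, code in pairs)


antisense = reverse_complement("ACAGTGa")        # TCACTGT
mismatches = count_mismatches(antisense, "YYANWYY")
```
            <p>For ACAGTGa the antisense strand is TCACTGT, which departs from YYANWYY only at the sixth position (G where a pyrimidine is expected).</p>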
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Predicted rod (left) and cone (right) enriched motifs in the same format as figure 6</p>
               </caption>
               <text>
                  <p>Predicted rod (left) and cone (right) enriched motifs in the same format as figure 6. The rod motif is similar in sequence and mean position to the central portion of an initiator element, and the cone to a TATA box.</p>
               </text>
               <graphic file="1471-2105-8-407-8"/>
            </fig>
            <p>In cone promoters we found a different core promoter element, ATAA, a motif similar to the central portion of a TATA box (see Figure <figr fid="F4">4</figr> and Figure <figr fid="F7">7</figr>). One such prediction is depicted in the right panel of figure <figr fid="F8">8</figr>. Occurrences of this particular motif are found in 4 cone promoters, <it>Arr3</it>, <it>Gnat2</it>, <it>Gnb3</it>, and <it>Pde6c </it>(figure <figr fid="F8">8</figr>, right panel). These occurrences are located between 20 and 45 bp upstream of the transcription start site, close to the typical position of a TATA-box <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, supporting the idea that it is, indeed, a degenerate variation on the TATA consensus. A high degree of phylogenetic conservation of this motif and corroboration by both DME and BioProspector further support the biological relevance of this prediction.</p>
            <p>Many of the ATAA occurrences contain an additional T at the beginning of the motif, making them even closer to the classic TATA-consensus. We conducted a search for TATA-like sequences in the core promoter of rod and cone genes. It is interesting that the sequence TATAA (or its antisense) appears in 7 cone promoters (<it>Opn1mw</it>, <it>Opn1sw</it>, <it>Cngb3</it>, <it>Arr3</it>, <it>Pde6c</it>, <it>Smug1</it>, and <it>Cnga3</it>) between -180 and +60 relative to the transcription start site. Conversely, this sequence is only found in two (<it>Rho </it>and <it>Cnga1</it>) out of 11 rod promoters. The enrichment of the TATAA sequence in cones, although not as pronounced as ATAA, lends further support to the idea that the cone promoters studied here are initiated by a TATA box. It is notable that except for <it>Elovl2 </it>and <it>Pde6h</it>, an occurrence of either sATAAgw or TATAA is present near the transcription start site in all cone-specific promoters.</p>
            <p>Experimental evidence supports the biological relevance of the ATAA prediction, regardless of whether it is, indeed, a degenerate TATA-box. A recent experimental study deleted two occurrences of TATA-like motifs from the <it>Arr3 </it>promoter <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> and observed that the previously cone-specific promoter drove transgene expression in rods as well. In light of our predictions, we suggest that a TATA or TATA-like motif in the core promoter plays a central role in the differences between rod and cone expression patterns.</p>
         </sec>
         <sec>
            <st>
               <p>Limitations</p>
            </st>
            <p>The fact that the number of genes specifically expressed in either rod or cone photoreceptors is rather small makes the application of <it>de novo </it>motif discovery approaches that heavily rely on statistical analysis difficult. Because of this consideration, we took two independent approaches designed to increase the accuracy of our results. First, we employed a large number of non-photoreceptor genes as a negative control, and evaluated enrichment of motifs in either rod or cone promoters relative to this large dataset. Second, we applied 3 motif discovery software packages that use different algorithms to identify motifs. While we do not filter motifs that are identified by only one algorithm from our final database, we do provide a table of overlaps (figure <figr fid="F7">7</figr>) as additional information that can be used to evaluate predictions. Together, these two approaches should minimize both false positive and false negative errors.</p>
            <p>In the present study, we selected promoters based on ENSEMBL annotated transcription start sites. However, recent reports suggest that two separate ambiguities exist in transcription start site annotations. One, the so-called "broad" class of transcription start sites, represents inherent local variation over 50&#8211;100 bp <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B39">39</abbr></abbrgrp>. This complication should not have a major impact on the quality of our predictions. Since hierarchical clustering automatically chooses the mode and variance of a motif's position distribution relative to the transcription start site separately for each motif, IAMMS should turn up the same predictions with some additional position variance.</p>
            <p>A second ambiguity is the recent observation that a majority of genes are driven by two or more alternative promoter sequences <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B39">39</abbr></abbrgrp>. To determine the relevance of this finding for our study, we searched the database of transcription start sites (DBTSS) for genes used in this study, and found only 3 genes (<it>Nr2e3</it>, <it>Gngt2</it>, and <it>Pde6h</it>) that contain potential alternate transcription start sites far from the ENSEMBL annotation, and 4 additional genes (<it>Pde6d</it>, <it>Pde6b</it>, <it>Gnb3</it>, and <it>Elovl2</it>) that contain alternate transcription start sites within 200 bp <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. However, in all of these cases the alternate transcripts were identified in non-retinal tissue, and therefore the alternate start sites do not pertain to our present application. In addition, 8/23 promoters that we selected for analysis are validated experimentally (<it>Arr3 </it><abbrgrp><abbr bid="B38">38</abbr><abbr bid="B41">41</abbr></abbrgrp>, <it>Pde6c</it><abbrgrp><abbr bid="B42">42</abbr></abbrgrp>, <it>Opn1mw </it><abbrgrp><abbr bid="B24">24</abbr><abbr bid="B43">43</abbr></abbrgrp>, <it>Pde6a</it><abbrgrp><abbr bid="B44">44</abbr></abbrgrp>, <it>Gnat2</it><abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, <it>Sag </it><abbrgrp><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>, <it>Rho</it><abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, and <it>Nrl</it><abbrgrp><abbr bid="B48">48</abbr></abbrgrp>). Those considerations expressed above give us confidence in the promoter regions selected for this study.</p>
            <p>This study did not detect motifs corresponding to two transcription factors known to be enriched in rod-photoreceptors compared to cones, namely Mef2c <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and NR2E3 <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. One cause of this omission could be the multiple severe constraints in our selection criteria, which were introduced to maximally reduce the rate of false positive predictions. In the case of NR2E3, another potential reason is that NR2E3 may not bind DNA directly <it>in vivo</it>. Rather, recent findings suggest that NR2E3 regulates expression indirectly by interactions with CRX <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. If NR2E3 does not bind DNA directly, it is not surprising that we do not identify a motif for this transcription factor.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Using a panel of three motif discovery algorithms (IAMMS, DME, and BioProspector), we predict 34 putative <it>cis</it>-elements that may be vital for maintaining either rod or cone gene expression patterns. Our predictions include many previously unknown motifs, known <it>cis</it>-elements involved in maintaining the differences between rod and cone expression patterns, as well as binding sites for transcription factors with no known photoreceptor function. Our most important predictions include specific sites for RXR and Engrailed family members (enriched in cone promoters) and IL-6 effectors (enriched in rod promoters). We predict differences in the core promoter between rod and cone phototransduction genes. While rod promoters are enriched in putative initiator-like motifs and are TATA-less, cone promoters are enriched in TATA-like motifs. To simplify access to our findings, we provide an on-line database containing detailed information about the exact position of the motifs with respect to the transcription start and their phylogenetic conservation (additional files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr>).</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Building photoreceptor-specific list</p>
            </st>
            <p>Genes in the photoreceptor-specific list (table <tblr tid="T1">1</tblr>) were selected as follows. Cone genes (except <it>Cnga3</it>) were selected using microarray data from NRL or NR2E3-knockout mouse retina that are known to produce a rod-deficient phenotype <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B15">15</abbr></abbrgrp>. We included <it>Cnga3</it>, which was found to be cone-specific by experimental studies <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Rod genes <it>Sag </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, <it>Pde6d </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, <it>Nrl </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, <it>Nr2e3 </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and <it>Gnb1 </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp> were previously observed to be expressed in rod but not cone photoreceptors by biochemical studies <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. The remaining rod genes (<it>Rho</it>, <it>Pde6a</it>, <it>Pde6g</it>, <it>Pde6b</it>, <it>Gnat1</it>, and <it>Cnga1</it>) were selected based on microarray data comparing FACS sorted rods to a model of cone-photoreceptors <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. The latter involves FACS sorting cells expressing GFP under the control of the NRL promoter in NRL knockout mice, and is demonstrated to be a good model for cone photoreceptors <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. To pick the rod genes, raw CEL files for 4 normal and 4 NRL knockout animals were obtained from the Gene Expression Omnibus website. The data were MAS5 normalized and averaged using the Bioconductor package <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. We selected genes involved in the phototransduction pathway that were significantly down-regulated in the <it>Nrl </it>knockout samples (p &lt; 0.02; Student's t-test).</p>
            <p>For each gene in table <tblr tid="T1">1</tblr>, 2 kb of sequence upstream of the annotated transcription start site and the entire 5' UTR of the mouse was obtained from ENSEMBL (Mouse v.36, Aug. 2005). Two genes (<it>Nrl </it>and <it>Gngt2</it>) contained two annotated transcription start sites within 1000 bp of each other, and each promoter contained a UTR. In both cases, the promoter closer to the translation start site was chosen. This choice effectively included the region immediately upstream of each transcription start site; for <it>Nrl</it>, this choice corresponded to an experimental study <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Selecting background promoter set</p>
            </st>
            <p>We constructed a background sequence set from genes that are not expressed in either rods or cones, but are expressed in most tissues, in a tissue independent manner. To construct this background set, we first identified all genes that are not expressed in adult rod or cone photoreceptors. To do this, we used the MAS5 normalized FACS sorted microarray data obtained in the previous section. From these data, we obtained a list of REFSEQ IDs for which all associated probes were marked absent by the Affymetrix perfect match/mismatch designation.</p>
            <p>To evaluate tissue specificity, microarray data from the mouse gene-atlas <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> was obtained from NCBI's Gene Expression Omnibus (GSE1133) using R's Bioconductor plugin <abbrgrp><abbr bid="B52">52</abbr></abbrgrp> and GEOquery <abbrgrp><abbr bid="B54">54</abbr></abbrgrp> packages. Average expression of each gene in each tissue was calculated, and probe sets were converted to REFSEQ IDs. Next, we calculated the ratio of maximum expression to the sum of expression in all tissues. This tissue-specificity ratio ranged between 0.02 (nearly equal expression between all tissues) and 0.98 (highly specific to one tissue). For the background set, we selected all REFSEQ IDs for genes with a ratio less than 0.03 that are also absent from both adult rod and cone photoreceptors (n= 84). For all of these genes, 2 kb of upstream sequence and the entire 5' UTR was obtained using ENSEMBL's BioMart.</p>
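            <p>The tissue-specificity ratio defined above (maximum expression divided by the sum of expression over all tissues) is simple enough to illustrate directly. The sketch below is not the original R code; the function names, the dictionary layout, and the toy identifiers in the usage note are hypothetical, and only the 0.03 cutoff comes from the text.</p>
```python
def tissue_specificity_ratio(expression_by_tissue):
    """Maximum expression divided by the summed expression over all tissues.

    With N tissues the ratio is 1/N for perfectly uniform expression and
    approaches 1 when a single tissue dominates.
    """
    values = list(expression_by_tissue.values())
    total = sum(values)
    if total == 0:
        raise ValueError("gene has no measurable expression in any tissue")
    return max(values) / total


def select_background(ratios, absent_in_photoreceptors, cutoff=0.03):
    """Return IDs with a ratio below the cutoff that are also absent
    from rod and cone photoreceptors (second argument is a set of IDs)."""
    return sorted(
        gene_id
        for gene_id, ratio in ratios.items()
        if cutoff > ratio and gene_id in absent_in_photoreceptors
    )
```
            <p>For example, a gene expressed equally in four tissues has a ratio of 0.25, so a ratio near the reported minimum of 0.02 implies roughly uniform expression across about fifty tissues.</p>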
         </sec>
         <sec>
            <st>
               <p>IAMMS procedure</p>
            </st>
            <p>The flowchart of IAMMS is depicted in figure <figr fid="F1">1</figr>. The input for the algorithm consists of 3 sets of promoter sequences, including 11 rod-specific sequences, 12 cone-specific sequences, and an additional set of background sequences that do not drive expression in photoreceptors. All promoters were passed through an iterative alignment procedure (step 1) that returns a motif for each sequence &#8805; 8 bp in length that appears more than once in photoreceptor promoters. The resulting database of potential motifs was scanned for sequences similar to a known simple or interspersed repeat sequence (step 2). Motifs were evaluated for cell-specificity using a binomial model of enrichment (statistical annotation, step 3) to create predictions for cell-specific motifs. Ambiguous motifs were filtered to extract sets of position-enriched occurrences using an agglomerative clustering procedure (step 4). Position-enriched clusters were subsequently analyzed using statistical annotation (step 5) to create a set of position-specific predictions. Both position-enriched and non-enriched predictions were subsequently analyzed by phylogenetic analysis (step 6) and were compared to known <it>cis</it>-elements (step 7).</p>
         </sec>
         <sec>
            <st>
               <p>Step 1: Iterative alignment</p>
            </st>
            <p>All sequences &#8805; 8 bp in length that appear more than once in rod and cone photoreceptor promoters were identified using the BLAST implementation distributed by Washington University <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. We used scoring parameters that were observed to return short, nearly exact matches: +2 for a match, -3 for a mismatch, and a threshold bit score of 16. Gaps were allowed, but with the default gap score of -20 they rarely appeared (a gap is impossible in any sequence pair with fewer than 28 bp of matching sequence flanking the gap). After completion, we separated pairwise alignments into a database of individual sequences. We filtered this database, so that each sequence occurs exactly once. This database contains each sequence &#8805; 8 bp in length that occurs at least twice in the photoreceptor promoter set. Next, we constructed a multiple alignment for each sequence returned in the pairwise alignment. In the second iteration, we scanned all promoter sequences (rod, cone, and 84 background promoters) using each sequence identified in the previous database. BLAST was run using the same match and mismatch parameters as the first iteration, but the threshold bit score was changed for each sequence. To calculate the threshold bit score, we multiplied the sequence length by 1.4 (we also tried a variety of constants between 1.3 and 1.7). The output of this step is a series of multiple alignments &#8211; one alignment for each sequence occurring twice in photoreceptor promoters.</p>
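            <p>To give a feel for how strict the length-dependent threshold is, the following Python sketch computes the threshold (1.4 times the sequence length) and the number of mismatches an ungapped, full-length alignment could tolerate under +2/-3 scoring. It rests on a simplifying assumption that may not hold for WU-BLAST: the threshold is treated as if it were on the same scale as the raw match/mismatch score. The function names are hypothetical.</p>
```python
def second_pass_threshold(length, constant=1.4):
    """Length-dependent threshold used in the second iteration."""
    return constant * length


def tolerated_mismatches(length, constant=1.4, match=2, mismatch=-3):
    """Largest number of mismatches for which an ungapped alignment over
    the full sequence still reaches the threshold.

    Assumption: the raw +2/-3 score is compared directly with the
    threshold, ignoring any raw-to-bit score conversion.
    """
    threshold = second_pass_threshold(length, constant)
    best = None
    for n_mismatch in range(length + 1):
        score = match * (length - n_mismatch) + mismatch * n_mismatch
        if score >= threshold:
            best = n_mismatch
    return best
```
            <p>Under this assumption an 8 bp sequence must match exactly, a 10 bp sequence tolerates one mismatch, and a 20 bp sequence tolerates two, which is consistent with the stated aim of returning short, nearly exact matches.</p>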
         </sec>
         <sec>
            <st>
               <p>Step 2: Masking Longer Sequences to RepBase</p>
            </st>
            <p>Longer sequences were evaluated to examine potential similarities to known repeats. BLAST was used to compare each sequence &#8805; 20 bp in length to all mouse repeats represented in RepBase <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> v.11.07. A scoring scheme of +2 (match), -3 (mismatch), and 20 (threshold) was used.</p>
         </sec>
         <sec>
            <st>
               <p>Step 3: Statistical Annotation &amp; Bonferroni Correction</p>
            </st>
            <p>Let <it>k</it><sub><it>r </it></sub>and <it>k</it><sub><it>c </it></sub>be the number of rod- and cone-specific promoters that contain at least one occurrence in a motif with <it>n </it>occurrences. We take the p-value of cell-specific enrichment to be the binomial probability of selecting a list that contains <it>n </it>sequences, of which <it>k</it><sub><it>r </it></sub>or <it>k</it><sub><it>c </it></sub>are mapped to rod- or cone-specific promoters. The probability of selecting one rod/cone-specific promoter is the number of rod or cone promoters divided by the total number of promoters (11/107 = 0.103 for rods).</p>
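            <p>A minimal Python sketch of the binomial enrichment calculation follows. The per-promoter probability for rods (11/107) is taken from the text, and the cone value (12/107) follows from the 12 cone promoters among the same 107. Whether the original p-value is the point probability or the upper tail is not stated; the upper tail used here is an assumption, and the function names are hypothetical.</p>
```python
from math import comb

ROD_FRACTION = 11 / 107    # about 0.103, as given in the text
CONE_FRACTION = 12 / 107   # 12 cone promoters out of the same 107


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def enrichment_p_value(k, n, p):
    """Upper-tail probability of k or more successes in n trials.

    Assumption: the tail is used as the enrichment p-value; the text
    only specifies a binomial probability.
    """
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))
```
            <p>For instance, enrichment_p_value(5, 5, ROD_FRACTION) gives the probability that all five members of a five-element list map to rod promoters by chance, which equals ROD_FRACTION raised to the fifth power.</p>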
            <p>Due to overlap between different motif core regions, we observed underdispersion relative to the binomial model described above. We corrected the p-value by Z-score normalizing across all groups with a given number of occurrences using an empirically derived mean and standard deviation. To perform the Bonferroni correction, we multiplied the corrected p-value by the total number of sequences considered for cell-specific expression (38,779), not counting motifs with fewer than 4 occurrences in photoreceptor promoters or motifs similar to repeat sequences longer than 20 bp. For the sake of simplicity our calculations do not take into account dependence between motifs with highly similar core regions, and are therefore highly conservative. This Bonferroni corrected expected false positive rate is referred to as the E-value. We select all motifs with an enrichment E-value less than 2.5 in one photoreceptor cell type, and also require that no occurrences are found in the alternate photoreceptor cell type.</p>
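            <p>The Bonferroni step and the final selection rule can be written compactly, as in the Python sketch below. The number of tests (38,779) and the E-value cutoff (2.5) come from the text; the function names are hypothetical, and the Z-score correction for underdispersion is deliberately left out because its empirical parameters are not given here.</p>
```python
N_TESTS = 38779  # sequences considered for cell-specific expression


def e_value(corrected_p, n_tests=N_TESTS):
    """Bonferroni-style expected number of false positives."""
    return corrected_p * n_tests


def keep_motif(corrected_p, occurrences_in_other_cell_type, cutoff=2.5):
    """Selection rule from the text: E-value below the cutoff in one cell
    type and no occurrences at all in the alternate cell type."""
    below_cutoff = cutoff > e_value(corrected_p)
    return below_cutoff and occurrences_in_other_cell_type == 0
```
            <p>With these numbers a corrected p-value must fall below roughly 6.4e-5 (2.5 divided by 38,779) for a motif to be retained.</p>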
         </sec>
         <sec>
            <st>
               <p>Step 4: Agglomerative Hierarchical Clustering</p>
            </st>
            <p>Single-link hierarchical agglomerative clustering was performed on the distance to the transcription start site for each motif. Motifs are broken into the minimum number of clusters when the mean inter-cluster variance reaches 1% of the total variance. To ensure that the data contain an underlying structure that can be described by clusters, only motifs with an agglomerative coefficient greater than or equal to 0.7 were analyzed using agglomerative clustering. All computations were performed using the R-cluster package <abbrgrp><abbr bid="B57">57</abbr></abbrgrp> (v.2.3.0). Samples of clusters selected by this procedure are shown in figure <figr fid="F5">5A</figr>. In addition to single-link clustering, the Euclidean and Ward's algorithms in the R-cluster package were tested, though all reported results are based on single-link clustering.</p>
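            <p>As a rough illustration of single-link clustering on one-dimensional positions, the sketch below cuts the dendrogram at a fixed linkage distance. The fixed cutoff (<it>max_gap</it>) is a hypothetical stand-in for the variance-based stopping rule, the agglomerative coefficient filter is not reproduced, and the example distances are invented; the analysis itself was carried out with the R-cluster package.</p>
            <p><![CDATA[
```python
def single_link_clusters(positions, max_gap):
    """Single-link clustering of 1-D positions, cut at linkage distance max_gap.

    In one dimension, single linkage merges neighbours in sorted order, so
    cutting the dendrogram at max_gap is the same as splitting the sorted
    positions wherever two neighbours are more than max_gap apart.
    """
    ordered = sorted(positions)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_gap:
            clusters.append([current])    # gap too wide: start a new cluster
        else:
            clusters[-1].append(current)  # close enough: extend the current cluster
    return clusters

# Hypothetical distances (bp) from the transcription start site for one motif.
distances = [-35, -42, -38, -310, -295, -120]
print(single_link_clusters(distances, max_gap=20))
# prints [[-310, -295], [-120], [-42, -38, -35]]
```
]]></p>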
         </sec>
         <sec>
            <st>
               <p>Step 5: Statistical Annotation</p>
            </st>
            <p>Each position-enriched cluster with occurrences in the first 400 bp upstream of the transcription start site was analyzed for rod/cone specificity using the statistical annotation procedure described in step 3.</p>
         </sec>
         <sec>
            <st>
               <p>Step 6: Phylogenetic Analysis</p>
            </st>
            <p>To compare the conservation of a putative mouse <it>cis</it>-element to other predictions, we applied a recently described model <abbrgrp><abbr bid="B58">58</abbr></abbrgrp> involving a comparison between the actual number of substitutions in the predicted <it>cis</it>-element and the expected rate of conservation between all species in the alignment. First, alignments corresponding to promoters of interest were extracted from existing whole-genome alignments using the UCSC genome browser <abbrgrp><abbr bid="B59">59</abbr></abbrgrp> (mm8 version). Raw alignments corresponding to each mouse promoter were obtained as different regions known as alignment blocks. We defined the cross-species conservation score (CSCS) as the Z-score of the number of substitutions in our sequence of interest, relative to all surrounding windows of the same size in the same local alignment block. The mean and standard deviation were calculated from a sliding window (the same size as the prediction) moved across the alignment block. Negative scores mean that fewer substitutions than average are found in the window (i.e. the sequence is conserved); positive scores mean that there are more. Sequences corresponding to gaps where no alignment is available between the mouse sequence and other species were not included in the analysis. Scores for these sequences were left blank in the figures and additional data files.</p>
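            <p>A minimal sketch of the CSCS calculation is given below. It assumes that substitutions have already been counted per alignment column, that every window in the block (including the window of interest) contributes to the mean and standard deviation, and that the population standard deviation is used; none of these details is specified in the text. The function name and the example counts are hypothetical.</p>
            <p><![CDATA[
```python
from statistics import mean, pstdev

def cscs(substitutions, start, width):
    """Z-score of the substitution count in one window relative to every
    same-width sliding window in the alignment block."""
    window_sums = [sum(substitutions[i:i + width])
                   for i in range(len(substitutions) - width + 1)]
    spread = pstdev(window_sums)
    if spread == 0:
        return None  # all windows identical: the score is undefined
    return (window_sums[start] - mean(window_sums)) / spread

# Hypothetical per-column substitution counts across species for one block.
block = [3, 2, 3, 0, 0, 1, 0, 3, 2, 3]
score = cscs(block, start=3, width=4)
print(round(score, 2))  # negative: fewer substitutions than the local average
```
]]></p>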
         </sec>
         <sec>
            <st>
               <p>Step 7: Comparison to known motifs</p>
            </st>
            <p>Rod/cone-specific motifs were compared to a database of known motif and consensus sequences using the <ul>T</ul>ranscriptional <ul>E</ul>lement <ul>S</ul>earch <ul>S</ul>ystem <abbrgrp><abbr bid="B60">60</abbr></abbrgrp> (TESS). Searches for each list were performed on the sequence returned in the first BLAST iteration using the TESS combined search option. Log-likelihood score filtering was used for both string and weight matrix functions. The minimum log-likelihood ratio was set to 12 for string scoring and 10 for weight matrix scoring. Matches were visually inspected, and deemed to be similar only if nearly identical in the invariant core region. TESS missed several regulatory sequences available in the experimental literature, many of which have special relevance to a photoreceptor system. These similarities were annotated by hand and are included in figures <figr fid="F4">4</figr> and <figr fid="F7">7</figr>.</p>
         </sec>
         <sec>
            <st>
               <p>Comparison between predictions</p>
            </st>
            <p>Predictions highly similar in the core region but differing in ambiguous peripheral positions (for example, see figure <figr fid="F7">7</figr>, rows 8&#8211;10 and row 7 antisense) were grouped prior to the creation of figure <figr fid="F4">4</figr> and the counting of predictions for the text. Sequences were grouped according to the methods described in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. A single vector was created by concatenating column vectors from the position-weight-matrix representation of a motif. The maximum Pearson coefficient over each possible alignment between two separate motifs (including sense-antisense) was subsequently calculated. When comparing motifs of different lengths, the shorter motif was compared against the longer to emphasize similarities in the invariant core region. During this step, overhangs were filled in with P(A) = P(T) = P(C) = P(G) = 0.25. Motifs were considered highly similar if the Pearson coefficient was greater than 0.85.</p>
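            <p>The motif comparison can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used: each position-weight matrix is a list of columns ordered A, C, G, T; the shorter motif is slid only within the bounds of the longer one; and positions of the longer motif not covered by the shorter are padded with uniform columns. All function names and the example motifs are hypothetical.</p>
            <p><![CDATA[
```python
from math import sqrt

UNIFORM = [0.25, 0.25, 0.25, 0.25]  # P(A) = P(C) = P(G) = P(T), from the text

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0

def reverse_complement(pwm):
    """Antisense PWM: reverse the columns and swap A with T and C with G."""
    return [col[::-1] for col in reversed(pwm)]

def flatten(pwm):
    """Concatenate the column vectors into a single vector."""
    return [value for col in pwm for value in col]

def motif_similarity(pwm_a, pwm_b):
    """Maximum Pearson coefficient over all offsets and both strands."""
    short, long_ = sorted((pwm_a, pwm_b), key=len)
    best = -1.0
    for candidate in (short, reverse_complement(short)):
        for offset in range(len(long_) - len(short) + 1):
            tail = len(long_) - len(short) - offset
            padded = [UNIFORM] * offset + candidate + [UNIFORM] * tail
            best = max(best, pearson(flatten(padded), flatten(long_)))
    return best

# Hypothetical one-hot motifs: GTT is the reverse complement of AAC.
A, C, G, T = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
aac, gtt = [A, A, C], [G, T, T]
print(motif_similarity(aac, gtt) > 0.85)  # prints True
```
]]></p>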
         </sec>
         <sec>
            <st>
               <p>DME/BioProspector procedure</p>
            </st>
            <p>The same promoters used for IAMMS were repeat masked <abbrgrp><abbr bid="B61">61</abbr></abbrgrp> using the following settings: wublast algorithm, DNA source set to mouse, and default sensitivity. After masking, sequences were trimmed to include only 500 bp of upstream sequence and 100 bp of UTR, when available. Each software program was run with default settings, except that the number of motifs to return was set to 10, and the motif size was varied between 6 and 10 (for DME), or 6 and 12 (for BioProspector). When identifying cone-specific motifs, cone promoters were compared against the combined set of rod-specific and background promoters (and vice versa). Results for each run were added into the same table and sorted by the score given by each program. Output motifs were compared to predictions made by IAMMS using the method described in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and summarized above (see <it>Comparison between predictions </it>in Methods section).</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>List of abbreviations</p>
         </st>
         <p>CSCS: Cross Species Conservation Score</p>
         <p>DBTSS: Database of Transcription Start Sites</p>
         <p>FACS: Fluorescence-Activated Cell Sorting</p>
         <p>IAMMS: Iterative Alignment/Modular Motif Selection</p>
         <p>RXR: Retinoid X Receptor</p>
         <p>TESS: Transcriptional Element Search System</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>CGD assisted in the design and implementation of IAMMS, applied IAMMS to photoreceptor promoter regions, conducted analysis of results based on experimental work, conducted microarray analysis, and wrote the paper. VAM constructed the initial photoreceptor-specific list and assisted in analysis of the results. MQ assisted in the implementation of IAMMS and some web-based tools to distribute predictions. BEK assisted in the construction of the photoreceptor-specific list and in the analysis of the results. AMP assisted in the design of IAMMS, contributed enormously to writing the manuscript, and provided thoughtful discussion. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Mike Zuber, Andrea Viczian, and Rebecca Smith for reading this manuscript and providing valuable comments. This work was funded by NIH grant number 5R01HL07163504; National Eye Institute (NIH) grant numbers EY11256, EY12975 and EY016644; Research to Prevent Blindness (Unrestricted Grant to SUNY UMU Department of Ophthalmology); and Lions of CNY.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Identification of regulatory targets of tissue-specific transcription factors: application to retina-specific gene regulation</p>
            </title>
            <aug>
               <au>
                  <snm>Qian</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Esumi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Chowers</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Zack</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>3479</fpage>
            <lpage>91</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1153713</pubid>
                  <pubid idtype="pmpid" link="fulltext">15967807</pubid>
                  <pubid idtype="doi">10.1093/nar/gki658</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Expression profiling of the developing and mature Nrl-/- mouse retina: identification of retinal disease candidates and transcriptional regulatory targets of Nrl</p>
            </title>
            <aug>
               <au>
                  <snm>Yoshida</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mears</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Carter</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Jing</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Farjo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fleury</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Barlow</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hero</snm>
                  <fnm>AO</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2004</pubdate>
            <volume>13</volume>
            <fpage>1487</fpage>
            <lpage>503</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddh160</pubid>
                  <pubid idtype="pmpid" link="fulltext">15163632</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Evolution of transcription factor binding sites in Mammalian gene regulatory regions: conservation and turnover</p>
            </title>
            <aug>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>1114</fpage>
            <lpage>21</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12082130</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Large-scale turnover of functional transcription factor binding sites in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Moses</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Pollard</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Nix</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>XY</fnm>
               </au>
               <au>
                  <snm>Biggin</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <fpage>e130</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1599766</pubid>
                  <pubid idtype="pmpid" link="fulltext">17040121</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0020130</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Evolution of alternative transcriptional circuits with identical logic</p>
            </title>
            <aug>
               <au>
                  <snm>Tsong</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Tuch</snm>
                  <fnm>BB</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>AD</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>443</volume>
            <fpage>415</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature05099</pubid>
                  <pubid idtype="pmpid" link="fulltext">17006507</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments</p>
            </title>
            <aug>
               <au>
                  <snm>Pollard</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Moses</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>376</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1613255</pubid>
                  <pubid idtype="pmpid" link="fulltext">16904011</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-376</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Identifying tissue-selective transcription factor binding sites in vertebrate promoters</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Sumazin</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>1560</fpage>
            <lpage>5</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">547828</pubid>
                  <pubid idtype="pmpid" link="fulltext">15668401</pubid>
                  <pubid idtype="doi">10.1073/pnas.0406123102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2001</pubdate>
            <fpage>127</fpage>
            <lpage>38</lpage>
            <xrefbib>
               <pubid idtype="pmpid">11262934</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Comparative promoter analysis allows de novo identification of specialized cell junction-associated proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Cohen</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Klingenhoff</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Boucherot</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nitsche</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Henger</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Brunner</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Schmid</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Merkle</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Saleem</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>KP</fnm>
               </au>
               <au>
                  <snm>Werner</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Grone</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Kretzler</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>5682</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1421338</pubid>
                  <pubid idtype="pmpid" link="fulltext">16581909</pubid>
                  <pubid idtype="doi">10.1073/pnas.0511257103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Integrating regulatory motif discovery and genome-wide expression analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Conlon</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>XS</fnm>
               </au>
               <au>
                  <snm>Lieb</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>3339</fpage>
            <lpage>44</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">152294</pubid>
                  <pubid idtype="pmpid" link="fulltext">12626739</pubid>
                  <pubid idtype="doi">10.1073/pnas.0630591100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>XS</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2002</pubdate>
            <volume>20</volume>
            <fpage>835</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12101404</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>A survey of motif discovery methods in an integrated framework</p>
            </title>
            <aug>
               <au>
                  <snm>Sandve</snm>
                  <fnm>GK</fnm>
               </au>
               <au>
                  <snm>Drablos</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Biol Direct</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <fpage>11</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1479319</pubid>
                  <pubid idtype="pmpid" link="fulltext">16600018</pubid>
                  <pubid idtype="doi">10.1186/1745-6150-1-11</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>In silico representation and discovery of transcription factor binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>Pavesi</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mauri</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pesole</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Brief Bioinform</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>217</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/5.3.217</pubid>
                  <pubid idtype="pmpid" link="fulltext">15383209</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kulbokas</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Mootha</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>434</volume>
            <fpage>338</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature03441</pubid>
                  <pubid idtype="pmpid" link="fulltext">15735639</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A hybrid photoreceptor expressing both rod and cone genes in a mouse model of enhanced S-cone syndrome</p>
            </title>
            <aug>
               <au>
                  <snm>Corbo</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Cepko</snm>
                  <fnm>CL</fnm>
               </au>
            </aug>
            <source>PLoS Genet</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <fpage>e11</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1186732</pubid>
                  <pubid idtype="pmpid" link="fulltext">16110338</pubid>
                  <pubid idtype="doi">10.1371/journal.pgen.0010011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Selective loss of cone function in mice lacking the cyclic nucleotide-gated channel CNG3</p>
            </title>
            <aug>
               <au>
                  <snm>Biel</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Seeliger</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pfeifer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kohler</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Gerstner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ludwig</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jaissle</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Fauser</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zrenner</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hofmann</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>7553</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">22124</pubid>
                  <pubid idtype="pmpid" link="fulltext">10377453</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.13.7553</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Expression of S-antigen in retina, pineal gland, lens, and brain is directed by 5'-flanking sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Breitman</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Tsuda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Usukura</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kikuchi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Zucconi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Khoo</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Shinohara</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1991</pubdate>
            <volume>266</volume>
            <fpage>15505</fpage>
            <lpage>10</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1714458</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Multiple phosphorylated isoforms of NRL are expressed in rod photoreceptors</p>
            </title>
            <aug>
               <au>
                  <snm>Swain</snm>
                  <fnm>PK</fnm>
               </au>
               <au>
                  <snm>Hicks</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mears</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Apel</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>John</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Hendrickson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Milam</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2001</pubdate>
            <volume>276</volume>
            <fpage>36824</fpage>
            <lpage>30</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M105855200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11477108</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Expression of photoreceptor-specific nuclear receptor NR2E3 in rod photoreceptors of fetal human retina</p>
            </title>
            <aug>
               <au>
                  <snm>Bumsted O'Brien</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Schulte</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hendrickson</snm>
                  <fnm>AE</fnm>
               </au>
            </aug>
            <source>Invest Ophthalmol Vis Sci</source>
            <pubdate>2004</pubdate>
            <volume>45</volume>
            <fpage>2807</fpage>
            <lpage>12</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1167/iovs.03-1317</pubid>
                  <pubid idtype="pmpid" link="fulltext">15277507</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Solubilization of membrane-bound rod phosphodiesterase by the rod phosphodiesterase recombinant delta subunit</p>
            </title>
            <aug>
               <au>
                  <snm>Florio</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Prusti</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Beavo</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1996</pubdate>
            <volume>271</volume>
            <fpage>24036</fpage>
            <lpage>47</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.271.39.24036</pubid>
                  <pubid idtype="pmpid" link="fulltext">8798640</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Transmembrane S1 mutations in CNGA3 from achromatopsia 2 patients cause loss of function and impaired cellular trafficking of the cone CNG channel</p>
            </title>
            <aug>
               <au>
                  <snm>Patel</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Bartoli</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Fandino</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Ngatchou</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Woch</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Carey</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Invest Ophthalmol Vis Sci</source>
            <pubdate>2005</pubdate>
            <volume>46</volume>
            <fpage>2282</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1167/iovs.05-0179</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980212</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Targeting of GFP to newborn rods by Nrl promoter and temporal expression profiling of flow-sorted photoreceptors</p>
            </title>
            <aug>
               <au>
                  <snm>Akimoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brzezinski</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Khanna</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Filippova</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Jing</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Linares</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zareparsi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mears</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Hero</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Glaser</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>3890</fpage>
            <lpage>5</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1383502</pubid>
                  <pubid idtype="pmpid" link="fulltext">16505381</pubid>
                  <pubid idtype="doi">10.1073/pnas.0508214103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Xenopus rhodopsin promoter. Identification of immediate upstream sequences necessary for high level, rod-specific transcription</p>
            </title>
            <aug>
               <au>
                  <snm>Mani</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Batni</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Whitaker</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Engbretson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Knox</snm>
                  <fnm>BE</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2001</pubdate>
            <volume>276</volume>
            <fpage>36557</fpage>
            <lpage>65</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M101685200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11333267</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Conserved cis-elements in the Xenopus red opsin promoter necessary for cone-specific expression</p>
            </title>
            <aug>
               <au>
                  <snm>Babu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>McIlvain</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Whitaker</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Knox</snm>
                  <fnm>BE</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2006</pubdate>
            <volume>580</volume>
            <fpage>1479</fpage>
            <lpage>84</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.febslet.2006.01.080</pubid>
                  <pubid idtype="pmpid" link="fulltext">16466721</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>A thyroid hormone receptor that is required for the development of green cone photoreceptors</p>
            </title>
            <aug>
               <au>
                  <snm>Ng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hurley</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Dierks</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Srinivas</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Salto</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Vennstrom</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Reh</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Forrest</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2001</pubdate>
            <volume>27</volume>
            <fpage>94</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11138006</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Making the gradient: thyroid hormone regulates cone opsin expression in the developing mouse retina</p>
            </title>
            <aug>
               <au>
                  <snm>Roberts</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Srinivas</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Forrest</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Morreale de Escobar</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Reh</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>6218</fpage>
            <lpage>23</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1458858</pubid>
                  <pubid idtype="pmpid" link="fulltext">16606843</pubid>
                  <pubid idtype="doi">10.1073/pnas.0509981103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Identification of deoxyribonucleic acid sequences that bind retinoid-X receptor-gamma with high affinity</p>
            </title>
            <aug>
               <au>
                  <snm>Dowhan</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Downes</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sturm</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Muscat</snm>
                  <fnm>GE</fnm>
               </au>
            </aug>
            <source>Endocrinology</source>
            <pubdate>1994</pubdate>
            <volume>135</volume>
            <fpage>2595</fpage>
            <lpage>607</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1210/en.135.6.2595</pubid>
                  <pubid idtype="pmpid" link="fulltext">7988448</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The DNA binding specificity of engrailed homeodomain</p>
            </title>
            <aug>
               <au>
                  <snm>Draganescu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tullius</snm>
                  <fnm>TD</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>276</volume>
            <fpage>529</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1997.1567</pubid>
                  <pubid idtype="pmpid" link="fulltext">9551094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Crystal structure of an engrailed homeodomain-DNA complex at 2.8 &#197; resolution: a framework for understanding homeodomain-DNA interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Kissinger</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Martin-Blanco</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kornberg</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1990</pubdate>
            <volume>63</volume>
            <fpage>579</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(90)90453-L</pubid>
                  <pubid idtype="pmpid">1977522</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Postmitotic cells fated to become rod photoreceptors can be respecified by CNTF treatment of the retina</p>
            </title>
            <aug>
               <au>
                  <snm>Ezzeddine</snm>
                  <fnm>ZD</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>DeChiara</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yancopoulos</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cepko</snm>
                  <fnm>CL</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>1997</pubdate>
            <volume>124</volume>
            <fpage>1055</fpage>
            <lpage>67</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9056780</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Ciliary neurotrophic factor as a transient negative regulator of rod development in rat retina</p>
            </title>
            <aug>
               <au>
                  <snm>Schulz-Key</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hofmann</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Beisenherz-Huss</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Barbisch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kirsch</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Invest Ophthalmol Vis Sci</source>
            <pubdate>2002</pubdate>
            <volume>43</volume>
            <fpage>3099</fpage>
            <lpage>108</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12202535</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Nuclear factors interacting with an interleukin-6 responsive element of rat alpha 2-macroglobulin gene</p>
            </title>
            <aug>
               <au>
                  <snm>Ito</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tanahashi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Misumi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sakaki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1989</pubdate>
            <volume>17</volume>
            <fpage>9425</fpage>
            <lpage>35</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">335143</pubid>
                  <pubid idtype="pmpid" link="fulltext">2479916</pubid>
                  <pubid idtype="doi">10.1093/nar/17.22.9425</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>CNTF exerts opposite effects on in vitro development of rat and chick photoreceptors</p>
            </title>
            <aug>
               <au>
                  <snm>Kirsch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fuhrmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wiese</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hofmann</snm>
                  <fnm>HD</fnm>
               </au>
            </aug>
            <source>Neuroreport</source>
            <pubdate>1996</pubdate>
            <volume>7</volume>
            <fpage>697</fpage>
            <lpage>700</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1097/00001756-199602290-00004</pubid>
                  <pubid idtype="pmpid">8733724</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>An initiator element mediates autologous downregulation of the human type A gamma-aminobutyric acid receptor beta 1 subunit gene</p>
            </title>
            <aug>
               <au>
                  <snm>Russek</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Bandyopadhyay</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Farb</snm>
                  <fnm>DH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>8600</fpage>
            <lpage>5</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">26994</pubid>
                  <pubid idtype="pmpid" link="fulltext">10900018</pubid>
                  <pubid idtype="doi">10.1073/pnas.97.15.8600</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>The mouse p97 (CDC48) gene. Genomic structure, definition of transcriptional regulatory sequences, gene expression, and characterization of a pseudogene</p>
            </title>
            <aug>
               <au>
                  <snm>Muller</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Ruhrberg</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Stamp</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Warren</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Shima</snm>
                  <fnm>DT</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1999</pubdate>
            <volume>274</volume>
            <fpage>10154</fpage>
            <lpage>62</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.274.15.10154</pubid>
                  <pubid idtype="pmpid" link="fulltext">10187799</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Regulation of the interleukin-7 receptor alpha promoter by the Ets transcription factors PU.1 and GA-binding protein in developing B cells</p>
            </title>
            <aug>
               <au>
                  <snm>DeKoter</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Schweitzer</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Kamath</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Tagoh</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bonifer</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hildeman</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>KJ</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2007</pubdate>
            <volume>282</volume>
            <fpage>14194</fpage>
            <lpage>204</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M700377200</pubid>
                  <pubid idtype="pmpid" link="fulltext">17392277</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Genome-wide analysis of mammalian promoter architecture and evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Katayama</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shimokawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ponjavic</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Semple</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Engstrom</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Forrest</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Alkema</snm>
                  <fnm>WB</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Plessy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kodzius</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ravasi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kasukawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fukuda</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kanamori-Katayama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kitazume</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kawaji</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kai</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Konno</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nakano</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mottagui-Tabar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Arner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chesi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gustincich</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Persichetti</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Grimmond</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Wells</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Orlando</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Wahlestedt</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Harbers</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>Hume</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <volume>38</volume>
            <fpage>626</fpage>
            <lpage>35</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1789</pubid>
                  <pubid idtype="pmpid" link="fulltext">16645617</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Deciphering the contribution of known cis-elements in the mouse cone arrestin gene to its cone-specific expression</p>
            </title>
            <aug>
               <au>
                  <snm>Pickrell</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Craft</snm>
                  <fnm>CM</fnm>
               </au>
            </aug>
            <source>Invest Ophthalmol Vis Sci</source>
            <pubdate>2004</pubdate>
            <volume>45</volume>
            <fpage>3877</fpage>
            <lpage>84</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1167/iovs.04-0663</pubid>
                  <pubid idtype="pmpid" link="fulltext">15505032</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Mapping of transcription start sites of human retina expressed genes</p>
            </title>
            <aug>
               <au>
                  <snm>Roni</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Carpio</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wissinger</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>42</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1802077</pubid>
                  <pubid idtype="pmpid" link="fulltext">17286855</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-8-42</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs</p>
            </title>
            <aug>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yamashita</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Nakai</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sugano</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>328</fpage>
            <lpage>31</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99097</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752328</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.328</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Mouse cone arrestin gene characterization: promoter targets expression to cone photoreceptors</p>
            </title>
            <aug>
               <au>
                  <snm>Zhu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Babu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Murage</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Knox</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Craft</snm>
                  <fnm>CM</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2002</pubdate>
            <volume>524</volume>
            <fpage>116</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(02)03014-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">12135752</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Conserved transcriptional regulation of a cone phototransduction gene in vertebrates</p>
            </title>
            <aug>
               <au>
                  <snm>Viczian</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Verardo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zuber</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Knox</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Farber</snm>
                  <fnm>DB</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2004</pubdate>
            <volume>577</volume>
            <fpage>259</fpage>
            <lpage>64</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.febslet.2004.10.008</pubid>
                  <pubid idtype="pmpid" link="fulltext">15527796</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Functional analysis of the promoters of the human red and green visual pigment genes</p>
            </title>
            <aug>
               <au>
                  <snm>Shaaban</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Deeb</snm>
                  <fnm>SS</fnm>
               </au>
            </aug>
            <source>Invest Ophthalmol Vis Sci</source>
            <pubdate>1998</pubdate>
            <volume>39</volume>
            <fpage>885</fpage>
            <lpage>96</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9579468</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Functional analysis of the rod photoreceptor cGMP phosphodiesterase alpha-subunit gene promoter: Nrl and Crx are required for full transcriptional activity</p>
            </title>
            <aug>
               <au>
                  <snm>Pittler</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mears</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Zack</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Ren</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Swain</snm>
                  <fnm>PK</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>JB</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2004</pubdate>
            <volume>279</volume>
            <fpage>19800</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M401864200</pubid>
                  <pubid idtype="pmpid" link="fulltext">15001570</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Localization of upstream silencer elements involved in the expression of cone transducin alpha-subunit (GNAT2)</p>
            </title>
            <aug>
               <au>
                  <snm>Morris</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Fong</snm>
                  <fnm>WB</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fong</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Invest Ophthalmol Vis Sci</source>
            <pubdate>1997</pubdate>
            <volume>38</volume>
            <fpage>196</fpage>
            <lpage>206</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9008644</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The proximal promoter of the mouse arrestin gene directs gene expression in photoreceptor cells and contains an evolutionarily conserved retinal factor-binding site</p>
            </title>
            <aug>
               <au>
                  <snm>Kikuchi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Raju</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Breitman</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Shinohara</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1993</pubdate>
            <volume>13</volume>
            <fpage>4400</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">360006</pubid>
                  <pubid idtype="pmpid" link="fulltext">8321239</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Immediate upstream sequence of arrestin directs rod-specific expression in Xenopus</p>
            </title>
            <aug>
               <au>
                  <snm>Mani</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Besharse</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Knox</snm>
                  <fnm>BE</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1999</pubdate>
            <volume>274</volume>
            <fpage>15590</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.274.22.15590</pubid>
                  <pubid idtype="pmpid" link="fulltext">10336455</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Retinoic acid regulates the expression of photoreceptor transcription factor NRL</p>
            </title>
            <aug>
               <au>
                  <snm>Khanna</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Akimoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Siffroi-Fernandez</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Hicks</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2006</pubdate>
            <volume>281</volume>
            <fpage>27327</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1592579</pubid>
                  <pubid idtype="pmpid" link="fulltext">16854989</pubid>
                  <pubid idtype="doi">10.1074/jbc.M605500200</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>The rod photoreceptor-specific nuclear receptor Nr2e3 represses transcription of multiple cone-specific genes</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rattner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nathans</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Neurosci</source>
            <pubdate>2005</pubdate>
            <volume>25</volume>
            <fpage>118</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1523/JNEUROSCI.3571-04.2005</pubid>
                  <pubid idtype="pmpid" link="fulltext">15634773</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>The photoreceptor-specific nuclear receptor Nr2e3 interacts with Crx and exerts opposing effects on the transcription of rod versus cone genes</p>
            </title>
            <aug>
               <au>
                  <snm>Peng</snm>
                  <fnm>GH</fnm>
               </au>
               <au>
                  <snm>Ahmad</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Ahmad</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2005</pubdate>
            <volume>14</volume>
            <fpage>747</fpage>
            <lpage>64</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddi070</pubid>
                  <pubid idtype="pmpid" link="fulltext">15689355</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Photoreceptors of Nrl -/- mice coexpress functional S- and M-cone opsins having distinct inactivation mechanisms</p>
            </title>
            <aug>
               <au>
                  <snm>Nikonov</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Daniele</snm>
                  <fnm>LL</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Craft</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Swaroop</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pugh</snm>
                  <fnm>EN</fnm>
                  <suf>Jr</suf>
               </au>
            </aug>
            <source>J Gen Physiol</source>
            <pubdate>2005</pubdate>
            <volume>125</volume>
            <fpage>287</fpage>
            <lpage>304</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1085/jgp.200409208</pubid>
                  <pubid idtype="pmpid" link="fulltext">15738050</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Bioconductor: open software development for computational biology and bioinformatics</p>
            </title>
            <aug>
               <au>
                  <snm>Gentleman</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Carey</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Bates</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dettling</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ellis</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gautier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gentry</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hornik</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hothorn</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Iacus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Leisch</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Maechler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rossini</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Sawitzki</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Smyth</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tierney</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>JY</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>R80</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">545600</pubid>
                  <pubid idtype="pmpid" link="fulltext">15461798</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-10-r80</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>A gene atlas of the mouse and human protein-encoding transcriptomes</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Wiltshire</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Batalov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lapp</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ching</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Block</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Soden</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hayakawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kreiman</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cooke</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Hogenesch</snm>
                  <fnm>JB</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <fpage>6062</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395923</pubid>
                  <pubid idtype="pmpid" link="fulltext">15075390</pubid>
                  <pubid idtype="doi">10.1073/pnas.0400782101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor</p>
            </title>
            <aug>
               <au>
                  <snm>Davis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Meltzer</snm>
                  <fnm>PS</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
         </bibl>
         <bibl id="B55">
            <title>
               <p>BLASTN 2.0 MP-WashU [10-May-2005] [linux26-i786-ILP32F64 2005-05-10T21:12:31]. (1996&#8211;2004)</p>
            </title>
            <aug>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <note>(Abstract)</note>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Repbase update: a database and an electronic journal of repetitive elements</p>
            </title>
            <aug>
               <au>
                  <snm>Jurka</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>418</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02093-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">10973072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>R: A Language for Data Analysis and Graphics</p>
            </title>
            <aug>
               <au>
                  <snm>Ihaka</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gentleman</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Comput Graph Stat</source>
            <pubdate>1996</pubdate>
            <fpage>299</fpage>
            <lpage>314</lpage>
            <note>(Abstract)</note>
            <xrefbib>
               <pubid idtype="doi">10.2307/1390807</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>A model of the statistical power of comparative genome sequence analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>PLoS Biol</source>
            <pubdate>2005</pubdate>
            <volume>3</volume>
            <fpage>e10</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539325</pubid>
                  <pubid idtype="pmpid" link="fulltext">15660152</pubid>
                  <pubid idtype="doi">10.1371/journal.pbio.0030010</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>The human genome browser at UCSC</p>
            </title>
            <aug>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Pringle</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Zahler</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>996</fpage>
            <lpage>1006</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186604</pubid>
                  <pubid idtype="pmpid" link="fulltext">12045153</pubid>
                  <pubid idtype="doi">10.1101/gr.229102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>TESS: Transcription Element Search System</p>
            </title>
            <url>http://www.cbil.upenn.edu/cgi-bin/tess/tess</url>
         </bibl>
         <bibl id="B61">
            <title>
               <p>RepeatMasker Open-3.0. (1996&#8211;2004)</p>
            </title>
            <aug>
               <au>
                  <snm>Smit</snm>
                  <fnm>AFA</fnm>
               </au>
               <au>
                  <snm>Hubley</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            <url>http://www.repeatmasker.org/</url>
         </bibl>
      </refgrp>
   </bm>
</art>
