<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
<ui>1741-7007-8-49</ui>
<ji>1741-7007</ji>
<fm>
<dochead>Software</dochead>
<bibl>
<title><p>MochiView: versatile software for genome browsing and DNA motif analysis</p></title>
<aug><au ca="yes" id="A1"><snm>Homann</snm><mi>R</mi><fnm>Oliver</fnm><insr iid="I1"/><email>oliver.homann@ucsf.edu</email></au>
<au id="A2"><snm>Johnson</snm><mi>D</mi><fnm>Alexander</fnm><insr iid="I1"/><insr iid="I2"/><email>ajohnson@cgl.ucsf.edu</email></au>
</aug>
<insg>
<ins id="I1"><p>Department of Microbiology and Immunology, University of California San Francisco, San Francisco, California, USA</p></ins>
<ins id="I2"><p>Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, USA</p></ins>
</insg>
<source>BMC Biology</source>
<issn>1741-7007</issn>
<pubdate>2010</pubdate>
<volume>8</volume>
<issue>1</issue>
<fpage>49</fpage>
<url>http://www.biomedcentral.com/1741-7007/8/49</url>
<xrefbib><pubidlist><pubid idtype="pmpid">20409324</pubid><pubid idtype="doi">10.1186/1741-7007-8-49</pubid></pubidlist></xrefbib></bibl>
<history><rec><date><day>2</day><month>3</month><year>2010</year></date></rec><acc><date><day>21</day><month>4</month><year>2010</year></date></acc><pub><date><day>21</day><month>4</month><year>2010</year></date></pub></history><cpyrt><year>2010</year><collab>Homann and Johnson; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec><st><p>Abstract</p></st>
<sec><st><p>Background</p></st>
<p>As high-throughput technologies rapidly generate genome-scale data, it becomes increasingly important to visually integrate these data so that specific hypotheses can be formulated and tested.</p>
</sec>
<sec><st><p>Results</p></st>
<p>We present MochiView, a platform-independent Java application that integrates browsing of genomic sequences, features, and data with DNA motif visualization and analysis in a visually appealing and user-friendly interface.</p>
</sec>
<sec><st><p>Conclusions</p></st>
<p>While highly versatile, the software is particularly useful for organizing, exploring, and analyzing large genomic data sets, such as those from deep RNA sequencing, chromatin immunoprecipitation experiments (ChIP-Seq and ChIP-Chip), and transcriptional profiling. MochiView provides an extensive suite of utilities to identify and to explore connections between these data sets and short sequence motifs present in DNA or RNA.</p>
</sec>
</sec>
</abs>
</fm>
<meta>
<classifications>
<classification id="endnote" subtype="user_supplied_xml" type="bmc"/>
</classifications>
</meta>
<bdy>
<sec><st><p>Background</p></st>
<p>We describe a versatile tool for visualizing and exploring large genomic data sets, particularly those generated by <b><it>ch</it></b>romatin <b><it>i</it></b>mmuno<b><it>p</it></b>recipitation (ChIP). This technique is often used to identify regions of a genome that are bound by a specific transcription factor under a given set of conditions. For those transcription factors that recognize DNA directly, it is often possible, from ChIP data alone, to deduce the range of DNA sequences (the motif) that a given transcription factor recognizes. ChIP relies on cross-linking transcription factors to DNA in living cells, shearing and isolating the DNA, and recovering the DNA cross-linked to a specific transcription factor. The recovered DNA is then analyzed using either tiling microarrays (ChIP-Chip) or sequencing (ChIP-Seq). Both approaches generate a nearly continuous profile of binding enrichment across the genome, with high-density tiling for ChIP-Chip currently being feasible only for smaller genomes, such as those from bacteria or fungi. Several existing genome browsers aid in the visualization and analysis of such data, but few contain tools to easily integrate motif data into the analysis. MochiView (<b><it>Mo</it></b>tif and <b><it>ChI</it></b>P <b><it>View</it></b>er) is designed to bridge this gap, providing a highly flexible and intuitive interface that allows one to easily import, visualize, explore, and analyze large sets of data, such as those generated from ChIP experiments.</p>
</sec>
<sec><st><p>Implementation</p></st>
<p>MochiView is written in Java, and can be used with any operating system that supports Java version 1.6 or higher. To facilitate smooth genome browsing (by caching data) and the import of large files, MochiView requires hardware with a minimum of 1 GB of memory. Many genome browsers introduce an extra layer of complexity by requiring the user to install an external database or to store data on a remotely hosted server. MochiView circumvents this problem by transparently incorporating the Java DB database within the software (specific features of the MochiView software design are described in the MochiView manual). The database architecture is designed to scale well even with very large quantities of data; database size is primarily constrained by available hard drive space. In practice, databases range from a few megabytes to many gigabytes, depending on genome size and the quantity of data. MochiView can maintain multiple databases, and contains a database import/export utility to facilitate sharing of compressed databases (and plot configurations) between users. Any database can be populated by the user with one or more genomes by importing the genome sequence as one or more FASTA-format files. Additional genome coordinate-based data can then be uploaded in the commonly used GFF, BED, or WIG formats or using MochiView's own custom file formats. Tips for setting up a database are provided on the MochiView website.</p>
</sec>
<sec><st><p>Results and discussion</p></st>
<p>MochiView serves as both a motif analysis platform and a feature-rich genome browser, and integrates these features to allow the visualization of motifs across a genome plot and the refinement of motif analyses using data imported by the user into the MochiView database (for example, genome alignments, ChIP data, or expression data). While many of the tools provided in MochiView were designed with ChIP-Seq and ChIP-Chip data visualization in mind, the open and flexible data format allows the import and visualization of any data that have a genomic context (for example, high-throughput RNA sequencing data). MochiView is user-friendly, and is accessible to scientists with no programming knowledge. MochiView's many features are extensively documented with a tutorial walkthrough, a detailed manual, and extensive popup text support within the software. While many of MochiView's individual features are available in existing software, no existing software package, to our knowledge, integrates such a large assortment of motif and data analysis utilities together with a highly configurable genome browser in a single desktop application. The most similar existing package, CisGenome <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, provides a greater emphasis on processing of raw ChIP-Chip and ChIP-Seq data and peak-finding, but is limited with respect to the scope and ease of use of the motif and data visualization and analysis options.</p>
<sec><st><p>Visualizing data across the genome</p></st>
<p>MochiView uses an integrated local database to manage all of the data imported by the user, such as genome sequences and alignments, gene locations, microarray probe locations, expression data, ChIP data, and motif libraries. As shown in Figure <figr fid="F1">1</figr>, MochiView allows many types of data to be displayed along the genome (the x-axis of the plot) in easily customized plots. Open plot tabs persist when the software is closed and reopened, and the display settings can be saved for later use. While the core design of MochiView's plots was inspired by the UCSC Genome Browser project <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, MochiView places an added emphasis on aesthetics, data browsing, and plot interactivity, and provides a rich interface for configuring plot layout.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>A sample MochiView screenshot, demonstrating many of the available display formats</p></caption><text>
   <p><b>A sample MochiView screenshot, demonstrating many of the available display formats</b>. A 20 kb span of the <it>Candida albicans </it>genome is displayed. <b>(a) </b>Two line graphs utilizing the same y-axis and representing experimental (red) and control (blue) ChIP-Chip enrichment data for the Zap1 transcription factor <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. <b>(b) </b>A gene track, including color-coded data representing log<sub>2</sub>-transformed expression values from a microarray experiment. This experiment compares a wild-type strain to a Zap1 deletion strain. Note that red indicates the highest expression change; <it>ZRT2 </it>is likely to be a direct target of Zap1. Gene tracks can also display genes containing multiple isoforms and coding- and non-coding exons (not shown). <b>(c) </b>A bar graph track, demonstrating an alternate means of displaying the experimental ChIP-Chip data represented by the red line graph in (a). <b>(d) </b>A region marker track, depicting a ChIP binding region and -log<sub>10</sub>-transformed <it>P</it>-value. <b>(e) </b>An RNA sequencing track, depicting mock data mapped to the plus strand (blue) and minus strand (orange). <b>(f) </b>A motif track, depicting the motif match scores of instances of four different DNA motifs, each assigned to a different color. <b>(g) </b>A multiple genome alignment track (several species of yeast), shaded to represent the level of conservation. <b>(h) </b>A line graph track, representing the GC-content of the DNA. <b>(i) </b>The data browser, which displays the contents of the database in an interactive table. Clicking on a row in the table centers the plot on the corresponding region. <b>(j) </b>Additional features become evident as the plot is zoomed in. Shown here are close-ups of the motif and alignment tracks (f and g, respectively).</p>
</text><graphic file="1741-7007-8-49-1"/></fig>
<p>Landmarks across a genome (such as the locations of microarray probes) are displayed by <it>region markers </it>(Figure <figr fid="F1">1D</figr>). Overlapping markers can be displayed as <it>stack tracks </it>with one region marker positioned above the other. Numerical data, such as ChIP-Chip enrichment levels, can be displayed in MochiView using line or bar plots. These data sets can be plotted on a common y-axis (Figure <figr fid="F1">1A</figr>) or each set can be plotted on its own y-axis (Figure <figr fid="F1">1C, E, H</figr>). Alternatively, numerical data can be displayed as text on a region marker (Figure <figr fid="F1">1D</figr>), and the marker can be colored according to the value (a useful means, for example, of visualizing expression data on genes; see Figure <figr fid="F1">1B</figr>). Sequences matching DNA motifs are identified using a user-defined scoring threshold and are displayed in additional tracks (Figure <figr fid="F1">1F</figr>). Multiple genome alignments, of genomes either from closely related species or from individuals of the same species, can also be displayed (Figure <figr fid="F1">1G</figr>), providing the means to quickly visualize whether a motif match is conserved across closely related genomes (phylogenetic footprinting; see Cliften <it>et al. </it><abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and Kellis <it>et al. </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp>), or whether it varies in interesting ways.</p>
</sec>
<sec><st><p>Tools for browsing and interacting with data in a plot</p></st>
<p>MochiView provides tools for browsing the genome by sequence or by data set. The sequence browser can be used to search and highlight specific DNA sequences, degenerate DNA sequences (using symbols established by the International Union of Pure and Applied Chemistry), and direct or inverted repeats, with or without gaps. The data browser (Figure <figr fid="F1">1I</figr>) allows the user to sort and search any data set and rapidly jump from location to location across the genome using hotkeys. For example, this feature allows the user to quickly browse among regions of ChIP enrichment above a user-specified threshold value to rapidly visualize the most significant binding regions. These can then be searched for matches to a particular DNA motif.</p>
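<p>The degenerate-sequence search described above can be illustrated with a short Python sketch. This is not MochiView's Java implementation; the function names (find_degenerate, reverse_complement) are hypothetical, positions are assumed to be 0-based, and the search is assumed to cover both strands. Each IUPAC symbol is expanded into a regular-expression character class, and a lookahead is used so that overlapping matches are reported.</p>
<p><![CDATA[
```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

# Complement of every IUPAC code (for example R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(pattern):
    """Reverse-complement a DNA sequence or IUPAC pattern."""
    return pattern.upper().translate(COMPLEMENT)[::-1]


def find_degenerate(pattern, sequence):
    """Return sorted (start, strand) pairs, 0-based, for matches on either strand.

    Minus-strand hits are reported at the plus-strand coordinate where the
    reverse-complemented pattern begins.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, pat in (("+", pattern.upper()), ("-", reverse_complement(pattern))):
        # Wrapping the pattern in a lookahead reports overlapping matches too.
        regex = re.compile("(?=" + "".join(IUPAC[c] for c in pat) + ")")
        for match in regex.finditer(sequence):
            hits.add((match.start(), strand))
    return sorted(hits)
```
]]></p>
<p>For example, find_degenerate("GATAR", "TTGATAAGCTTATCAA") reports one plus-strand match (GATAA) and one minus-strand match (the reverse complement YTATC).</p>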
<p>MochiView plots are interactive and allow smooth panning along chromosomes and smooth zooming in and out. As one continues to zoom in, the DNA sequence itself eventually becomes visible. Virtually every element in a plot provides descriptive popup text, and annotation can be added to locations within tracks. In addition, clicking on any item in a plot copies the sequence to the clipboard, a useful tool for quickly capturing sequences for use in another application. To aid the user in filtering large sets of data, an <it>Edit Mode </it>track can be created and used to toggle a region marker between three states (true/false/undecided). For example, this feature is useful for flagging and ignoring likely false positives in a set of ChIP binding data.</p>
<p>MochiView's motif and multiple genome alignment tracks (Figures <figr fid="F1">1F</figr> and <figr fid="F1">1G</figr>, respectively) are also interactive. Motif tracks show either the match scores of motif instances (distant zoom) or the motif logo itself (close zoom; top of Figure <figr fid="F1">1J</figr>). Double-clicking the motif instance opens a window juxtaposing the motif logo with the actual genome sequence. Multiple genome alignments are displayed as either an overview shaded by conservation level (distant zoom) or as the specific aligned sequences, including inserts and gaps (close zoom; bottom of Figure <figr fid="F1">1J</figr>). Clicking on the alignments, or on the carets representing inserts in the alignment, copies the regional alignment to the clipboard.</p>
</sec>
<sec><st><p>ChIP analysis highlights many of MochiView's utilities</p></st>
<p>MochiView can serve as a central hub for data storage and visualization, from which data can easily be imported and exported for manipulation with other applications. In addition, MochiView contains a number of specific tools designed to analyze genomic and motif data. While a description of all of the utilities provided in MochiView is beyond the scope of this article, we discuss a few of them in the context of analyzing ChIP data for proteins that recognize specific DNA sequences. We focus on two stages of analysis: (1) visualization of the primary ChIP data and assessment/refinement of the <it>binding region calls</it>, and (2) identification and characterization of regulatory motifs found within the refined binding regions. We define a <it>binding region </it>as a set of genomic coordinates that identify the boundaries of a region of ChIP DNA enrichment, typically associated with some measure of confidence, such as a <it>P</it>-value. Obviously, proper control experiments are crucial to evaluate the biological relevance of a binding region, a topic discussed in more detail below.</p>
</sec>
<sec><st><p>Visualizing and refining ChIP data in MochiView</p></st>
<p>The first step of ChIP data analysis in MochiView is typically the import of raw data (ChIP-Chip enrichment or ChIP-Seq reads) as well as the binding region calls (<it>peak calls</it>). MochiView does not supply a comprehensive binding region assignment algorithm (a more limited peak extraction/refinement utility is provided), as approaches to calling binding regions are constantly being refined; moreover, the approaches for calling peaks vary with the platform used to analyze the precipitated DNA. For example, Agilent supplies peak-calling software optimized for its array design. It is, however, straightforward to import peak-calling results from existing software using MochiView's import utilities, which support several different file formats. For small genomes, it is also possible to hand-curate ChIP data in MochiView, bypassing the peak-calling programs entirely.</p>
<p>Once the relevant raw data (ChIP-Chip enrichment or ChIP-Seq reads) and binding region calls are imported, MochiView can be used to visualize them in the context of other genomic information. For example, ChIP data can be viewed in a plot in conjunction with control ChIP experiments, gene expression data, sequence GC-enrichment, histone modifications, and motifs. The <it>snapshot </it>utility allows the user to create individual images (or a single PDF) of the plot centered at every binding region in the data set. This feature is particularly useful for records in laboratory notebooks or figures for manuscripts.</p>
<p>For those data sets with a manageable number of binding regions, it is possible to visually inspect each binding region and eliminate clear false positives (and re-evaluate possible false negatives) that result from the limitations of binding site detection algorithms. Since MochiView can display multiple data sets on the same y-axis, the user can easily overlay multiple replicates of experimental ChIP data as well as control data sets (for example, ChIP in a deletion or RNAi-depleted strain or in a strain lacking the epitope tag targeted for immunoprecipitation). These data can then be quickly surveyed using the data browser and an <it>Edit Mode </it>track, and binding regions considered spurious (for example, those also observed in control experiments) or unreliable (for example, those observed in only one experimental replicate) can be flagged and then filtered using one of MochiView's data refinement utilities.</p>
<p>MochiView provides numerous additional utilities for the analysis and manipulation of sets of locations. Set operation utilities can take the union, intersection, or subtraction of two location sets, thus providing a simple mechanism for manipulating positional data. For example, the user can merge the binding region calls of experimental replicates, take the intersection of binding regions with promoter regions, take the intersection of sets of ChIP experiments performed with different transcription factors, or easily eliminate binding region calls that overlap with regions found in a control experiment. Another utility assigns binding regions to one or more genes (based on user-defined criteria), and another surveys whether these genes are enriched for Gene Ontology (GO) terms (using an approach based on the software GO TermFinder <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>). Thus, within minutes of importing ChIP data into MochiView, a user can obtain an overview of the cellular processes and genes predicted to be regulated by the transcription factor of interest. An important goal of many ChIP-Chip and ChIP-Seq experiments is the identification of the DNA motif recognized by the transcription factor of interest, and, as described next, MochiView provides numerous tools for the discovery, validation, and comparison of motifs.</p>
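<p>The set operations on location sets can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than MochiView's own code: locations are assumed to be (chromosome, start, end) tuples with half-open coordinates, the operations are applied at the coordinate level (regions are clipped rather than kept or discarded whole), and the function names union, intersection, and subtract are hypothetical.</p>
<p><![CDATA[
```python
from collections import defaultdict


def _by_chrom(locations):
    """Group (chrom, start, end) tuples by chromosome and merge overlapping spans."""
    grouped = defaultdict(list)
    for chrom, start, end in locations:
        grouped[chrom].append((start, end))
    merged = {}
    for chrom, spans in grouped.items():
        spans.sort()
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:  # overlapping or directly adjacent
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(span) for span in out]
    return merged


def union(a, b):
    """Merge two location sets into non-overlapping regions."""
    merged = _by_chrom(list(a) + list(b))
    return [(c, s, e) for c in sorted(merged) for s, e in merged[c]]


def intersection(a, b):
    """Return the coordinates covered by both location sets."""
    ma, mb = _by_chrom(a), _by_chrom(b)
    result = []
    for chrom in sorted(set(ma) & set(mb)):
        xs, ys = ma[chrom], mb[chrom]
        i = j = 0
        while i < len(xs) and j < len(ys):
            start = max(xs[i][0], ys[j][0])
            end = min(xs[i][1], ys[j][1])
            if start < end:
                result.append((chrom, start, end))
            # Advance whichever span ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return result


def subtract(a, b):
    """Return the parts of location set a that are not covered by b."""
    ma, mb = _by_chrom(a), _by_chrom(b)
    result = []
    for chrom in sorted(ma):
        for start, end in ma[chrom]:
            cursor = start
            for b_start, b_end in mb.get(chrom, []):
                if b_end <= cursor or b_start >= end:
                    continue  # no overlap with the remaining part of this span
                if b_start > cursor:
                    result.append((chrom, cursor, b_start))
                cursor = max(cursor, b_end)
            if cursor < end:
                result.append((chrom, cursor, end))
    return result
```
]]></p>
<p>With these helpers, merging two replicates corresponds to union(rep1, rep2), and removing regions also seen in a control corresponds to subtract(merged, control).</p>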
</sec>
<sec><st><p>Identifying and analyzing motifs in MochiView</p></st>
<p>We use the term <it>motif </it>to mean a set of short DNA sequences represented by a position-specific weight matrix, and define a <it>motif match </it>as a particular DNA sequence in a genome that is statistically similar to a motif. Several options are provided for scoring a DNA sequence for matches to a motif, including logarithm of odds (LOD) scores (reviewed in <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>), affinity scores (for affinity motifs generated by MatrixREDUCE <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>), and <it>P</it>-values derived from LOD scores (using the compound importance sampling algorithm of Barash <it>et al. </it><abbrgrp><abbr bid="B8">8</abbr></abbrgrp>). In addition to finding particular matches to a motif within a sequence, MochiView can also generate a cumulative motif enrichment score for a full sequence using either a simple cumulative LOD score or a hidden Markov model approach (w-score, as described by Sinha <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>). Figure <figr fid="F2">2</figr> provides an overview of the many utilities provided in MochiView for the visualization, management, and analysis of motifs. (These tools are not specifically tied to ChIP-Chip and ChIP-Seq analysis; they can be used in any context.) Motifs in MochiView are visualized as logos, using a format based on the sequence logo design originally described by Schneider and Stephens <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. The MochiView database provides a convenient means to maintain and annotate a library of motifs (Figure <figr fid="F2">2A</figr>), and these motifs can easily be exported as frequency matrices or logos (Figure <figr fid="F2">2B</figr>). Several motif libraries, derived from a broad range of organisms including yeast <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>, nematode <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, human <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>, and mouse <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>, are provided at the MochiView website in a format that is simple to import into MochiView. This collection includes one of the largest curated motif libraries, over 1,300 motifs, provided courtesy of the JASPAR database <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Additional motifs devised by the user are also easy to import into MochiView.</p>
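<p>As a rough illustration of LOD scoring against a position-specific weight matrix, the following Python sketch converts base counts into log2-odds values and scans a sequence for windows that reach a user-defined threshold. It is a generic textbook formulation, not MochiView's implementation: the uniform background, the pseudocount scheme, the plus-strand-only scan, and the function names lod_matrix and scan are all assumptions made for the example.</p>
<p><![CDATA[
```python
import math

BASES = "ACGT"


def lod_matrix(counts, background=None, pseudocount=1.0):
    """Convert per-position base counts into log2-odds scores.

    counts is a list of dicts (one per motif position) mapping base to count.
    The pseudocount is distributed according to the background frequencies.
    """
    background = background or {base: 0.25 for base in BASES}
    matrix = []
    for column in counts:
        total = sum(column[base] for base in BASES) + pseudocount
        matrix.append({
            base: math.log2(
                ((column[base] + pseudocount * background[base]) / total)
                / background[base]
            )
            for base in BASES
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, score) for each plus-strand window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits
```
]]></p>
<p>A full implementation would also score the reverse strand and could use a background model estimated from the genome rather than uniform base frequencies.</p>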
<fig id="F2"><title><p>Figure 2</p></title><caption><p>An overview of MochiView's regulatory motif analysis and management tools</p></caption><text>
   <p><b>An overview of MochiView's regulatory motif analysis and management tools</b>. <b>(a) </b>MochiView provides a simple interface for browsing and annotating a motif library. <b>(b) </b>MochiView provides numerous utilities for importing and exporting motif frequency matrices and logos, including support for motifs based on degenerate DNA sequences, frequency matrices, or affinity matrices (as produced by the program MatrixREDUCE <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>). <b>(c) </b>MochiView contains a motif detection utility that can identify <it>de novo </it>motifs enriched in user-defined regions. <b>(d) </b>A motif comparison tool identifies similarities between newly discovered motifs and those in the motif library. <b>(e) </b>Two utilities are provided for analyzing motif enrichment in sets of user-defined regions. <b>(f) </b>Utilities are provided for detecting non-random distribution of motifs relative to either a set of user-defined locations (for example, start codons or peaks of ChIP enrichment) or strong instances of another motif (for example, co-occurring motifs that are typically separated by a 25 bp gap). <b>(g) </b>Several utilities are provided for scoring motifs against user-defined regions. For example, it is relatively simple to output a file containing the top motif score upstream of each gene for every motif in the library. <b>(h) </b>Enrichment for Gene Ontology terms can be determined for genes with upstream sequence that contains a strong instance of a motif.</p>
</text><graphic file="1741-7007-8-49-2"/></fig>
<p>MochiView provides a motif detection utility (Figure <figr fid="F2">2C</figr>) that can identify motifs <it>de novo </it>using a Gibbs sampling technique (based on algorithms described by Thijs <it>et al. </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp> and the BioJava <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> online cookbook; implementation details are provided in the manual). The user can limit a search to specific locations (for example, binding region calls from a ChIP experiment) or search the upstream regions of a list of specific genes. It is also possible to specify that a motif occurrence must be conserved across closely related genomes. The features of MochiView also allow the user to rapidly conduct motif searches based on more complex queries. For example, the user could chain together utilities to search for motifs in the portions of binding regions that (1) overlap with intergenic regions, (2) are within 200 bp of a peak of ChIP enrichment, (3) do not overlap with areas of enrichment in the control experiment, and (4) neighbor a gene that changes expression when the transcription factor of interest is deleted (or reduced in expression by RNAi) or overexpressed. As an alternative to the built-in motif detection utility, the user can also export a set of sequences of interest (for example, those that lie within 200 bp of a peak of ChIP enrichment), apply a different motif-finding algorithm, and import the results back into MochiView. MochiView supports multiple motif file formats, including the output of the commonly used motif detection applications MEME <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and Bioprospector <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>.</p>
<p>Often, the first step in the analysis of a newly discovered motif is a determination of whether the motif resembles any known motifs. Motif libraries, such as those provided at the MochiView website, can be compared against newly discovered motifs using the motif comparison utility (Figure <figr fid="F2">2D</figr>), which generates a similarity metric based on the algorithm used by the software TomTom <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. This utility allows rapid determination of whether a discovered motif is novel, previously identified, or closely related to a motif of a different species.</p>
<p>Another common query in motif analysis is the extent to which a motif is enriched in the DNA precipitated in a given ChIP experiment (or set of experiments). In other words, how well can the motif predict the ChIP data? The motif enrichment utilities (Figure <figr fid="F2">2E</figr>) allow rapid assessment of motif enrichment at incremental score cutoffs for sets of locations such as binding regions or intergenic regions. To assess their significance, the levels of enrichment can be compared to those of a set of control locations (for example, comparison of upstream regions that include ChIP peaks versus those that do not). This analysis can also be conducted on every motif in the library, allowing the user to identify all known motifs that are enriched in the locations of interest.</p>
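<p>The idea of assessing enrichment at incremental score cutoffs can be sketched as follows. The statistic shown (the fraction of regions whose best motif score reaches each cutoff, compared between target and control sets) is an assumption chosen for illustration; the exact measure computed by MochiView is described in its manual, and the function name enrichment_by_cutoff is hypothetical.</p>
<p><![CDATA[
```python
def enrichment_by_cutoff(target_scores, control_scores, cutoffs):
    """Compare target and control regions at a series of motif score cutoffs.

    Each input list holds the best motif score found in one region. For every
    cutoff, the function returns (cutoff, target fraction, control fraction,
    fold enrichment), where a fraction is the share of regions whose best
    score is at or above the cutoff.
    """
    rows = []
    for cutoff in cutoffs:
        target = sum(s >= cutoff for s in target_scores) / len(target_scores)
        control = sum(s >= cutoff for s in control_scores) / len(control_scores)
        if control > 0:
            fold = target / control
        else:
            # No control region reaches the cutoff: the ratio is undefined
            # when the target fraction is also zero, otherwise unbounded.
            fold = float("inf") if target > 0 else float("nan")
        rows.append((cutoff, target, control, fold))
    return rows
```
]]></p>
<p>Here the target scores might come from upstream regions that contain ChIP peaks and the control scores from upstream regions that do not.</p>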
<p>Motif analysis often identifies several candidate DNA motifs that may be recognized by the transcription factor of interest. In the simplest cases, where the transcription factor directly recognizes a motif, the motif is predicted to lie under the center of the peak of ChIP enrichment. In other cases, a motif may be significantly enriched in a set of binding regions, not because it is recognized by the transcription factor of interest, but rather because it is bound by a different protein that regulates a similar set of genes. These alternatives can be tested using MochiView's motif distribution utilities (Figure <figr fid="F2">2F</figr>), which test for non-random positional distribution using a statistical test for non-uniform distribution described by Casimiro <it>et al. </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. These utilities can also identify non-random spacing between genomic matches to DNA motifs (for example, two DNA motifs, either the same or different, with matches that are typically separated by a 30 to 50 bp gap).</p>
<p>Once a compelling motif has been identified from a set of ChIP data, the motif can be explored using the MochiView motif scoring utilities (Figure <figr fid="F2">2G</figr>) and the plot browser to identify instances of a motif that occur in intergenic regions but not within the binding regions called by the ChIP-analysis algorithm. Such analysis can reveal whether the motif is necessary and sufficient to describe the binding of the transcription factor of interest. For example, such analysis may identify a set of genes that is likely to be controlled by the transcription factor but is not bound by the protein under the conditions or in the cell types used for the ChIP analysis.</p>
<p>We described above how MochiView's GO term enrichment utility could connect ChIP data to specific cellular processes. This same strategy can be used to search the upstream regions of genes for strong matches to a motif and associate that motif with one or more GO terms (Figure <figr fid="F2">2H</figr>). This approach can provide insight into the biological role of the transcription factor and further validate the motif's biological relevance.</p>
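<p>GO term enrichment of this kind is commonly assessed with a hypergeometric test, which asks how likely it is to see at least the observed number of annotated genes in a gene list drawn at random from the genome. The Python sketch below computes that tail probability; it assumes this general approach for illustration, omits multiple-testing correction, and uses a hypothetical function name (hypergeom_pvalue).</p>
<p><![CDATA[
```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """Probability of observing at least `hits` annotated genes.

    population: total number of genes considered
    annotated:  genes in the population carrying the GO term
    sample:     genes in the list of interest (for example, genes with a
                strong upstream motif match)
    hits:       genes in the sample carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```
]]></p>
<p>In practice, the resulting <it>P</it>-values would need to be corrected for the number of GO terms tested.</p>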
</sec>
</sec>
<sec><st><p>Conclusions</p></st>
<p>In summary, MochiView was developed to solve problems we encountered in our basic research efforts, allowing us to integrate different types of genomic data and analyses in a single format where biological correlations and insights <it>popped </it>out from the screen. We believe the software will be useful to members of many other basic research laboratories who have encountered similar challenges when interpreting and analyzing data on a genomic scale.</p>
</sec>
<sec><st><p>Availability and requirements</p></st>
<p><b>Project name</b>: MochiView.</p>
<p><b>Project home page</b>: <url>http://johnsonlab.ucsf.edu</url>.</p>
<p><b>Operating system(s)</b>: Platform independent.</p>
<p><b>Programming language</b>: Java.</p>
<p><b>Other requirements</b>: Java 1.6 or higher, minimum of 1 GB of memory, 1024 &#215; 768 or higher screen resolution.</p>
<p><b>License</b>: MochiView is available in source and executable forms, without fee, for academic, non-profit and commercial users.</p>
<p><b>Any restrictions to use by non-academics</b>: None beyond the general restriction against redistribution in the license.</p>
</sec>
<sec><st><p>Abbreviations</p></st>
<p>BED: <b><it>B</it></b>rowser <b><it>E</it></b>xtensible <b><it>D</it></b>ata; ChIP: <b><it>ch</it></b>romatin <b><it>i</it></b>mmuno<b><it>p</it></b>recipitation; ChIP-Seq: ChIP analyzed using DNA <b><it>seq</it></b>uencing; ChIP-Chip: ChIP analyzed using tiling microarrays; GFF: <b><it>G</it></b>eneral <b><it>F</it></b>eature <b><it>F</it></b>ormat; GO: Gene Ontology; WIG: <b><it>W</it></b>iggle format.</p>
</sec>
<sec><st><p>Authors' contributions</p></st>
<p>ORH designed and wrote the software, with support from ADJ. Both authors contributed to the writing of the manuscript. All authors read and approved the final manuscript.</p>
</sec>
</bdy>
<bm>
<ack><sec><st><p>Acknowledgements</p></st>
<p>We thank B. Tuch for developing the initial concept and design for displaying ChIP and motif data in Java, and David Gilbert for the JFreeChart Java library utilized by MochiView's plots. We are grateful to the creators of the UCSC genome browser for database design inspiration. We also thank L. Booth, C. Cain, S. Cooper, P. Fordyce, S. French, A. Hernday, Q. Mitrovich, C. Nobile, and M. Voorhies for software testing and helpful suggestions. This work was supported by NIH grant 5R01GM37049-22 to ADJ.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>An integrated software system for analyzing ChIP-chip and ChIP-seq data</p></title><aug><au><snm>Ji</snm><fnm>H</fnm></au><au><snm>Jiang</snm><fnm>H</fnm></au><au><snm>Ma</snm><fnm>W</fnm></au><au><snm>Johnson</snm><fnm>DS</fnm></au><au><snm>Myers</snm><fnm>RM</fnm></au><au><snm>Wong</snm><fnm>WH</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2008</pubdate><volume>26</volume><fpage>1293</fpage><lpage>1300</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt.1505</pubid><pubid idtype="pmcid">2596672</pubid><pubid idtype="pmpid">18978777</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>The human genome browser at UCSC</p></title><aug><au><snm>Kent</snm><fnm>WJ</fnm></au><au><snm>Sugnet</snm><fnm>CW</fnm></au><au><snm>Furey</snm><fnm>TS</fnm></au><au><snm>Roskin</snm><fnm>KM</fnm></au><au><snm>Pringle</snm><fnm>TH</fnm></au><au><snm>Zahler</snm><fnm>AM</fnm></au><au><snm>Haussler</snm><fnm>D</fnm></au></aug><source>Genome Res</source><pubdate>2002</pubdate><volume>12</volume><fpage>996</fpage><lpage>1006</lpage><xrefbib><pubidlist><pubid idtype="pmcid">186604</pubid><pubid idtype="pmpid">12045153</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Finding functional features in Saccharomyces genomes by phylogenetic footprinting</p></title><aug><au><snm>Cliften</snm><fnm>P</fnm></au><au><snm>Sudarsanam</snm><fnm>P</fnm></au><au><snm>Desikan</snm><fnm>A</fnm></au><au><snm>Fulton</snm><fnm>L</fnm></au><au><snm>Fulton</snm><fnm>B</fnm></au><au><snm>Majors</snm><fnm>J</fnm></au><au><snm>Waterston</snm><fnm>R</fnm></au><au><snm>Cohen</snm><fnm>BA</fnm></au><au><snm>Johnston</snm><fnm>M</fnm></au></aug><source>Science</source><pubdate>2003</pubdate><volume>301</volume><fpage>71</fpage><lpage>76</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1084337</pubid><pubid idtype="pmpid" link="fulltext">12775844</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Sequencing and comparison of yeast 
species to identify genes and regulatory elements</p></title><aug><au><snm>Kellis</snm><fnm>M</fnm></au><au><snm>Patterson</snm><fnm>N</fnm></au><au><snm>Endrizzi</snm><fnm>M</fnm></au><au><snm>Birren</snm><fnm>B</fnm></au><au><snm>Lander</snm><fnm>ES</fnm></au></aug><source>Nature</source><pubdate>2003</pubdate><volume>423</volume><fpage>241</fpage><lpage>254</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature01644</pubid><pubid idtype="pmpid" link="fulltext">12748633</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>GO:TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes</p></title><aug><au><snm>Boyle</snm><fnm>EI</fnm></au><au><snm>Weng</snm><fnm>S</fnm></au><au><snm>Gollub</snm><fnm>J</fnm></au><au><snm>Jin</snm><fnm>H</fnm></au><au><snm>Botstein</snm><fnm>D</fnm></au><au><snm>Cherry</snm><fnm>JM</fnm></au><au><snm>Sherlock</snm><fnm>G</fnm></au></aug><source>Bioinformatics</source><pubdate>2004</pubdate><volume>20</volume><fpage>3710</fpage><lpage>3715</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/bth456</pubid><pubid idtype="pmpid" link="fulltext">15297299</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>What are DNA sequence motifs?</p></title><aug><au><snm>D&apos;Haeseleer</snm><fnm>P</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2006</pubdate><volume>24</volume><fpage>423</fpage><lpage>425</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt0406-423</pubid><pubid idtype="pmpid" link="fulltext">16601727</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Statistical mechanical modeling of genome-wide transcription factor occupancy data by 
MatrixREDUCE</p></title><aug><au><snm>Foat</snm><fnm>BC</fnm></au><au><snm>Morozov</snm><fnm>AV</fnm></au><au><snm>Bussemaker</snm><fnm>HJ</fnm></au></aug><source>Bioinformatics</source><pubdate>2006</pubdate><volume>22</volume><fpage>e141</fpage><lpage>149</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btl223</pubid><pubid idtype="pmpid" link="fulltext">16873464</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>CIS: compound importance sampling method for protein-DNA binding site p-value estimation</p></title><aug><au><snm>Barash</snm><fnm>Y</fnm></au><au><snm>Elidan</snm><fnm>G</fnm></au><au><snm>Kaplan</snm><fnm>T</fnm></au><au><snm>Friedman</snm><fnm>N</fnm></au></aug><source>Bioinformatics</source><pubdate>2005</pubdate><volume>21</volume><fpage>596</fpage><lpage>600</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/bti041</pubid><pubid idtype="pmpid" link="fulltext">15454407</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>On counting position weight matrix matches in a sequence, with application to discriminative motif finding</p></title><aug><au><snm>Sinha</snm><fnm>S</fnm></au></aug><source>Bioinformatics</source><pubdate>2006</pubdate><volume>22</volume><fpage>e454</fpage><lpage>463</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btl227</pubid><pubid idtype="pmpid" link="fulltext">16873507</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Sequence logos: a new way to display consensus sequences</p></title><aug><au><snm>Schneider</snm><fnm>TD</fnm></au><au><snm>Stephens</snm><fnm>RM</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>1990</pubdate><volume>18</volume><fpage>6097</fpage><lpage>6100</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/18.20.6097</pubid><pubid idtype="pmcid">332411</pubid><pubid idtype="pmpid">2172928</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>SwissRegulon: a database of genome-wide annotations of 
regulatory sites</p></title><aug><au><snm>Pachkov</snm><fnm>M</fnm></au><au><snm>Erb</snm><fnm>I</fnm></au><au><snm>Molina</snm><fnm>N</fnm></au><au><snm>van Nimwegen</snm><fnm>E</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2007</pubdate><volume>35</volume><fpage>D127</fpage><lpage>131</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkl857</pubid><pubid idtype="pmcid">1716717</pubid><pubid idtype="pmpid">17130146</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Conservation and evolution of cis-regulatory systems in ascomycete fungi</p></title><aug><au><snm>Gasch</snm><fnm>AP</fnm></au><au><snm>Moses</snm><fnm>AM</fnm></au><au><snm>Chiang</snm><fnm>DY</fnm></au><au><snm>Fraser</snm><fnm>HB</fnm></au><au><snm>Berardini</snm><fnm>M</fnm></au><au><snm>Eisen</snm><fnm>MB</fnm></au></aug><source>PLoS Biol</source><pubdate>2004</pubdate><volume>2</volume><fpage>e398</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0020398</pubid><pubid idtype="pmcid">526180</pubid><pubid idtype="pmpid">15534694</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>A library of yeast transcription factor motifs reveals a widespread function for Rsc3 in targeting nucleosome exclusion at promoters</p></title><aug><au><snm>Badis</snm><fnm>G</fnm></au><au><snm>Chan</snm><fnm>ET</fnm></au><au><snm>van Bakel</snm><fnm>H</fnm></au><au><snm>Pena-Castillo</snm><fnm>L</fnm></au><au><snm>Tillo</snm><fnm>D</fnm></au><au><snm>Tsui</snm><fnm>K</fnm></au><au><snm>Carlson</snm><fnm>CD</fnm></au><au><snm>Gossett</snm><fnm>AJ</fnm></au><au><snm>Hasinoff</snm><fnm>MJ</fnm></au><au><snm>Warren</snm><fnm>CL</fnm></au><au><snm>Gebbia</snm><fnm>M</fnm></au><au><snm>Talukder</snm><fnm>S</fnm></au><au><snm>Yang</snm><fnm>A</fnm></au><au><snm>Mnaimneh</snm><fnm>S</fnm></au><au><snm>Terterov</snm><fnm>D</fnm></au><au><snm>Coburn</snm><fnm>D</fnm></au><au><snm>Li 
Yeo</snm><fnm>A</fnm></au><au><snm>Yeo</snm><fnm>ZX</fnm></au><au><snm>Clarke</snm><fnm>ND</fnm></au><au><snm>Lieb</snm><fnm>JD</fnm></au><au><snm>Ansari</snm><fnm>AZ</fnm></au><au><snm>Nislow</snm><fnm>C</fnm></au><au><snm>Hughes</snm><fnm>TR</fnm></au></aug><source>Mol Cell</source><pubdate>2008</pubdate><volume>32</volume><fpage>878</fpage><lpage>887</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.molcel.2008.11.020</pubid><pubid idtype="pmcid">2743730</pubid><pubid idtype="pmpid">19111667</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>MotifVoter: a novel ensemble method for fine-grained integration of generic motif finders</p></title><aug><au><snm>Wijaya</snm><fnm>E</fnm></au><au><snm>Yiu</snm><fnm>SM</fnm></au><au><snm>Son</snm><fnm>NT</fnm></au><au><snm>Kanagasabai</snm><fnm>R</fnm></au><au><snm>Sung</snm><fnm>WK</fnm></au></aug><source>Bioinformatics</source><pubdate>2008</pubdate><volume>24</volume><fpage>2288</fpage><lpage>2295</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btn420</pubid><pubid idtype="pmpid" link="fulltext">18697768</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Transcriptional regulatory code of a eukaryotic 
genome</p></title><aug><au><snm>Harbison</snm><fnm>CT</fnm></au><au><snm>Gordon</snm><fnm>DB</fnm></au><au><snm>Lee</snm><fnm>TI</fnm></au><au><snm>Rinaldi</snm><fnm>NJ</fnm></au><au><snm>Macisaac</snm><fnm>KD</fnm></au><au><snm>Danford</snm><fnm>TW</fnm></au><au><snm>Hannett</snm><fnm>NM</fnm></au><au><snm>Tagne</snm><fnm>JB</fnm></au><au><snm>Reynolds</snm><fnm>DB</fnm></au><au><snm>Yoo</snm><fnm>J</fnm></au><au><snm>Jennings</snm><fnm>EG</fnm></au><au><snm>Zeitlinger</snm><fnm>J</fnm></au><au><snm>Pokholok</snm><fnm>DK</fnm></au><au><snm>Kellis</snm><fnm>M</fnm></au><au><snm>Rolfe</snm><fnm>PA</fnm></au><au><snm>Takusagawa</snm><fnm>KT</fnm></au><au><snm>Lander</snm><fnm>ES</fnm></au><au><snm>Gifford</snm><fnm>DK</fnm></au><au><snm>Fraenkel</snm><fnm>E</fnm></au><au><snm>Young</snm><fnm>RA</fnm></au></aug><source>Nature</source><pubdate>2004</pubdate><volume>431</volume><fpage>99</fpage><lpage>104</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature02800</pubid><pubid idtype="pmpid" link="fulltext">15343339</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>An improved map of conserved regulatory sites for Saccharomyces cerevisiae</p></title><aug><au><snm>MacIsaac</snm><fnm>KD</fnm></au><au><snm>Wang</snm><fnm>T</fnm></au><au><snm>Gordon</snm><fnm>DB</fnm></au><au><snm>Gifford</snm><fnm>DK</fnm></au><au><snm>Stormo</snm><fnm>GD</fnm></au><au><snm>Fraenkel</snm><fnm>E</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2006</pubdate><volume>7</volume><fpage>113</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-7-113</pubid><pubid idtype="pmcid">1435934</pubid><pubid idtype="pmpid">16522208</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>High-resolution DNA-binding specificity analysis of yeast transcription 
factors</p></title><aug><au><snm>Zhu</snm><fnm>C</fnm></au><au><snm>Byers</snm><fnm>KJ</fnm></au><au><snm>McCord</snm><fnm>RP</fnm></au><au><snm>Shi</snm><fnm>Z</fnm></au><au><snm>Berger</snm><fnm>MF</fnm></au><au><snm>Newburger</snm><fnm>DE</fnm></au><au><snm>Saulrieta</snm><fnm>K</fnm></au><au><snm>Smith</snm><fnm>Z</fnm></au><au><snm>Shah</snm><fnm>MV</fnm></au><au><snm>Radhakrishnan</snm><fnm>M</fnm></au><au><snm>Philippakis</snm><fnm>AA</fnm></au><au><snm>Hu</snm><fnm>Y</fnm></au><au><snm>De Masi</snm><fnm>F</fnm></au><au><snm>Pacek</snm><fnm>M</fnm></au><au><snm>Rolfs</snm><fnm>A</fnm></au><au><snm>Murthy</snm><fnm>T</fnm></au><au><snm>Labaer</snm><fnm>J</fnm></au><au><snm>Bulyk</snm><fnm>ML</fnm></au></aug><source>Genome Res</source><pubdate>2009</pubdate><volume>19</volume><fpage>556</fpage><lpage>566</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.090233.108</pubid><pubid idtype="pmcid">2665775</pubid><pubid idtype="pmpid">19158363</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>JASPAR: an open-access database for eukaryotic transcription factor binding profiles</p></title><aug><au><snm>Sandelin</snm><fnm>A</fnm></au><au><snm>Alkema</snm><fnm>W</fnm></au><au><snm>Engstrom</snm><fnm>P</fnm></au><au><snm>Wasserman</snm><fnm>WW</fnm></au><au><snm>Lenhard</snm><fnm>B</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2004</pubdate><volume>32</volume><fpage>D91</fpage><lpage>94</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkh012</pubid><pubid idtype="pmcid">308747</pubid><pubid idtype="pmpid">14681366</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update</p></title><aug><au><snm>Bryne</snm><fnm>JC</fnm></au><au><snm>Valen</snm><fnm>E</fnm></au><au><snm>Tang</snm><fnm>MH</fnm></au><au><snm>Marstrand</snm><fnm>T</fnm></au><au><snm>Winther</snm><fnm>O</fnm></au><au><snm>da 
Piedade</snm><fnm>I</fnm></au><au><snm>Krogh</snm><fnm>A</fnm></au><au><snm>Lenhard</snm><fnm>B</fnm></au><au><snm>Sandelin</snm><fnm>A</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2008</pubdate><volume>36</volume><fpage>D102</fpage><lpage>106</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkm955</pubid><pubid idtype="pmcid">2238834</pubid><pubid idtype="pmpid">18006571</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>A multiparameter network reveals extensive divergence between C. elegans bHLH transcription factors</p></title><aug><au><snm>Grove</snm><fnm>CA</fnm></au><au><snm>De Masi</snm><fnm>F</fnm></au><au><snm>Barrasa</snm><fnm>MI</fnm></au><au><snm>Newburger</snm><fnm>DE</fnm></au><au><snm>Alkema</snm><fnm>MJ</fnm></au><au><snm>Bulyk</snm><fnm>ML</fnm></au><au><snm>Walhout</snm><fnm>AJ</fnm></au></aug><source>Cell</source><pubdate>2009</pubdate><volume>138</volume><fpage>314</fpage><lpage>327</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2009.04.058</pubid><pubid idtype="pmcid">2774807</pubid><pubid idtype="pmpid" link="fulltext">19632181</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals</p></title><aug><au><snm>Xie</snm><fnm>X</fnm></au><au><snm>Lu</snm><fnm>J</fnm></au><au><snm>Kulbokas</snm><fnm>EJ</fnm></au><au><snm>Golub</snm><fnm>TR</fnm></au><au><snm>Mootha</snm><fnm>V</fnm></au><au><snm>Lindblad-Toh</snm><fnm>K</fnm></au><au><snm>Lander</snm><fnm>ES</fnm></au><au><snm>Kellis</snm><fnm>M</fnm></au></aug><source>Nature</source><pubdate>2005</pubdate><volume>434</volume><fpage>338</fpage><lpage>345</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature03441</pubid><pubid idtype="pmcid">2923337</pubid><pubid idtype="pmpid" link="fulltext">15735639</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Information for the Coordinates of Exons (ICE): a human splice sites 
database</p></title><aug><au><snm>Chong</snm><fnm>A</fnm></au><au><snm>Zhang</snm><fnm>G</fnm></au><au><snm>Bajic</snm><fnm>VB</fnm></au></aug><source>Genomics</source><pubdate>2004</pubdate><volume>84</volume><fpage>762</fpage><lpage>766</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.ygeno.2004.05.007</pubid><pubid idtype="pmpid" link="fulltext">15475254</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Diversity and complexity in DNA recognition by transcription factors</p></title><aug><au><snm>Badis</snm><fnm>G</fnm></au><au><snm>Berger</snm><fnm>MF</fnm></au><au><snm>Philippakis</snm><fnm>AA</fnm></au><au><snm>Talukder</snm><fnm>S</fnm></au><au><snm>Gehrke</snm><fnm>AR</fnm></au><au><snm>Jaeger</snm><fnm>SA</fnm></au><au><snm>Chan</snm><fnm>ET</fnm></au><au><snm>Metzler</snm><fnm>G</fnm></au><au><snm>Vedenko</snm><fnm>A</fnm></au><au><snm>Chen</snm><fnm>X</fnm></au><au><snm>Kuznetsov</snm><fnm>H</fnm></au><au><snm>Wang</snm><fnm>CF</fnm></au><au><snm>Coburn</snm><fnm>D</fnm></au><au><snm>Newburger</snm><fnm>DE</fnm></au><au><snm>Morris</snm><fnm>Q</fnm></au><au><snm>Hughes</snm><fnm>TR</fnm></au><au><snm>Bulyk</snm><fnm>ML</fnm></au></aug><source>Science</source><pubdate>2009</pubdate><volume>324</volume><fpage>1720</fpage><lpage>1723</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1162327</pubid><pubid idtype="pmcid">2905877</pubid><pubid idtype="pmpid" link="fulltext">19443739</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence 
preferences</p></title><aug><au><snm>Berger</snm><fnm>MF</fnm></au><au><snm>Badis</snm><fnm>G</fnm></au><au><snm>Gehrke</snm><fnm>AR</fnm></au><au><snm>Talukder</snm><fnm>S</fnm></au><au><snm>Philippakis</snm><fnm>AA</fnm></au><au><snm>Pena-Castillo</snm><fnm>L</fnm></au><au><snm>Alleyne</snm><fnm>TM</fnm></au><au><snm>Mnaimneh</snm><fnm>S</fnm></au><au><snm>Botvinnik</snm><fnm>OB</fnm></au><au><snm>Chan</snm><fnm>ET</fnm></au><au><snm>Khalid</snm><fnm>F</fnm></au><au><snm>Zhang</snm><fnm>W</fnm></au><au><snm>Newburger</snm><fnm>D</fnm></au><au><snm>Jaeger</snm><fnm>SA</fnm></au><au><snm>Morris</snm><fnm>QD</fnm></au><au><snm>Bulyk</snm><fnm>ML</fnm></au><au><snm>Hughes</snm><fnm>TR</fnm></au></aug><source>Cell</source><pubdate>2008</pubdate><volume>133</volume><fpage>1266</fpage><lpage>1276</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2008.05.024</pubid><pubid idtype="pmcid">2531161</pubid><pubid idtype="pmpid">18585359</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities</p></title><aug><au><snm>Berger</snm><fnm>MF</fnm></au><au><snm>Philippakis</snm><fnm>AA</fnm></au><au><snm>Qureshi</snm><fnm>AM</fnm></au><au><snm>He</snm><fnm>FS</fnm></au><au><snm>Estep</snm><fnm>PW</fnm><suf>III</suf></au><au><snm>Bulyk</snm><fnm>ML</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2006</pubdate><volume>24</volume><fpage>1429</fpage><lpage>1435</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt1246</pubid><pubid idtype="pmpid" link="fulltext">16998473</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>UniPROBE: an online database of protein binding microarray data on protein-DNA interactions</p></title><aug><au><snm>Newburger</snm><fnm>DE</fnm></au><au><snm>Bulyk</snm><fnm>ML</fnm></au></aug><source>Nucleic Acids 
Res</source><pubdate>2009</pubdate><volume>37</volume><fpage>D77</fpage><lpage>82</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkn660</pubid><pubid idtype="pmcid">2686578</pubid><pubid idtype="pmpid">18842628</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling</p></title><aug><au><snm>Thijs</snm><fnm>G</fnm></au><au><snm>Lescot</snm><fnm>M</fnm></au><au><snm>Marchal</snm><fnm>K</fnm></au><au><snm>Rombauts</snm><fnm>S</fnm></au><au><snm>De Moor</snm><fnm>B</fnm></au><au><snm>Rouze</snm><fnm>P</fnm></au><au><snm>Moreau</snm><fnm>Y</fnm></au></aug><source>Bioinformatics</source><pubdate>2001</pubdate><volume>17</volume><fpage>1113</fpage><lpage>1122</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/17.12.1113</pubid><pubid idtype="pmpid" link="fulltext">11751219</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>BioJava: an open-source framework for bioinformatics</p></title><aug><au><snm>Holland</snm><fnm>RC</fnm></au><au><snm>Down</snm><fnm>TA</fnm></au><au><snm>Pocock</snm><fnm>M</fnm></au><au><snm>Prlic</snm><fnm>A</fnm></au><au><snm>Huen</snm><fnm>D</fnm></au><au><snm>James</snm><fnm>K</fnm></au><au><snm>Foisy</snm><fnm>S</fnm></au><au><snm>Drager</snm><fnm>A</fnm></au><au><snm>Yates</snm><fnm>A</fnm></au><au><snm>Heuer</snm><fnm>M</fnm></au><au><snm>Schreiber</snm><fnm>MJ</fnm></au></aug><source>Bioinformatics</source><pubdate>2008</pubdate><volume>24</volume><fpage>2096</fpage><lpage>2097</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btn397</pubid><pubid idtype="pmcid">2530884</pubid><pubid idtype="pmpid">18689808</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>MEME: discovering and analyzing DNA and protein sequence 
motifs</p></title><aug><au><snm>Bailey</snm><fnm>TL</fnm></au><au><snm>Williams</snm><fnm>N</fnm></au><au><snm>Misleh</snm><fnm>C</fnm></au><au><snm>Li</snm><fnm>WW</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2006</pubdate><volume>34</volume><fpage>W369</fpage><lpage>373</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkl198</pubid><pubid idtype="pmcid">1538909</pubid><pubid idtype="pmpid">16845028</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes</p></title><aug><au><snm>Liu</snm><fnm>X</fnm></au><au><snm>Brutlag</snm><fnm>DL</fnm></au><au><snm>Liu</snm><fnm>JS</fnm></au></aug><source>Pac Symp Biocomput</source><pubdate>2001</pubdate><fpage>127</fpage><lpage>138</lpage><xrefbib><pubid idtype="pmpid">11262934</pubid></xrefbib></bibl><bibl id="B31"><title><p>Quantifying similarity between motifs</p></title><aug><au><snm>Gupta</snm><fnm>S</fnm></au><au><snm>Stamatoyannopoulos</snm><fnm>JA</fnm></au><au><snm>Bailey</snm><fnm>TL</fnm></au><au><snm>Noble</snm><fnm>WS</fnm></au></aug><source>Genome Biol</source><pubdate>2007</pubdate><volume>8</volume><fpage>R24</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2007-8-2-r24</pubid><pubid idtype="pmcid">1852410</pubid><pubid idtype="pmpid">17324271</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance</p></title><aug><au><snm>Casimiro</snm><fnm>AC</fnm></au><au><snm>Vinga</snm><fnm>S</fnm></au><au><snm>Freitas</snm><fnm>AT</fnm></au><au><snm>Oliveira</snm><fnm>AL</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>89</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-89</pubid><pubid idtype="pmcid">2375121</pubid><pubid idtype="pmpid">18257925</pubid></pubidlist></xrefbib></bibl><bibl 
id="B33"><title><p>Biofilm matrix regulation by Candida albicans Zap1</p></title><aug><au><snm>Nobile</snm><fnm>CJ</fnm></au><au><snm>Nett</snm><fnm>JE</fnm></au><au><snm>Hernday</snm><fnm>AD</fnm></au><au><snm>Homann</snm><fnm>OR</fnm></au><au><snm>Deneault</snm><fnm>JS</fnm></au><au><snm>Nantel</snm><fnm>A</fnm></au><au><snm>Andes</snm><fnm>DR</fnm></au><au><snm>Johnson</snm><fnm>AD</fnm></au><au><snm>Mitchell</snm><fnm>AP</fnm></au></aug><source>PLoS Biol</source><pubdate>2009</pubdate><volume>7</volume><fpage>e1000133</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.1000133</pubid><pubid idtype="pmcid">2688839</pubid><pubid idtype="pmpid">19529758</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm>
</art>