<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2005-6-8-r67</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>Patterns of intron sequence evolution in <it>Drosophila </it>are dependent upon length and GC content</p>
			</title>
			<aug>
				<au id="A1" ca="yes">
					<snm>Haddrill</snm>
					<mi>R</mi>
					<fnm>Penelope</fnm>
					<insr iid="I1"/>
					<email>p.haddrill@ed.ac.uk</email>
				</au>
				<au id="A2">
					<snm>Charlesworth</snm>
					<fnm>Brian</fnm>
					<insr iid="I1"/>
					<email>Brian.Charlesworth@ed.ac.uk</email>
				</au>
				<au id="A3">
					<snm>Halligan</snm>
					<mi>L</mi>
					<fnm>Daniel</fnm>
					<insr iid="I1"/>
					<email>Daniel.Halligan@ed.ac.uk</email>
				</au>
				<au id="A4">
					<snm>Andolfatto</snm>
					<fnm>Peter</fnm>
					<insr iid="I2"/>
					<email>pandolfatto@ucsd.edu</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK</p>
				</ins>
				<ins id="I2">
					<p>Section of Ecology, Behavior and Evolution, Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>8</issue>
			<fpage>R67</fpage>
			<url>http://genomebiology.com/2005/6/8/R67</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">16086849</pubid><pubid idtype="doi">10.1186/gb-2005-6-8-r67</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>4</day>
					<month>3</month>
					<year>2005</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>25</day>
					<month>4</month>
					<year>2005</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>29</day>
					<month>6</month>
					<year>2005</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>27</day>
					<month>7</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>Haddrill et al.; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Patterns of intron sequence evolution in <it>Drosophila</it></p>
		</shorttitle>
		<shortabs>
			<p>An analysis of inter-specific divergence in 225 intron fragments in <it>Drosophila melanogaster </it>and <it>D. simulans </it>reveals strongly negative correlations between intron length and divergence, and between intron divergence and GC content. This suggests that most intronic DNA is evolving under considerable constraint.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Introns comprise a large fraction of eukaryotic genomes, yet little is known about their functional significance. Regulatory elements have been mapped to some introns, though these are believed to account for only a small fraction of genome-wide intronic DNA. No consistent patterns have emerged from studies that have investigated general levels of evolutionary constraint in introns.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>We examine the relationship between intron length and levels of evolutionary constraint by analyzing inter-specific divergence at 225 intron fragments in <it>Drosophila melanogaster </it>and <it>Drosophila simulans</it>, sampled from a broad distribution of intron lengths. We document a strongly negative correlation between intron length and divergence. Interestingly, we also find that divergence in introns is negatively correlated with GC content. This relationship does not account for the correlation between intron length and divergence, however, and may simply reflect local variation in mutational rates or biases.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>Short introns make up only a small fraction of total intronic DNA in the genome. Our finding that long introns evolve more slowly than average implies that, while the majority of introns in the <it>Drosophila </it>genome may experience little or no selective constraint, most intronic DNA in the genome is likely to be evolving under considerable constraint. Our results suggest that functional elements may be ubiquitous within longer introns and that these introns may have a more general role in regulating gene expression than previously appreciated. Our finding that GC content and divergence are negatively correlated in introns has important implications for the interpretation of the correlation between divergence and levels of codon bias observed in <it>Drosophila</it>.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Non-coding DNA makes up a large proportion of the genomes of most eukaryotes, yet little is known about its functional significance and the forces affecting its evolution. The identification of functional regions of the genome has tended to concentrate on coding DNA, yet the recent shift in focus towards non-coding DNA has revealed that introns and intergenic sequences may be subject to considerable levels of selective constraint, implying that they contain functional elements <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. No consistent patterns have emerged from the relatively few studies that have thus far investigated levels of constraint on intron DNA sequences; some studies conclude that such DNA is evolving under little or no selective constraint, while others find considerable levels of constraint (for examples, see <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>). Moreover, the mode of evolution for such types of sequence is still unclear.</p>
			<p>Several recent studies have attempted to estimate the proportion of sites within introns that is subject to selective constraint. For example, Jareborg <it>et al. </it><abbrgrp><abbr bid="B11">11</abbr></abbrgrp> estimate that 23% of intronic sites in mouse-rat genome comparisons are evolutionarily conserved. Similarly, Shabalina and Kondrashov <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> estimate (conservatively) that 17% of nucleotide sites within introns are selectively constrained between <it>Caenorhabditis elegans </it>and <it>Caenorhabditis briggsae</it>; this was at least in part due to their function in splicing, because constraint appeared to be higher at the edges of introns. Likewise, Bergman and Kreitman <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> estimate that 22-26% of non-coding sequences (intergenic and intronic) are highly constrained between <it>Drosophila melanogaster </it>and <it>Drosophila virilis</it>. In contrast to these studies, Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> found that most intronic sites (excluding those necessary for correct splicing) in <it>Drosophila </it>were evolving approximately 17% faster than fourfold synonymous sites. They concluded that these sites were effectively evolving free from selective constraint. The discrepancies among previous studies suggest that no clear conclusions can yet be drawn regarding the levels of selective constraint in non-coding intronic DNA.</p>
			<p>Intron size is one possible factor that may explain these conflicting results. Comeron and Kreitman <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> and others have noted an asymmetrical distribution of intron lengths in <it>D. melanogaster</it>: a large number of short introns clustered around a minimal intron length, and a broader distribution of longer introns (median intron size of 86 base-pairs (bp), mean intron size of 1411 bp; <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). Based on multi-species data for 15 introns (13 short and 2 long), Parsch <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> showed that there were significantly fewer substitutions per site in the two longer introns. He suggested that this pattern may be due to the presence of a greater number of regulatory elements that are subject to purifying selection in longer introns.</p>
			<p>If regulatory elements occur frequently in introns, and these are of some minimal size, it follows that size may be an important factor in intron evolution. In agreement with this prediction, Marais <it>et al. </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp> noted a marginally significant (<it>P </it>= 0.03) negative correlation between intron divergence and size for first introns (but not other introns) in the dataset of Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Marais <it>et al. </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp> suggested that this correlation between divergence and length may be expected for first introns because they are on average two times longer than other introns <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and also tend to contain more known regulatory elements, at least in mammals <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Because the dataset used consisted mostly of short introns, it is unclear whether the pattern they observed is specific to first introns (due to an association between first introns and regulatory elements) and whether the relationship between divergence and size is primarily driven by the fact that first introns are longer. Here we revisit the relationship between intron length and evolutionary constraint (as measured by levels of divergence between <it>D. melanogaster </it>and <it>D. simulans</it>) by combining published data for 225 intron fragments sampled from a much broader distribution of intron lengths and positions within genes.</p>
		</sec>
		<sec>
			<st>
				<p>Results and discussion</p>
			</st>
			<sec>
				<st>
					<p>Levels of divergence are correlated with intron length</p>
				</st>
				<p>We investigated levels of divergence at a total of 225 introns (a mixture of complete short introns and several hundred base-pair fragments of longer introns) scattered across the <it>Drosophila </it>genome. The relationship between intron length and nucleotide divergence for all complete introns and intron fragments surveyed is shown in Figure <figr fid="F1">1</figr>. A strongly negative correlation between intron length and divergence is apparent (Spearman correlation coefficient <it>R</it><sub><it>s </it></sub>= -0.388, <it>P </it>&lt; 10<sup>-4</sup>). We also divided the data into two size classes based on the median intron size of 86 bp in <it>Drosophila </it><abbrgrp><abbr bid="B14">14</abbr></abbrgrp>; small (&#8804;86 bp) introns and large (&gt;86 bp) introns. The large intron class showed significantly lower divergences than the small intron class (Wilcoxon two-sample test statistic W = 17079.5, <it>P </it>&lt; 10<sup>-4</sup>). The correlation between intron length and divergence is somewhat weaker, but still significant within the longer intron class (<it>R</it><sub><it>s </it></sub>= -0.278, <it>P </it>= 0.006).</p>
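				<p>As an illustration of the rank-based statistic used above, the following is a minimal sketch of a Spearman correlation computed as the product-moment correlation of tie-averaged ranks. The function names and the toy values are hypothetical and are not taken from the dataset analyzed here; the sketch does not compute a <it>P </it>value.</p>
				<p><![CDATA[
```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values (intron length in bp, divergence per site).
lengths = [61, 65, 70, 86, 250, 900, 4800]
divergence = [0.12, 0.11, 0.13, 0.10, 0.08, 0.07, 0.05]
rs = spearman(lengths, divergence)
print(round(rs, 3))
```
]]></p>
				<p>A negative value of <it>rs </it>in this sketch corresponds to divergence decreasing as intron length increases.</p>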
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>The relationship between intron length and the level of divergence between <it>D. melanogaster </it>and <it>D. simulans </it>for the combined dataset of 225 introns</p>
					</caption>
					<text>
						<p>The relationship between intron length and the level of divergence between <it>D. melanogaster </it>and <it>D. simulans </it>for the combined dataset of 225 introns. A significantly negative correlation is found for all introns (Spearman correlation coefficient <it>R</it><sub><it>s </it></sub>= -0.388, <it>P </it>&lt; 10<sup>-4</sup>), first introns (<it>R</it><sub><it>s </it></sub>= -0.451, <it>P </it>&lt; 10<sup>-4</sup>) and non-first introns (<it>R</it><sub><it>s </it></sub>= -0.304, <it>P </it>&lt; 10<sup>-4</sup>).</p>
					</text>
					<graphic file="gb-2005-6-8-r67-1"/>
				</fig>
				<p>It has been noted that introns harbouring regulatory elements tend to be first introns <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B8">8</abbr></abbrgrp>, and that first introns tend to be longer in <it>Drosophila </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Thus a relationship between intron size and divergence might only be expected for first introns <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Indeed, previous studies have failed to find evidence of constraint outside first introns <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B18">18</abbr></abbrgrp>. In Figure <figr fid="F1">1</figr>, we show that the strong correlation between divergence and intron length is not specific to first introns (first introns <it>R</it><sub><it>s </it></sub>= -0.451, <it>P </it>&lt; 10<sup>-4</sup>; non-first introns <it>R</it><sub><it>s </it></sub>= -0.304, <it>P </it>&lt; 10<sup>-4</sup>). Mean divergences were not significantly different between first and non-first introns when compared within short and long size classes (Table <tblr tid="T1">1</tblr>). These results suggest that regulatory elements may be common enough across all longer introns that constraint is independent of the position of an intron within a gene.</p>
				<tbl id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Mean divergence and GC content values for each class of DNA</p>
					</caption>
					<tblbdy cols="7">
						<r>
							<c>
								<p/>
							</c>
							<c cspan="3" ca="center">
								<p>Divergence</p>
							</c>
							<c cspan="3" ca="center">
								<p>GC Content</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>All</p>
							</c>
							<c ca="center">
								<p>Short*</p>
							</c>
							<c ca="center">
								<p>Long*</p>
							</c>
							<c ca="center">
								<p>All</p>
							</c>
							<c ca="center">
								<p>Short*</p>
							</c>
							<c ca="center">
								<p>Long*</p>
							</c>
						</r>
						<r>
							<c cspan="7">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Introns</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c indent="1" ca="left">
								<p>All</p>
							</c>
							<c ca="center">
								<p>0.093 (0.004)</p>
							</c>
							<c ca="center">
								<p>0.110 (0.005)</p>
							</c>
							<c ca="center">
								<p>0.070 (0.003)</p>
							</c>
							<c ca="center">
								<p>0.357 (0.006)</p>
							</c>
							<c ca="center">
								<p>0.345 (0.009)</p>
							</c>
							<c ca="center">
								<p>0.371 (0.007)</p>
							</c>
						</r>
						<r>
							<c indent="1" ca="left">
								<p>First</p>
							</c>
							<c ca="center">
								<p>0.101 (0.005)</p>
							</c>
							<c ca="center">
								<p>0.114<sup>&#8224; </sup>(0.006)</p>
							</c>
							<c ca="center">
								<p>0.072<sup>&#8224; </sup>(0.006)</p>
							</c>
							<c ca="center">
								<p>0.361 (0.010)</p>
							</c>
							<c ca="center">
								<p>0.352<sup>&#8224; </sup>(0.013)</p>
							</c>
							<c ca="center">
								<p>0.383<sup>&#8224; </sup>(0.011)</p>
							</c>
						</r>
						<r>
							<c indent="1" ca="left">
								<p>Non-first</p>
							</c>
							<c ca="center">
								<p>0.085 (0.005)</p>
							</c>
							<c ca="center">
								<p>0.105<sup>&#8224; </sup>(0.009)</p>
							</c>
							<c ca="center">
								<p>0.069<sup>&#8224; </sup>(0.004)</p>
							</c>
							<c ca="center">
								<p>0.352 (0.007)</p>
							</c>
							<c ca="center">
								<p>0.337<sup>&#8224; </sup>(0.012)</p>
							</c>
							<c ca="center">
								<p>0.365<sup>&#8224; </sup>(0.008)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Synonymous sites</p>
							</c>
							<c ca="center">
								<p>0.127 (0.019)</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>0.654 (0.014)</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>Values are mean (standard error). *Introns were divided into two classes based on the median intron length (86 bp) [14]: short, &#8804;86 bp; long, &gt;86 bp. <sup>&#8224;</sup>Divergence and GC content values did not differ between first and non-first introns when compared within long and short size classes.</p>
					</tblfn>
				</tbl>
				<p>While this is strong evidence for evolutionary constraint on longer introns, short introns do not appear to evolve much more slowly than synonymous sites in <it>Drosophila</it>. To illustrate this, Figure <figr fid="F2">2</figr> shows average divergence estimates (with two standard errors) for synonymous sites from 102 coding regions <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> compared to those for the small (&#8804;86 bp) and large (&gt;86 bp) size classes of introns. Average divergence at non-synonymous sites <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> is also shown for comparison. Synonymous site divergence is significantly higher than levels of divergence for large introns (Wilcoxon two-sample W = 7745.5, <it>P </it>&lt; 10<sup>-4</sup>) but not small introns (Wilcoxon two-sample W = 15115.5, <it>P </it>= 0.617). This finding is consistent with the conclusions of Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> that introns and synonymous sites evolve at similar rates, given that their dataset contained few long introns. One half of the introns in the genome are less than 86 base-pairs long, but these comprise only about 5% of total intronic DNA in the genome <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Thus, ironically, while the majority of introns in the <it>Drosophila </it>genome may be evolving under little or no selective constraint, most intronic DNA in the genome is likely to be evolving under considerable constraint.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Mean divergences for non-synonymous sites, synonymous sites and both small and large introns</p>
					</caption>
					<text>
						<p>Mean divergences for non-synonymous sites, synonymous sites and both small and large introns. Mean levels of divergence between <it>D. melanogaster </it>and <it>D. simulans </it>for non-synonymous and synonymous sites of coding data, introns &#8804;86 bp and introns &gt;86 bp. Error bars indicate two standard errors. Synonymous site divergence is significantly greater than large (Wilcoxon two-sample test statistic W = 7745.5, <it>P </it>&lt; 10<sup>-4</sup>) but not small (W = 15115.5, <it>P </it>= 0.6173) intron divergences. Small intron divergence is significantly greater than large intron divergence (W = 17079.5, <it>P </it>&lt; 10<sup>-4</sup>).</p>
					</text>
					<graphic file="gb-2005-6-8-r67-2"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Divergence and base composition of introns</p>
				</st>
				<p>Introns are more AT-rich than synonymous sites in <it>Drosophila </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp> (Table <tblr tid="T1">1</tblr>). Could lower levels of divergence then be an artefact of local GC content? There is a significantly negative relationship between divergence and GC content in the intron dataset (<it>R</it><sub><it>s </it></sub>= -0.345, <it>P </it>&lt; 10<sup>-4</sup>) (Figure <figr fid="F3">3a</figr>), and a significantly positive relationship between intron length and GC content (<it>R</it><sub><it>s </it></sub>= 0.237, <it>P </it>&lt; 10<sup>-3</sup>) (Figure <figr fid="F3">3b</figr>). The partial correlation coefficient for divergence versus length, controlling for GC content, is -0.132 (95% bootstrap confidence interval: -0.192/-0.089). The partial correlations for divergence versus GC content (controlling for length) and GC content versus length (controlling for divergence) were -0.292 (-0.410/-0.168) and 0.030 (-0.037/0.120), respectively. These results suggest that the relationship between intron length and divergence is not a confounding effect of GC content, despite the negative correlation between divergence and GC content.</p>
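				<p>For readers unfamiliar with partial correlations, the sketch below shows the standard first-order formula, which removes the linear effect of a third variable from the correlation between two others. It is an assumption that this textbook formula is a fair illustration; the text does not state how the partial correlations or their bootstrap intervals were computed, and the input values below are hypothetical, chosen only to show the arithmetic.</p>
				<p><![CDATA[
```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise correlations (not the values reported in the text).
r_div_len = -0.40  # divergence versus length
r_div_gc = -0.30   # divergence versus GC content
r_len_gc = 0.20    # length versus GC content

# Divergence versus length, controlling for GC content.
r_div_len_given_gc = partial_corr(r_div_len, r_div_gc, r_len_gc)
print(round(r_div_len_given_gc, 3))
```
]]></p>
				<p>If the partial correlation stays clearly below zero after controlling for GC content, as in this toy case, the length effect cannot be attributed to GC content alone.</p>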
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>The relationship between intron fragment GC content and both divergence and length</p>
					</caption>
					<text>
						<p>The relationship between intron fragment GC content and both divergence and length. <b>(a) </b>The relationship between GC content of intron fragments and divergence between <it>D. melanogaster </it>and <it>D. simulans </it>(Spearman correlation coefficient <it>R</it><sub><it>s </it></sub>= -0.345, <it>P </it>&lt; 10<sup>-4</sup>). <b>(b) </b>The relationship between GC content of intron fragments and intron length (<it>R</it><sub><it>s </it></sub>= 0.237, <it>P </it>&lt; 10<sup>-3</sup>).</p>
					</text>
					<graphic file="gb-2005-6-8-r67-3"/>
				</fig>
				<p>Similar to the pattern we observe in introns, a negative association between synonymous site substitution rates and GC content at the third position of codons has previously been noted in <it>Drosophila </it><abbrgrp><abbr bid="B21">21</abbr></abbrgrp> and in mammals <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. This pattern at synonymous sites has been cited as evidence of selection for codon usage bias, as preferred codons are usually GC rich <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B23">23</abbr></abbrgrp>; however, selection on codon usage obviously cannot explain the same pattern in introns. The negative relationship between divergence and GC content in introns might instead reflect local variation in the extent of mutational rates or biases <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B24">24</abbr></abbrgrp>, or the effects of biased gene conversion favouring GC over AT, which mimics the effect of selection in favour of GC nucleotides <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
				<p>The possible role of mutational bias can be examined using the following method. It follows from the standard model of drift and reversible mutation that, if AT mutates to GC at rate <it>u </it>and GC mutates to AT at rate <it>ku</it>, the equilibrium frequency of GC for neutral sites (neglecting polymorphic sites) is approximated by <it>p </it>= 1/(1 + <it>k</it>), and the equilibrium rate of substitutions is <it>K </it>= 2<it>uk</it>/(1+<it>k</it>) <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. This yields the relation <it>K </it>= 2<it>u</it>(1 - <it>p</it>), so that the equilibrium rate of substitution is negatively and linearly related to GC content. This formula predicts that the intercept (divergence at zero GC content) is equal to the absolute value of the slope, and so this hypothesis is testable. The regression coefficient of divergence on GC content in the complete dataset is -0.180 (-0.254/-0.106), and the corresponding intercept is 0.157 (0.115/0.163), which at first sight is consistent with the hypothesis that variation in the level of the mutational bias parameter, <it>k</it>, is sufficient to account for the relation between divergence and GC content.</p>
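				<p>The step from the two equilibrium expressions to the linear relation can be written out explicitly. The derivation below uses only the symbols defined in the preceding paragraph (<it>u</it>, <it>k</it>, <it>p </it>and <it>K</it>) and introduces no further assumptions.</p>
				<p><![CDATA[
```latex
\begin{align}
p &= \frac{1}{1+k}
  \quad\Longrightarrow\quad
  1 - p = \frac{k}{1+k}, \\
K &= \frac{2uk}{1+k}
   = 2u \cdot \frac{k}{1+k}
   = 2u(1-p)
   = 2u - 2u\,p .
\end{align}
```
]]></p>
				<p>Read as a regression of <it>K </it>on <it>p</it>, the intercept is 2<it>u </it>and the slope is -2<it>u</it>, which is why the intercept is predicted to equal the absolute value of the slope.</p>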
				<p>The relationship between divergence and length, however, makes the above test problematic, in view of the wide variation in intron length. If only the 127 short introns (length &#8804; 86 bp) are used, which are much more uniform in length, the regression of divergence on GC content is almost unchanged at -0.116 (-0.207/-0.023), and the intercept is 0.150 (0.142/0.162). Note, however, that there is a significant partial correlation of 0.166 (0.041/0.345) between GC content and length for short introns, but not for long introns, so there is still a residual relation between length and GC content in short introns. While we cannot rule out the possibility that biased gene conversion and/or selection in favour of GC versus AT explains the relationship between GC content and divergence, our analysis suggests that variation in mutational bias may be sufficient. If this process also explains the relationship between synonymous site divergence and GC content, tests for selection on codon bias based on negative correlations between codon bias and divergence (recently discussed by Bierne and Eyre-Walker <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> and Dunn <it>et al</it>. <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>) lose their force. These have been criticized on other theoretical grounds by Eyre-Walker and Bulmer <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>.</p>
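				<p>The prediction that the intercept equals the absolute value of the slope can be checked informally against the reported estimates. The sketch below is one rough way to read that comparison, not the test used in the analysis: it simply asks whether the intercept estimate falls inside the interval of absolute slope values, and it ignores the uncertainty in the intercept and any covariance between the two estimates. The function name is hypothetical.</p>
				<p><![CDATA[
```python
def intercept_matches_slope(slope_ci, intercept):
    """True if the intercept lies within the interval of |slope| values.

    Assumes both bounds of the slope interval have the same sign.
    """
    lo, hi = sorted(abs(bound) for bound in slope_ci)
    return lo <= intercept <= hi


# Slope intervals and intercept estimates as reported in the text.
complete = intercept_matches_slope((-0.254, -0.106), 0.157)
short_only = intercept_matches_slope((-0.207, -0.023), 0.150)
print(complete, short_only)
```
]]></p>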
			</sec>
			<sec>
				<st>
					<p>The density of functional elements in introns</p>
				</st>
				<p>The correlation analyses strongly suggest that longer introns show lower levels of divergence, and that this is not simply caused by mutational rate differences related to GC content, although other sources of mutation rate differences cannot of course be ruled out. So why might longer introns be subject to higher levels of constraint? Introns are known to contain regulatory elements (for examples, see <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>, and see <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> for a recent review of the mammalian literature), so it is possible that longer introns are more constrained because they contain more of these elements.</p>
				<p>Are putative regulatory elements in longer introns discrete entities (such as clusters of binding sites for transcription factors), or is this regulatory function more diffuse? If intronic regulatory elements occur in clusters, surrounded by unconstrained regions, we might expect to find higher levels of divergence in the short, several hundred base-pair regions of very long introns (such as those surveyed here), compared to intermediate-sized introns, provided that they have similar total amounts of regulatory sequences. The rationale for this is that, if constrained regulatory elements are clustered into one region, short fragments of very long introns would be unlikely to coincide by chance with a functional element, whereas similarly sized regions from introns of intermediate length would be more likely to coincide with such elements. Such clustering is possible, given that transcription factor binding sites and regulatory elements can range in size from a few base-pairs up to several hundred base-pairs (for examples, see <abbrgrp><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>). If the proportion of regulatory sequence is similar in long and intermediate introns, however, no difference in mean divergence is expected, but clustering would cause a higher variance in divergence in very long versus intermediate-length introns (after removing the binomial sampling variance). If regulatory elements in introns are widely dispersed, however, there is no reason to expect greater means or variances of divergence in fragments from very long introns. In fact, the mean divergence for the small number of intron fragments from introns longer than 4,500 bp is 0.054 (SE = 0.004, n = 9). 
This is significantly smaller than for the small (&#8804;86 bp) intron class (mean divergence = 0.110, n = 127, Wilcoxon two-sample W = 252, <it>P </it>= 0.001) and marginally significantly lower than for introns of intermediate size (between 87 bp and 4,500 bp: mean divergence = 0.072, n = 89, W = 4494, <it>P </it>= 0.044). The non-binomial standard deviation in divergence is estimated to be 0.0056 for the very long introns, compared with 0.023 for the 38 intermediate-sized ones for which fragments at least 20 bp shorter than the introns were used for estimating divergence (this ensures that both classes represent samples rather than complete sequences). This is the opposite of the pattern expected with strong clustering of regulatory sequences. Levels of constraint, and thus the density of putatively functional regulatory elements, therefore appear to be relatively uniform across longer introns.</p>
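				<p>The idea of a non-binomial standard deviation can be sketched as follows. The exact estimator is not given in the text, so this is an assumption: the observed variance in divergence among fragments is reduced by the mean binomial sampling variance, <it>d</it>(1 - <it>d</it>)/<it>n</it>, expected from the number of sites in each fragment, and the square root of the remainder is taken. The function name and the fragment values are hypothetical.</p>
				<p><![CDATA[
```python
from statistics import mean, variance


def non_binomial_sd(divergences, site_counts):
    """Standard deviation left after subtracting binomial sampling variance.

    divergences: per-site divergence estimate for each fragment
    site_counts: number of aligned sites behind each estimate
    """
    observed = variance(divergences)  # sample variance across fragments
    sampling = mean(d * (1 - d) / n for d, n in zip(divergences, site_counts))
    # Truncate at zero: sampling noise alone can exceed the observed variance.
    return max(observed - sampling, 0.0) ** 0.5


# Hypothetical fragments (not taken from the dataset analyzed here).
divs = [0.02, 0.05, 0.09, 0.12]
sites = [400, 350, 300, 250]
sd = non_binomial_sd(divs, sites)
print(round(sd, 4))
```
]]></p>
				<p>Under this reading, strong clustering of constrained elements would inflate the residual standard deviation among fragments of very long introns, because some fragments would fall on a cluster and others would not.</p>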
				<p>A uniform density of regulatory functions is unexpected if these often involve clusters of, for example, transcription factor binding sites. However, it might be expected, for example, if the regulatory functions of introns often involve the formation of complex secondary structures. Evidence suggesting that intron sequence and length affect the secondary structure of precursor messenger RNA (pre-mRNA) is accumulating. If this secondary structure plays a regulatory role, it is likely to be conserved. Several studies have found evidence for epistatic selection on introns to maintain pre-mRNA secondary structure <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp>, and there is also evidence for a functional role of RNA secondary structure in splicing <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp> and gene expression <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. For example, Chen and Stephan <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> found that mutations disrupting a hairpin structure in intron 1 of the <it>D. melanogaster Adh </it>gene reduce splicing efficiency and decrease production of the <it>Adh </it>protein. These authors show that compensatory mutations that restore the secondary structure result in a mutant indistinguishable from the wild type in splicing efficiency and protein production. A hairpin structure in the second intron of this gene also shows striking structural conservation across ten species in three sub-genera of <it>Drosophila </it><abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Our finding that the density of constrained sequences does not appear to be a function of intron length (within the long intron class) suggests that pre-mRNA secondary structure may be a more common mechanism mediating gene regulation than discrete regulatory elements such as intronic transcriptional enhancers.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>Most introns in <it>Drosophila </it>are relatively short, but these short introns make up only a small fraction of total intronic DNA in the genome. We demonstrate that levels of selective constraint increase with intron length. Thus, while the majority of introns in the <it>Drosophila </it>genome may be evolving under little or no selective constraint, the majority of intronic DNA in the genome is likely to be evolving under considerable constraint. We also find that the density of functionally important elements within longer introns does not appear to depend on their length. This suggests that functional elements may be ubiquitous within longer introns and that these introns may have a more general role in regulating gene expression than previously appreciated, possibly via the formation of pre-mRNA secondary structures. This pattern contrasts with that found in mammals, where constraint does not appear to be a function of intron length <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> (A Kondrashov, personal communication). An unexpected corollary of our study is the finding of a negative correlation between divergence and GC content in introns. This finding implies that a similar pattern observed for synonymous sites in <it>Drosophila </it>may reflect mutational biases rather than selection for codon usage.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Introns</p>
				</st>
				<p>We combined data from three recent studies of complete introns or several hundred base-pair fragments of longer introns located on the X chromosome of <it>D. melanogaster</it>. Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> compiled previously published data for <it>D. melanogaster </it>and <it>D. simulans </it>sequences for each of 163 introns. We combined these data with introns surveyed in <it>D. melanogaster </it>and <it>D. simulans </it>by Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. All the Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> intron fragments were compared to the DNA sequence of the <it>D. melanogaster </it>genome <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Ten of these intron fragments were removed from the analysis because they contained exonic or 5'/3' untranslated region sequences. The alignments for a further 12 of the Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> fragments were trimmed to remove small quantities of exonic or untranslated region sequences. The final Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> dataset used in the analysis therefore contained 53 intron fragments (details on request to PR Haddrill). To this we added nine more intron fragments surveyed by Haddrill <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. For consistency with Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, we realigned these sequences with the program MCALIGN, using the insertion-deletion frequency model defined for <it>Drosophila </it>intronic DNA <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. 
Divergence estimates per site and the GC content of introns were generated for each alignment (excluding the first 6 bp at the 5' end and the last 16 bp at the 3' end of each intron, which include bases that are constrained because they are required for correct splicing) using the DnaSP software package (Version 4) <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, which corrects divergence values for multiple hits using the Jukes-Cantor equation <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>. The use of divergence as a proxy for constraint is appropriate, because the level of selective constraint in a sequence will directly affect the divergence between two species; highly constrained sequences will show little divergence, whereas sequences under little or no selective constraint will accumulate differences more rapidly. Sites overlapping alignment gaps were excluded from the count of total base-pairs. The total length of each intron was determined using the DNA sequence of the <it>D. melanogaster </it>genome <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. The mean total intron length across the entire dataset was 936.5 bp and the mean length of the fragments of introns analyzed here was 230.2 bp.</p>
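				<p>As an illustration only, the per-alignment quantities described above can be sketched in a few lines of Python. This is not the DnaSP implementation; the function names, the choice to take GC content from the first sequence, and the assumption that the trimmed columns at each end contain no gaps are ours, introduced for the example. The Jukes-Cantor distance is computed as d = -(3/4) ln(1 - 4p/3), where p is the proportion of ungapped sites that differ.</p>
				<p>
```python
import math


def trim_splice_regions(seq_a, seq_b, five_prime=6, three_prime=16):
    """Drop splice-site columns from both ends of a full-intron alignment.

    Assumption for this sketch: the alignment spans the whole intron and
    the trimmed columns contain no gaps.
    """
    end = len(seq_a) - three_prime
    return seq_a[five_prime:end], seq_b[five_prime:end]


def gc_and_divergence(seq_a, seq_b, gap="-"):
    """Return (GC fraction, raw p-distance, Jukes-Cantor distance).

    Columns with a gap in either sequence are excluded from all counts.
    GC content is taken from seq_a (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    sites = 0
    diffs = 0
    gc = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # site overlaps an alignment gap
        sites += 1
        if a != b:
            diffs += 1
        if a in ("G", "C"):
            gc += 1
    if sites == 0:
        raise ValueError("no ungapped sites to compare")
    p = diffs / sites
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    # Jukes-Cantor correction for multiple hits.
    d = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
    return gc / sites, p, d
```
				</p>
				<p>For a complete intron, the two functions would be chained: trim the alignment first, then pass the trimmed pair to the divergence function. For fragments that do not include the intron ends, the trimming step would be skipped.</p>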
				<p>Because we did not analyze the entire length of all of the introns included in this study, we were unable to investigate whether intron lengths vary substantially between <it>D. melanogaster </it>and <it>D. simulans</it>. Previous evidence suggests, however, that intron lengths are unlikely to differ greatly between the two species and that transitions between the short and long intron size classes are rare <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B20">20</abbr><abbr bid="B54">54</abbr></abbrgrp>.</p>
				<p>Partial moment correlation coefficients and least-squares regression coefficients were calculated by the standard formulae, and their significance assessed by bootstrapping over loci 1,000 times to obtain their resampling distributions <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>.</p>
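				<p>A minimal sketch of this resampling scheme is given here for orientation; it is not the program supplied in the additional data files. It assumes a first-order partial correlation (two variables with a third held constant) and summarizes the resampling distribution with a 95% percentile interval. The function names and the interval choice are assumptions made for the example.</p>
				<p>
```python
import math
import random


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y with z held constant."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def bootstrap_partial_corr(x, y, z, reps=1000, seed=0):
    """Resample loci with replacement; return (estimate, lower, upper)."""
    rng = random.Random(seed)
    n = len(x)
    resampled = []
    for _ in range(reps):
        # Each locus keeps its x, y and z values together when resampled.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            resampled.append(partial_corr([x[i] for i in idx],
                                          [y[i] for i in idx],
                                          [z[i] for i in idx]))
        except (ZeroDivisionError, ValueError):
            continue  # degenerate resample with no usable variance
    if not resampled:
        raise ValueError("all bootstrap resamples were degenerate")
    resampled.sort()
    m = len(resampled)
    lower = resampled[int(0.025 * m)]
    upper = resampled[min(m - 1, int(0.975 * m))]
    return partial_corr(x, y, z), lower, upper
```
				</p>
				<p>In this sketch, x, y and z could hold, for example, divergence, intron length and GC content for each locus; a coefficient would be judged significant if the percentile interval excludes zero.</p>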
			</sec>
			<sec>
				<st>
					<p>Coding regions</p>
				</st>
				<p>As a comparison for levels of divergence at intron sites, we used synonymous site divergences from 102 genes compiled by Betancourt and Presgraves <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Single-pass sequenced ESTs from this same study were not included in the analysis. Estimates of synonymous site divergences calculated using the Nei and Gojobori <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> correction were kindly provided by A Betancourt. Divergence estimates for synonymous sites based on <it>D. melanogaster </it>- <it>D. simulans </it>alignments for 35 additional X-linked coding regions were identical, and did not differ significantly from divergence estimates for fourfold degenerate sites (P Andolfatto, unpublished data). Several previous studies have documented a positive relationship between exon length and synonymous site divergences <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr></abbrgrp>. This relationship is in the opposite direction to that which would be expected if there were some (unknown) factor co-varying with gene length and neutral divergence that was responsible for the negative association between intron length and intron divergence. Non-synonymous site divergences from the same 102 genes compiled by Betancourt and Presgraves <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> (kindly provided by A Betancourt) were also used in Figure <figr fid="F2">2</figr> for visual comparison with synonymous and intron sites; as expected, these are smaller than the other values, consistent with strong selection against most amino acid substitutions.</p>
			</sec>
			<sec>
				<st>
					<p>Effects of sex linkage</p>
				</st>
				<p>As our data come from three different sources, we investigated possible biases relating to how and why the data were collected. In particular, the studies of Haddrill <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp> and Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> surveyed intron fragments from longer introns on the X chromosome, whereas the data of Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> contain mostly short introns from all chromosomes. We note a significant difference between autosomal and X-linked introns in both levels of divergence (Wilcoxon two-sample W = 13502.5, <it>P </it>= 0.006) and GC content (W = 13211.5, <it>P </it>= 0.005). When comparing within size classes (&#8804;86 bp versus &gt;86 bp), however, levels of divergence are not significantly different between autosomal and X-linked introns, and GC content is significantly different for the short intron class, but not the long intron class. The negative correlation between intron length and divergence holds for autosomal and X-linked introns separately (autosomes, Spearman <it>R</it><sub><it>s </it></sub>= -0.261, <it>P </it>= 0.006; X-linked, Spearman <it>R</it><sub><it>s </it></sub>= -0.403, <it>P </it>&lt; 10<sup>-4</sup>) as does the negative relationship between GC content and divergence (autosomes, Spearman <it>R</it><sub><it>s </it></sub>= -0.281, <it>P </it>= 0.003; X-linked, Spearman <it>R</it><sub><it>s </it></sub>= -0.371, <it>P </it>&lt; 10<sup>-4</sup>). The differences in levels of divergence and GC content between autosomal and X-linked introns, therefore, cannot explain the observed relationships between intron length and divergence or between GC content and divergence.</p>
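				<p>The stratified comparison above amounts to computing a Spearman rank correlation separately within each chromosome class. A minimal sketch follows, assuming tied values receive the average of their ranks and that each record is a hypothetical (group, length, divergence) tuple; it does not compute P values and is not the code used for the analyses reported here.</p>
				<p>
```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while len(order) > i:
        j = i
        # Extend j to the end of the current run of tied values.
        while len(order) > j + 1 and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: product-moment correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def spearman_by_group(records):
    """Compute the rank correlation separately for each group label."""
    groups = {}
    for group, length, divergence in records:
        xs, ys = groups.setdefault(group, ([], []))
        xs.append(length)
        ys.append(divergence)
    return {g: spearman(xs, ys) for g, (xs, ys) in groups.items()}
```
				</p>
				<p>If the correlation keeps the same sign within every group, as reported above for autosomal and X-linked introns, a difference between the groups cannot by itself account for the overall relationship.</p>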
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional data are available with the online version of this paper. <supplr sid="S1">Additional data file 1</supplr> is an Excel file listing all introns analyzed. Additional data files <supplr sid="S2">2</supplr>, <supplr sid="S3">3</supplr> and <supplr sid="S4">4</supplr> contain alignments of the Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>, Haddrill <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp> and Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> data, respectively. <supplr sid="S5">Additional data file 5</supplr> contains programs written to carry out partial moment correlations, least-squares regressions and bootstrapping procedures and the data used for these analyses.</p>
			<suppl id="S1">
				<title>
					<p>Additional File 1</p>
				</title>
				<caption>
					<p>An Excel file listing all introns analyzed</p>
				</caption>
				<text>
					<p>An Excel file listing all introns analyzed</p>
				</text>
				<file name="gb-2005-6-8-r67-S1.xls">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S2">
				<title>
					<p>Additional File 2</p>
				</title>
				<caption>
					<p>Alignments of the Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> data</p>
				</caption>
				<text>
					<p>Alignments of the Glinka <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp> data</p>
				</text>
				<file name="gb-2005-6-8-r67-S2.zip">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S3">
				<title>
					<p>Additional File 3</p>
				</title>
				<caption>
					<p>Alignments of the Haddrill <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp> data</p>
				</caption>
				<text>
					<p>Alignments of the Haddrill <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp> data</p>
				</text>
				<file name="gb-2005-6-8-r67-S3.zip">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S4">
				<title>
					<p>Additional File 4</p>
				</title>
				<caption>
					<p>Alignments of the Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> data</p>
				</caption>
				<text>
					<p>Alignments of the Halligan <it>et al. </it><abbrgrp><abbr bid="B9">9</abbr></abbrgrp> data</p>
				</text>
				<file name="gb-2005-6-8-r67-S4.zip">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S5">
				<title>
					<p>Additional File 5</p>
				</title>
				<caption>
					<p>Programs written to carry out partial moment correlations, least-squares regressions and bootstrapping procedures and the data used for these analyses</p>
				</caption>
				<text>
					<p>Programs written to carry out partial moment correlations, least-squares regressions and bootstrapping procedures and the data used for these analyses</p>
				</text>
				<file name="gb-2005-6-8-r67-S5.zip">
					<p>Click here for file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We thank A Betancourt for providing divergence estimates for the Betancourt and Presgraves <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> dataset. We thank D Bachtrog, M Przeworski, K Dyer, F Kondrashov and D Presgraves for comments on the manuscript. This work was funded in part by a Biotechnology and Biological Sciences Research Council Grant (to PA and BC) and an AP Sloan Fellowship in Molecular and Computational Biology to PA. BC is supported by The Royal Society.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Conserved noncoding sequences are reliable guides to regulatory elements.</p>
				</title>
				<aug>
					<au>
						<snm>Hardison</snm>
						<fnm>RC</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2000</pubdate>
				<volume>16</volume>
				<fpage>369</fpage>
				<lpage>372</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(00)02081-3</pubid>
						<pubid idtype="pmpid" link="fulltext">10973062</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>The search for meaning in noncoding DNA.</p>
				</title>
				<aug>
					<au>
						<snm>Clark</snm>
						<fnm>AG</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>1319</fpage>
				<lpage>1320</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.201601</pubid>
						<pubid idtype="pmpid" link="fulltext">11483570</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Bergman</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Kreitman</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>1335</fpage>
				<lpage>1345</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.178701</pubid>
						<pubid idtype="pmpid" link="fulltext">11483574</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Selective constraint in intergenic regions of human and mouse genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Ogurtsov</snm>
						<fnm>AY</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>VA</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>373</fpage>
				<lpage>376</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(01)02344-7</pubid>
						<pubid idtype="pmpid" link="fulltext">11418197</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Numerous potentially functional but non-genic conserved sequences on human chromosome 21.</p>
				</title>
				<aug>
					<au>
						<snm>Dermitzakis</snm>
						<fnm>ET</fnm>
					</au>
					<au>
						<snm>Reymond</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Lyle</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Scamuffa</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Ucla</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Deutsch</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Stevenson</snm>
						<fnm>BJ</fnm>
					</au>
					<au>
						<snm>Flegel</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Bucher</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Jongeneel</snm>
						<fnm>CV</fnm>
					</au>
					<au>
						<snm>Antonarakis</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>420</volume>
				<fpage>578</fpage>
				<lpage>582</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01251</pubid>
						<pubid idtype="pmpid" link="fulltext">12466853</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Unexpected conserved non-coding DNA blocks in mammals.</p>
				</title>
				<aug>
					<au>
						<snm>Gaffney</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Keightley</snm>
						<fnm>PD</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<fpage>332</fpage>
				<lpage>337</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.tig.2004.06.011</pubid>
						<pubid idtype="pmpid" link="fulltext">15262402</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<aug>
					<au>
						<snm>Li</snm>
						<fnm>W-H</fnm>
					</au>
					<au>
						<snm>Graur</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Fundamentals of Molecular Evolution</source>
				<publisher>Sunderland, Massachusetts: Sinauer</publisher>
				<pubdate>1991</pubdate>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Distribution and characterization of regulatory elements in the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Majewski</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ott</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>1827</fpage>
				<lpage>1836</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">187578</pubid>
						<pubid idtype="pmpid" link="fulltext">12466286</pubid>
						<pubid idtype="doi">10.1101/gr.606402</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Patterns of evolutionary constraints in intronic and intergenic DNA of <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Halligan</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Eyre-Walker</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Andolfatto</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Keightley</snm>
						<fnm>PD</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>273</fpage>
				<lpage>279</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">327102</pubid>
						<pubid idtype="pmpid" link="fulltext">14762063</pubid>
						<pubid idtype="doi">10.1101/gr.1329204</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Genome sequence of the Brown Norway rat yields insights into mammalian evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Gibbs</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Weinstock</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Metzker</snm>
						<fnm>ML</fnm>
					</au>
					<au>
						<snm>Muzny</snm>
						<fnm>DM</fnm>
					</au>
					<au>
						<snm>Sodergren</snm>
						<fnm>EJ</fnm>
					</au>
					<au>
						<snm>Scherer</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Scott</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Steffen</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Worley</snm>
						<fnm>KC</fnm>
					</au>
					<au>
						<snm>Burch</snm>
						<fnm>PE</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2004</pubdate>
				<volume>428</volume>
				<fpage>493</fpage>
				<lpage>521</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature02426</pubid>
						<pubid idtype="pmpid" link="fulltext">15057822</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs.</p>
				</title>
				<aug>
					<au>
						<snm>Jareborg</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Durbin</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>1999</pubdate>
				<volume>9</volume>
				<fpage>815</fpage>
				<lpage>824</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">310816</pubid>
						<pubid idtype="pmpid" link="fulltext">10508839</pubid>
						<pubid idtype="doi">10.1101/gr.9.9.815</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Pattern of selective constraint in <it>C. elegans </it>and <it>C. briggsae </it>genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Genet Res Camb</source>
				<pubdate>1999</pubdate>
				<volume>74</volume>
				<fpage>23</fpage>
				<lpage>30</lpage>
			</bibl>
			<bibl id="B13">
				<title>
					<p>The correlation between intron length and recombination in Drosophila: dynamic equilibrium between mutational and selective forces.</p>
				</title>
				<aug>
					<au>
						<snm>Comeron</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Kreitman</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2000</pubdate>
				<volume>156</volume>
				<fpage>1175</fpage>
				<lpage>1190</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11063693</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Minimal introns are not "junk".</p>
				</title>
				<aug>
					<au>
						<snm>Yu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Kibukawa</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Paddock</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Passey</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Wong</snm>
						<fnm>GK-S</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>1185</fpage>
				<lpage>1189</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">186636</pubid>
						<pubid idtype="pmpid" link="fulltext">12176926</pubid>
						<pubid idtype="doi">10.1101/gr.224602</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Selective constraints on intron evolution in <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Parsch</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2003</pubdate>
				<volume>165</volume>
				<fpage>1843</fpage>
				<lpage>1851</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">14704170</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Intron size and exon evolution in Drosophila.</p>
				</title>
				<aug>
					<au>
						<snm>Marais</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Nouvellet</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Keightley</snm>
						<fnm>PD</fnm>
					</au>
					<au>
						<snm>Charlesworth</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2005</pubdate>
				<volume>170</volume>
				<fpage>481</fpage>
				<lpage>485</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1534/genetics.104.037333</pubid>
						<pubid idtype="pmpid" link="fulltext">15781704</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Why do genes have introns? Recombination might add a new piece to the puzzle.</p>
				</title>
				<aug>
					<au>
						<snm>Duret</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>172</fpage>
				<lpage>175</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(01)02236-3</pubid>
						<pubid idtype="pmpid" link="fulltext">11275306</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Functional constraints and frequency of deleterious mutations in noncoding DNA of rodents.</p>
				</title>
				<aug>
					<au>
						<snm>Keightley</snm>
						<fnm>PD</fnm>
					</au>
					<au>
						<snm>Gaffney</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>13402</fpage>
				<lpage>13406</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">263826</pubid>
						<pubid idtype="pmpid" link="fulltext">14597721</pubid>
						<pubid idtype="doi">10.1073/pnas.2233252100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Linkage limits the power of natural selection in <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Betancourt</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Presgraves</snm>
						<fnm>DC</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>13616</fpage>
				<lpage>13620</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">129723</pubid>
						<pubid idtype="pmpid" link="fulltext">12370444</pubid>
						<pubid idtype="doi">10.1073/pnas.212277199</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Molecular evolution between <it>Drosophila melanogaster </it>and <it>D. simulans</it>: reduced codon bias, faster rates of amino acid substitution, and larger proteins in <it>D. melanogaster</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Akashi</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1996</pubdate>
				<volume>144</volume>
				<fpage>1297</fpage>
				<lpage>1307</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8913769</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Codon usage bias and base composition of nuclear genes in Drosophila.</p>
				</title>
				<aug>
					<au>
						<snm>Moriyama</snm>
						<fnm>EN</fnm>
					</au>
					<au>
						<snm>Hartl</snm>
						<fnm>DL</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1993</pubdate>
				<volume>134</volume>
				<fpage>847</fpage>
				<lpage>858</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8349115</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Why the rate of silent codon substitutions is variable within a vertebrate's genome.</p>
				</title>
				<aug>
					<au>
						<snm>Filipski</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>J Theor Biol</source>
				<pubdate>1988</pubdate>
				<volume>134</volume>
				<fpage>159</fpage>
				<lpage>164</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmpid">3244279</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Synonymous codon usage in <it>Drosophila melanogaster</it>: natural selection and translational accuracy.</p>
				</title>
				<aug>
					<au>
						<snm>Akashi</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1994</pubdate>
				<volume>136</volume>
				<fpage>927</fpage>
				<lpage>935</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmpid" link="fulltext">8005445</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Mutation rates differ among regions of the mammalian genome.</p>
				</title>
				<aug>
					<au>
						<snm>Wolfe</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Sharp</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>W-H</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1989</pubdate>
				<volume>337</volume>
				<fpage>283</fpage>
				<lpage>285</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/337283a0</pubid>
						<pubid idtype="pmpid" link="fulltext">2911369</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Evolution of a finite population under gene conversion.</p>
				</title>
				<aug>
					<au>
						<snm>Nagylaki</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1983</pubdate>
				<volume>80</volume>
				<fpage>6278</fpage>
				<lpage>6281</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">394279</pubid>
						<pubid idtype="pmpid" link="fulltext">6578508</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Synonymous substitution rates in enterobacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Eyre-Walker</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Bulmer</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1995</pubdate>
				<volume>140</volume>
				<fpage>1407</fpage>
				<lpage>1412</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">7498779</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Directional mutation pressure, mutator mutations, and dynamics of molecular evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Sueoka</snm>
						<fnm>N</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1993</pubdate>
				<volume>37</volume>
				<fpage>137</fpage>
				<lpage>153</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8411203</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>The problem of counting sites in the estimation of the synonymous and nonsynonymous substitution rates: Implications for the correlation between the synonymous substitution rate and codon usage bias.</p>
				</title>
				<aug>
					<au>
						<snm>Bierne</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Eyre-Walker</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2003</pubdate>
				<volume>165</volume>
				<fpage>1587</fpage>
				<lpage>1597</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">14668405</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Substitution rates in <it>Drosophila </it>nuclear genes: Implications for translational selection.</p>
				</title>
				<aug>
					<au>
						<snm>Dunn</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Bielawski</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>ZH</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2001</pubdate>
				<volume>157</volume>
				<fpage>295</fpage>
				<lpage>305</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11139510</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Deformed expression in the <it>Drosophila </it>central nervous system is controlled by an autoactivated intronic enhancer.</p>
				</title>
				<aug>
					<au>
						<snm>Lou</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Bergson</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>McGinnis</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1995</pubdate>
				<volume>23</volume>
				<fpage>3481</fpage>
				<lpage>3487</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">307227</pubid>
						<pubid idtype="pmpid">7567459</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Regulation of the expression of the sn-glycerol-3-phosphate dehydrogenase gene in <it>Drosophila melanogaster</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Bartoszewski</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>JB</fnm>
					</au>
				</aug>
				<source>Biochem Genet</source>
				<pubdate>1998</pubdate>
				<volume>36</volume>
				<fpage>329</fpage>
				<lpage>350</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1023/A:1018745412966</pubid>
						<pubid idtype="pmpid">9919359</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>The mammalian transcriptome and the function of non-coding DNA sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Spiridonov</snm>
						<fnm>NA</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2004</pubdate>
				<volume>5</volume>
				<fpage>105</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">395773</pubid>
						<pubid idtype="pmpid" link="fulltext">15059247</pubid>
						<pubid idtype="doi">10.1186/gb-2004-5-4-105</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the <it>Drosophila </it>genome.</p>
				</title>
				<aug>
					<au>
						<snm>Berman</snm>
						<fnm>BP</fnm>
					</au>
					<au>
						<snm>Nibu</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Pfeiffer</snm>
						<fnm>BD</fnm>
					</au>
					<au>
						<snm>Tomancak</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Levine</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>MB</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>757</fpage>
				<lpage>762</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">117378</pubid>
						<pubid idtype="pmpid" link="fulltext">11805330</pubid>
						<pubid idtype="doi">10.1073/pnas.231608898</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Tracing the evolutionary history of <it>Drosophila </it>regulatory regions with models that identify transcription factor binding sites.</p>
				</title>
				<aug>
					<au>
						<snm>Dermitzakis</snm>
						<fnm>ET</fnm>
					</au>
					<au>
						<snm>Bergman</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Clark</snm>
						<fnm>AG</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2003</pubdate>
				<volume>20</volume>
				<fpage>703</fpage>
				<lpage>714</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/molbev/msg077</pubid>
						<pubid idtype="pmpid" link="fulltext">12679540</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>CONREAL: Conserved regulatory elements anchored alignment algorithm for identification of transcription factor binding sites by phylogenetic footprinting.</p>
				</title>
				<aug>
					<au>
						<snm>Berezikov</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Guryev</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Plasterk</snm>
						<fnm>RHA</fnm>
					</au>
					<au>
						<snm>Cuppen</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>170</fpage>
				<lpage>178</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">314294</pubid>
						<pubid idtype="pmpid" link="fulltext">14672977</pubid>
						<pubid idtype="doi">10.1101/gr.1642804</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p><it>Drosophila </it>DNase I footprint database: a systematic genome annotation of transcription factor binding sites in the fruitfly, <it>Drosophila melanogaster</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Bergman</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Carlson</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>21</volume>
				<fpage>1747</fpage>
				<lpage>1749</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bti173</pubid>
						<pubid idtype="pmpid" link="fulltext">15572468</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Estimates of linkage disequilibrium and the recombination parameter determined from segregating nucleotide sites in the alcohol dehydrogenase region of <it>Drosophila pseudoobscura</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Schaeffer</snm>
						<fnm>SW</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>EL</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1993</pubdate>
				<volume>135</volume>
				<fpage>541</fpage>
				<lpage>552</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8244013</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Maintenance of pre-mRNA secondary structure by epistatic selection.</p>
				</title>
				<aug>
					<au>
						<snm>Kirby</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Muse</snm>
						<fnm>SV</fnm>
					</au>
					<au>
						<snm>Stephan</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1995</pubdate>
				<volume>92</volume>
				<fpage>9047</fpage>
				<lpage>9051</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">40921</pubid>
						<pubid idtype="pmpid" link="fulltext">7568070</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Sequence variation of alcohol dehydrogenase (<it>Adh</it>) paralogs in cactophilic <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Matzkin</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Eanes</snm>
						<fnm>WF</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2003</pubdate>
				<volume>163</volume>
				<fpage>181</fpage>
				<lpage>194</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12586706</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Alternative splicing caused by RNA secondary structure.</p>
				</title>
				<aug>
					<au>
						<snm>Solnick</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1985</pubdate>
				<volume>43</volume>
				<fpage>667</fpage>
				<lpage>676</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0092-8674(85)90239-9</pubid>
						<pubid idtype="pmpid">4075405</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Constraints on intron evolution in the gene encoding the Myosin alkali light chain in <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Leicht</snm>
						<fnm>BG</fnm>
					</au>
					<au>
						<snm>Muse</snm>
						<fnm>SV</fnm>
					</au>
					<au>
						<snm>Hanczyc</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Clark</snm>
						<fnm>AG</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1995</pubdate>
				<volume>139</volume>
				<fpage>299</fpage>
				<lpage>308</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">7535717</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Translation inhibition by an mRNA coding region secondary structure is determined by its proximity to the AUG initiation codon.</p>
				</title>
				<aug>
					<au>
						<snm>Liebhaber</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Cash</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Eshleman</snm>
						<fnm>SS</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1992</pubdate>
				<volume>226</volume>
				<fpage>609</fpage>
				<lpage>621</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0022-2836(92)90619-U</pubid>
						<pubid idtype="pmpid">1507219</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>The relationship between third-codon position nucleotide content, codon bias, mRNA secondary structure and gene expression in the Drosophilid alcohol dehydrogenase genes <it>Adh </it>and <it>Adhr</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Carlini</snm>
						<fnm>DB</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Stephan</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2001</pubdate>
				<volume>159</volume>
				<fpage>623</fpage>
				<lpage>633</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11606539</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Compensatory evolution of a precursor messenger RNA secondary structure in the <it>Drosophila melanogaster Adh </it>gene.</p>
				</title>
				<aug>
					<au>
						<snm>Chen</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Stephan</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>11499</fpage>
				<lpage>11504</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">208787</pubid>
						<pubid idtype="pmpid" link="fulltext">12972637</pubid>
						<pubid idtype="doi">10.1073/pnas.1932834100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p>RNA folding in <it>Drosophila </it>shows a distance effect for compensatory fitness interactions.</p>
				</title>
				<aug>
					<au>
						<snm>Stephan</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Kirby</snm>
						<fnm>DA</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1993</pubdate>
				<volume>135</volume>
				<fpage>97</fpage>
				<lpage>103</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8224831</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Indel-based evolutionary distance and mouse-human divergence.</p>
				</title>
				<aug>
					<au>
						<snm>Ogurtsov</snm>
						<fnm>AY</fnm>
					</au>
					<au>
						<snm>Sunyaev</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>1610</fpage>
				<lpage>1616</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">509270</pubid>
						<pubid idtype="pmpid" link="fulltext">15289479</pubid>
						<pubid idtype="doi">10.1101/gr.2450504</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>Demography and natural selection have shaped genetic variation in <it>Drosophila melanogaster</it>: A multi-locus approach.</p>
				</title>
				<aug>
					<au>
						<snm>Glinka</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ometto</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Mousset</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Stephan</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>De Lorenzo</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2003</pubdate>
				<volume>165</volume>
				<fpage>1269</fpage>
				<lpage>1278</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">14668381</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>FlyBase: A database of the <it>Drosophila </it>genome</p>
				</title>
				<url>http://www.flybase.org</url>
			</bibl>
			<bibl id="B49">
				<title>
					<p>Multilocus patterns of nucleotide variability and the demographic and selection history of <it>Drosophila melanogaster </it>populations.</p>
				</title>
				<aug>
					<au>
						<snm>Haddrill</snm>
						<fnm>PR</fnm>
					</au>
					<au>
						<snm>Thornton</snm>
						<fnm>KR</fnm>
					</au>
					<au>
						<snm>Charlesworth</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Andolfatto</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2005</pubdate>
				<volume>15</volume>
				<fpage>790</fpage>
				<lpage>799</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.3541005</pubid>
						<pubid idtype="pmpid" link="fulltext">15930491</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>MCALIGN: stochastic alignment of noncoding DNA sequences based on an evolutionary model of sequence evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Keightley</snm>
						<fnm>PD</fnm>
					</au>
					<au>
						<snm>Johnson</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>442</fpage>
				<lpage>450</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">353231</pubid>
						<pubid idtype="pmpid" link="fulltext">14993209</pubid>
						<pubid idtype="doi">10.1101/gr.1571904</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p>MCALIGN for alignment of noncoding DNA</p>
				</title>
				<url>http://homepages.ed.ac.uk/eang33/mcinstructions.html</url>
			</bibl>
			<bibl id="B52">
				<title>
					<p>DnaSP Software</p>
				</title>
				<url>http://www.ub.es/dnasp</url>
			</bibl>
			<bibl id="B53">
				<title>
					<p>Evolution of protein molecules.</p>
				</title>
				<aug>
					<au>
						<snm>Jukes</snm>
						<fnm>TH</fnm>
					</au>
					<au>
						<snm>Cantor</snm>
						<fnm>CR</fnm>
					</au>
				</aug>
				<source>Mammalian Protein Metabolism III</source>
				<publisher>New York: Academic Press</publisher>
				<editor>Munro HN</editor>
				<pubdate>1969</pubdate>
				<fpage>21</fpage>
				<lpage>132</lpage>
			</bibl>
			<bibl id="B54">
				<title>
					<p>Molecular evolution of the Metallothionein gene <it>Mtn </it>in the <it>melanogaster </it>species group: results from <it>Drosophila ananassae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Stephan</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Rodriguez</snm>
						<fnm>VS</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Parsch</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1994</pubdate>
				<volume>138</volume>
				<fpage>135</fpage>
				<lpage>143</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8001781</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B55">
				<aug>
					<au>
						<snm>Sokal</snm>
						<fnm>RR</fnm>
					</au>
					<au>
						<snm>Rohlf</snm>
						<fnm>FJ</fnm>
					</au>
				</aug>
				<source>Biometry</source>
				<publisher>San Francisco: WH Freeman</publisher>
				<pubdate>1995</pubdate>
			</bibl>
			<bibl id="B56">
				<title>
					<p>Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions.</p>
				</title>
				<aug>
					<au>
						<snm>Nei</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Gojobori</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>1986</pubdate>
				<volume>3</volume>
				<fpage>418</fpage>
				<lpage>426</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">3444411</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B57">
				<title>
					<p>Evolution of codon usage bias in <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Powell</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Moriyama</snm>
						<fnm>EN</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1997</pubdate>
				<volume>94</volume>
				<fpage>7784</fpage>
				<lpage>7790</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">33704</pubid>
						<pubid idtype="pmpid" link="fulltext">9223264</pubid>
						<pubid idtype="doi">10.1073/pnas.94.15.7784</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B58">
				<title>
					<p>Natural selection on synonymous sites is correlated with gene length and recombination in <it>Drosophila</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Comeron</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Kreitman</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Aguade</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1999</pubdate>
				<volume>151</volume>
				<fpage>239</fpage>
				<lpage>249</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9872963</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>Expression pattern and, surprisingly, gene length shape codon usage in <it>Caenorhabditis</it>, <it>Drosophila </it>and <it>Arabidopsis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Duret</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Mouchiroud</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1999</pubdate>
				<volume>96</volume>
				<fpage>4482</fpage>
				<lpage>4487</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">16358</pubid>
						<pubid idtype="pmpid" link="fulltext">10200288</pubid>
						<pubid idtype="doi">10.1073/pnas.96.8.4482</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
