<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-9-345</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Highly conserved gene order and numerous novel repetitive elements in genomic regions linked to wing pattern variation in <it>Heliconius </it>butterflies</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Papa</snm>
               <fnm>Riccardo</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <email>rpapa@uci.edu</email>
            </au>
            <au id="A2">
               <snm>Morrison</snm>
               <mi>M</mi>
               <fnm>Clayton</fnm>
               <insr iid="I4"/>
               <email>cm144293@bcm.tmc.edu</email>
            </au>
            <au id="A3">
               <snm>Walters</snm>
               <mi>R</mi>
               <fnm>James</fnm>
               <insr iid="I5"/>
               <email>jrw47@cornell.edu</email>
            </au>
            <au id="A4">
               <snm>Counterman</snm>
               <mi>A</mi>
               <fnm>Brian</fnm>
               <insr iid="I6"/>
               <email>bacounte@ncsu.edu</email>
            </au>
            <au id="A5">
               <snm>Chen</snm>
               <fnm>Rui</fnm>
               <insr iid="I4"/>
               <insr iid="I7"/>
               <insr iid="I8"/>
               <email>ruichen@bcm.tmc.edu</email>
            </au>
            <au id="A6">
               <snm>Halder</snm>
               <fnm>Georg</fnm>
               <insr iid="I4"/>
               <insr iid="I9"/>
               <email>ghalder@mdanderson.org</email>
            </au>
            <au id="A7">
               <snm>Ferguson</snm>
               <fnm>Laura</fnm>
               <insr iid="I10"/>
               <email>laura.roberts@sthildas-oxford.com</email>
            </au>
            <au id="A8">
               <snm>Chamberlain</snm>
               <fnm>Nicola</fnm>
               <insr iid="I11"/>
               <email>nchamberlain@cgr.harvard.edu</email>
            </au>
            <au id="A9">
               <snm>ffrench-Constant</snm>
               <fnm>Richard</fnm>
               <insr iid="I12"/>
               <email>r.ffrench-constant@exeter.ac.uk</email>
            </au>
            <au id="A10">
               <snm>Kapan</snm>
               <mi>D</mi>
               <fnm>Durrell</fnm>
               <insr iid="I13"/>
               <email>durrell@hawaii.edu</email>
            </au>
            <au id="A11">
               <snm>Jiggins</snm>
               <mi>D</mi>
               <fnm>Chris</fnm>
               <insr iid="I10"/>
               <email>c.jiggins@zoo.cam.ac.uk</email>
            </au>
            <au id="A12">
               <snm>Reed</snm>
               <mi>D</mi>
               <fnm>Robert</fnm>
               <insr iid="I2"/>
               <email>rreed@uci.edu</email>
            </au>
            <au id="A13">
               <snm>McMillan</snm>
               <mi>O</mi>
               <fnm>William</fnm>
               <insr iid="I1"/>
               <insr iid="I6"/>
               <email>womcmill@ncsu.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biology, University of Puerto Rico &#8211; Rio Piedras, San Juan, Puerto Rico, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, USA</p>
            </ins>
            <ins id="I3">
               <p>Department of Evolutionary and Functional Biology, University of Parma, Parma, Italy</p>
            </ins>
            <ins id="I4">
               <p>Program in Developmental Biology, Baylor College of Medicine, Houston, USA</p>
            </ins>
            <ins id="I5">
               <p>Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, USA</p>
            </ins>
            <ins id="I6">
               <p>Department of Genetics, North Carolina State University, Raleigh, USA</p>
            </ins>
            <ins id="I7">
               <p>Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, USA</p>
            </ins>
            <ins id="I8">
               <p>Human Genome Sequencing Center, Baylor College of Medicine, Houston, USA</p>
            </ins>
            <ins id="I9">
               <p>Department of Biochemistry and Molecular Biology, University of Texas &#8211; M.D. Anderson Cancer Center, Houston, USA</p>
            </ins>
            <ins id="I10">
               <p>Department of Zoology, University of Cambridge, Cambridge, UK</p>
            </ins>
            <ins id="I11">
               <p>FAS Center for Systems Biology, Harvard University, Cambridge, USA</p>
            </ins>
            <ins id="I12">
               <p>Centre for Ecology and Conservation, University of Exeter, Penryn, UK</p>
            </ins>
            <ins id="I13">
               <p>Center for Conservation Research and Training, University of Hawai'i at Manoa, Honolulu, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>345</fpage>
         <url>http://www.biomedcentral.com/1471-2164/9/345</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18647405</pubid>
               <pubid idtype="doi">10.1186/1471-2164-9-345</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>18</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>22</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>22</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Papa et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>With over 20 parapatric races differing in their warningly colored wing patterns, the butterfly <it>Heliconius erato </it>provides a fascinating example of an adaptive radiation. Together with matching races of its co-mimic <it>Heliconius melpomene</it>, <it>H. erato </it>also represents a textbook case of M&#252;llerian mimicry, a phenomenon where common warning signals are shared amongst noxious organisms. It is of great interest to identify the specific genes that control the mimetic wing patterns of <it>H. erato </it>and <it>H. melpomene</it>. To this end we have undertaken comparative mapping and targeted genomic sequencing in both species. This paper reports on a comparative analysis of genomic sequences linked to color pattern mimicry genes in <it>Heliconius</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Scoring AFLP polymorphisms in <it>H. erato </it>broods allowed us to survey loci at approximately 362 kb intervals across the genome. With this strategy we were able to identify markers tightly linked to two color pattern genes, <it>D </it>and <it>Cr</it>, which were then used to screen <it>H. erato </it>BAC libraries in order to identify clones for sequencing. Gene density across 600 kb of BAC sequences appeared relatively low, although the number of predicted open reading frames was typical for an insect. We focused analyses on the <it>D- </it>and <it>Cr</it>-linked <it>H. erato </it>BAC sequences and on the <it>Yb</it>-linked <it>H. melpomene </it>BAC sequence. A comparative analysis between homologous regions of <it>H. erato </it>(<it>Cr</it>-linked BAC) and <it>H. melpomene </it>(<it>Yb</it>-linked BAC) revealed high levels of sequence conservation and microsynteny between the two species. We found that repeated elements constitute 26% and 20% of BAC sequences from <it>H. erato </it>and <it>H. melpomene</it>, respectively. The majority of these repetitive sequences appear to be novel, as they showed no significant similarity to any other available insect sequences. We also observed signs of fine-scale conservation of gene order between <it>Heliconius </it>and the moth <it>Bombyx mori</it>, suggesting that lepidopteran genome architecture may be conserved over very long evolutionary time scales.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Here we have demonstrated the tractability of progressing from a genetic linkage map to genomic sequence data in <it>Heliconius </it>butterflies. We have also shown that fine-scale gene order is highly conserved between distantly related <it>Heliconius </it>species, and also between <it>Heliconius </it>and <it>B. mori</it>. Together, these findings suggest that genome structure in macrolepidoptera might be very conserved, and show that mapping and positional cloning efforts in different lepidopteran species can be reciprocally informative.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Among emerging evolutionary and ecological model organisms, the passion-vine butterfly genus <it>Heliconius </it>(Nymphalidae: Heliconiinae) offers particularly exciting possibilities for integrative research into the genetic and developmental basis of adaptive variation <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The genus, composed of around 40 species with hundreds of geographic variants, couples color pattern divergence with multiple cases of mimicry-related convergent evolution <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The wing color patterns of <it>Heliconius </it>are adaptations that warn potential predators of the butterflies' unpalatability <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and also play an important role in speciation <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Nearly all <it>Heliconius </it>species participate in local M&#252;llerian mimicry associations and, in any one area, the wing color patterns of different aposematic butterfly species converge into a handful (usually six or fewer) of clearly differentiated mimetic assemblages <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. The color patterns characterizing many of these mimicry rings often change dramatically every few hundred kilometers. This pattern of convergent and divergent evolution in <it>Heliconius </it>is best exemplified by the mimetic relationship between <it>H. erato </it>and <it>H. melpomene</it>. The two species are distantly related within the genus and never hybridize <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>, yet, where they co-occur, local races possess nearly identical wing patterns and have undergone parallel and congruent radiations into over 20 geographic races <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>The multiple radiations of mimetic color patterns, particularly the parallel radiations of <it>H. erato </it>and <it>H. melpomene</it>, provide "natural experiments" for comparative studies into the genetic and developmental basis of adaptive change. In this paper, we describe a simple strategy that integrates growing genomic resources in <it>Heliconius </it>to identify regions of the genome near the loci that modulate wing pattern variation in <it>H. erato</it>. Our strategy relies on the fact that large phenotypic differences within species are caused by a handful of major effect loci <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> and that crosses can be designed that allow researchers to unambiguously follow the segregation of alleles at these loci <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. By scanning through thousands of AFLP polymorphisms in these crosses we can identify markers tightly associated with particular color pattern genes. These markers are then used to probe newly available Bacterial Artificial Chromosome (BAC) libraries and allow us to obtain large sections of genomic sequence around color pattern genes. These targeted genomic sequences provide the first insights into the architecture of the <it>H. erato </it>genome including details on gene density, repeat structure and, with sequence information from homologous regions of the <it>H. melpomene </it>genome, the preservation of fine-scale gene order between the two co-mimics. These data facilitate comparative mapping work on the genetic basis of color pattern variation and convergence in <it>Heliconius</it>, including efforts to positionally clone the color pattern genes themselves. These data also provide some of the first information on patterns of microsynteny in lepidopteran genomes, complementing recent work showing marked patterns of synteny conservation at a macro scale between <it>H. melpomene </it>and the silk moth <it>Bombyx mori </it><abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>We are focusing our research efforts on two major color pattern loci, <it>D </it>and <it>Cr</it>, which underlie much of the observed pattern variation in <it>H. erato</it>. The two genes are unlinked, and alleles at the different loci interact to cause phenotypic shifts across large areas of the wing surface, changing the position, size and shape of red/orange/yellow and melanic patches on both the dorsal and ventral surfaces of the forewings and hindwings. Alleles at the <it>D </it>locus primarily act by switching scale color between black (melanin) and red/orange (ommochrome pigments) <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. In contrast, alleles at <it>Cr </it>control the positioning of melanin across both the forewing and hindwing, thereby either exposing or covering underlying white and yellow pattern elements (Figure <figr fid="F1">1</figr>). The two loci strongly interact to control the size, shape, and position of both the forewing band and hindwing bar of many races of <it>H. erato </it><abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Cross design and wing phenotypes</p>
            </caption>
            <text>
               <p><b>Cross design and wing phenotypes</b>. Color pattern phenotypes observed in crosses between 'grand-parental' <it>H. himera </it>(middle) and <it>H. erato cyrbia </it>(left) and <it>H. erato notabilis </it>(right) resulting in two pairs of F1 parents with females on left, males on right. Each pair of F1 parents produced F2s with <it>H. himera </it>&#215; <it>H. erato cyrbia </it>offspring (A) and the <it>H. himera </it>&#215; <it>H. erato notabilis </it>offspring (B). In both F2 families, the observed phenotypic differences among individuals are consistent with the interaction of two co-dominant loci, as described in the Methods.</p>
            </text>
            <graphic file="1471-2164-9-345-1"/>
         </fig>
         <p>Crossing experiments among the various races of <it>H. erato </it>and <it>H. melpomene </it>have shown that the genetic basis of the color pattern radiations is similar in these species <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. In both, a small number of major effect loci, or complexes of tightly linked loci, modulate much of the intraspecific pattern variation. Furthermore, the phenotypic effects of many of the major patterning genes are often quite similar between the two species <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. For example, <it>Cr </it>in <it>H. erato </it>and the <it>N</it>/<it>Yb</it>/<it>Sb </it>complex in <it>H. melpomene </it>control most of the variation in yellow and white pattern elements in different mimic races of the two species <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B17">17</abbr></abbrgrp>. Similarly, variation in the major red pattern elements on the forewing and hindwing of <it>H. erato </it>and <it>H. melpomene </it>can be explained by variation at an unlinked gene, <it>D</it>, in <it>H. erato </it>and the similarly named <it>D/B </it>complex in <it>H. melpomene</it>. In contrast, in <it>H. melpomene </it>these switch genes represent clusters of tightly linked elements separated by one cM or less <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B18">18</abbr></abbrgrp>. Comparative mapping experiments have shown that the <it>Yb </it>complex in <it>H. melpomene </it>and the <it>Cr </it>locus in <it>H. erato</it>, which have analogous phenotypic effects, map to homologous regions of their respective genomes <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
         <p>There were three primary goals for the study presented here. First, we sought to identify molecular markers linked to the <it>H. erato </it>color pattern genes <it>D </it>and <it>Cr</it>. Second, we used some of these molecular markers to identify and sequence BAC clones containing genomic sequences linked to these color pattern genes. Lastly, we analyzed selected BAC sequences in order to better understand fine-scale characteristics of the <it>H. erato </it>genome and to make comparisons with homologous genomic sequences in <it>H. melpomene </it>and <it>B. mori</it>. Ultimately we found that synteny is highly conserved between <it>Heliconius </it>species, and even between <it>Heliconius </it>and <it>B. mori</it>. We also observed relatively low gene density coupled with a high frequency of novel repeat elements in the <it>Heliconius </it>genomic sequences. Together, our data show that comparative genomic analysis between lepidopterans is highly tractable, and that positional cloning of genes underlying color pattern variation in <it>Heliconius </it>should be possible using standard methods.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Identification of markers tightly linked to color pattern genes</p>
            </st>
            <p>We examined 1440 AFLP <it>H. erato </it>polymorphisms using 23 primer combinations (<it>Eco</it>CN/<it>Mse</it>CNN). The number of AFLP bands per gel ranged between 26 and 132 with a mean of 72 bands per primer combination. Of these, approximately 84% were polymorphic in our outbred F2 cross. The experiment-wide error rate for our screen was approximately 1.0%, as inferred from discrepancies among female informative (FI) markers. In total, we scored 490 male informative (MI) and 470 backcross informative (BI) loci. Given an estimated <it>H. erato </it>genome size of 395 Mb <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, and assuming that AFLP markers are distributed randomly, we estimate that we surveyed polymorphisms at approximately 362 kb intervals across the genome. This corresponds to a resolution of approximately 1.3 cM, assuming that the relationship between physical and recombination distance is 276 kb/cM <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.</p>
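            <p>The conversion between marker spacing and genetic resolution quoted above can be checked with a short calculation. The following is only an illustrative sketch: the function names are hypothetical, the three constants are taken directly from the text, and the marker count it prints is merely the number implied by the 362 kb spacing under an even-spacing assumption, not a figure reported in this study.</p>
            <p>
```python
# Back-of-envelope check of the marker-spacing figures quoted in the text.
# All three constants come from the paragraph above; nothing else is assumed
# except that markers are treated as evenly spaced across the genome.
GENOME_SIZE_KB = 395_000   # estimated H. erato genome size (395 Mb)
MARKER_SPACING_KB = 362    # approximate interval between surveyed loci
KB_PER_CM = 276            # physical distance per unit of recombination


def spacing_in_cm(spacing_kb, kb_per_cm):
    """Convert a physical interval (kb) into a genetic distance (cM)."""
    return spacing_kb / kb_per_cm


def implied_marker_count(genome_kb, spacing_kb):
    """Number of evenly spaced markers implied by an average interval."""
    return genome_kb / spacing_kb


resolution_cm = spacing_in_cm(MARKER_SPACING_KB, KB_PER_CM)
n_markers = implied_marker_count(GENOME_SIZE_KB, MARKER_SPACING_KB)

print(f"resolution: {resolution_cm:.1f} cM")
print(f"implied number of markers: {n_markers:.0f}")
```
            </p>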
            <p>Our genome scan identified several AFLP markers 1&#8211;3 cM away from <it>D</it>. For the other gene, <it>Cr</it>, previous work using an identical strategy on crosses of <it>H. melpomene </it>provided markers within one cM of this gene in <it>H. erato </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. In total, we identified five AFLP loci within a 3 cM target window around the <it>D </it>locus. Across our <it>H. himera </it>&#215; <it>H. erato notabilis </it>mapping family for <it>D</it>, three loci (MI_EcoCA-MseCAA-114 bp, BI_EcoCC-MseCAG-155 bp, BI_EcoCT-MseCCG-139 bp) were perfectly linked and two loci (MI_EcoCC-MseCAC-527 bp, MI_EcoCC-MseCAC-485 bp) showed only one recombinant. We cloned and sequenced several of these AFLP loci and developed PCR primers to amplify them from genomic DNA. Interestingly, two AFLP bands tightly linked to the <it>D </it>gene, MI_EcoCC-MseCAC-527 bp and MI_EcoCC-MseCAC-485 bp, were allelic variants of the same locus.</p>
         </sec>
         <sec>
            <st>
               <p>Identification of BAC clones containing color pattern-linked AFLPs</p>
            </st>
            <p>We screened the <it>H. erato </it>BAC library with <it>D</it>-linked (MI_EcoCC-MseCAC-527 bp) and <it>Cr</it>-linked ("<it>&#946;ggt-II" </it>&#8211; <it>Rab geranylgeranyl transferase beta subunit, &#946;ggt-II </it>gene <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>) probes. In our screens with these and other probes (15 probes in total, with an average of 12 positives per probe), we consistently observed between 9 and 15 PCR-confirmed positives per probe, suggesting the <it>H. erato </it>BAC libraries have approximately 10&#215; genome coverage. The largest clones identified from the <it>D</it>-linked and <it>Cr</it>-linked probing experiments were sequenced at 8&#215; coverage. The <it>D</it>-linked clone (BBAM-25K4) was composed of two large sequences that could easily be orientated to produce an approximately 180 kb genomic fragment (Figure <figr fid="F2">2</figr>). Similarly, sequence of the <it>Cr</it>-linked clone (BBAM-38A20) was composed of two large sequences that together spanned approximately 165 kb (Figure <figr fid="F3">3</figr>). The probe sequences were clearly identifiable in the <it>D</it>-linked and <it>Cr</it>-linked BACs and linkage to color pattern genes was confirmed by mapping (see below).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p><it>H. erato </it>BAC sequence (25_K04) annotation</p>
               </caption>
               <text>
                  <p><b><it>H. erato </it>BAC sequence (25_K04) annotation</b>. Annotation of the BAC sequence (clone BBAM-25K4, accession number AC216670) tightly linked to the <it>D </it>color pattern gene. Starting from the right: A) linkage analysis of a 13.5 cM interval of LG18, including the gene that controls the red pigment (D locus); B) fingerprinting of the positive clones obtained by probing the AFLP CC-CAC-491 (dotted bar); C) sequence analysis of the BAC clone 25_K04, where black circles represent hypothetical ORFs greater than 60 amino acids, with the larger circles representing putative ORFs greater than 150 amino acids. Within each bar, the grey areas indicate repetitive sequence and the black regions indicate the exon/intron structure of 2 predicted proteins (with arrows indicating direction) showing high similarity to known proteins in other arthropods. For gene annotations see Table 2.</p>
               </text>
               <graphic file="1471-2164-9-345-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Fine-scale synteny and sequence conservation between <it>H. erato </it>and <it>H. melpomene</it></p>
               </caption>
               <text>
                  <p><b>Fine-scale synteny and sequence conservation between <it>H. erato </it>and <it>H. melpomene</it></b>. Top bar represents approximately 180 kb of sequence covering two BACs (clone AEHM-41C10 accession number CR974474; and clone AEHM-7G12 accession number CT955980) for <it>H. melpomene </it>and the bottom bar represents 180 kb of sequence in two large contigs for <it>H. erato </it>(clone BBAM-38A20, accession number AC193804). Black circles below bars represent hypothetical open reading frames (ORFs) greater than 60 amino acids, with the larger circles representing putative ORFs greater than 150 amino acids. Within each bar, the grey areas indicate repetitive sequence and the black regions indicate the exon/intron structure of 13 predicted proteins (with arrows above the bar indicating direction) showing high similarity to known proteins in other arthropods. A visual representation of the global alignment between the two genomic sequences and the level of synteny is shown at the bottom of the figure. The lines between the two sequences unite regions with high sequence identity (>85% similarity). For gene annotations see Table 2.</p>
               </text>
               <graphic file="1471-2164-9-345-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Chromosome walk in the <it>Cr </it>region</p>
            </st>
            <p>From the sequence of the first <it>Cr</it>-linked BAC, identified with the <it>&#946;ggt-II </it>gene, we designed additional probes to use for a second round of BAC library screening. Specifically, we generated two more probes, corresponding to the genes <it>Trehalase1 </it>and <it>B9</it>, to expand our walk in both directions. With this strategy we identified new BACs on the 3' end that were positive for <it>B9 </it>and negative for <it>Trehalase1</it>, and others on the 5' end that were positive for <it>&#946;ggt-II </it>and negative for <it>Trehalase1</it>. After fingerprinting we selected one BAC to sequence from each end. Ultimately, BBAM-27D18 extended the overall contig by 211 kb on the 3' end, while BBAM-12K4 extended the contig by 55 kb on the 5' end (Figure <figr fid="F4">4</figr>).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Gene conservation between <it>H. erato </it>and <it>H. melpomene </it>contigs</p>
               </caption>
               <text>
                  <p><b>Gene conservation between <it>H. erato </it>and <it>H. melpomene </it>contigs</b>. Comparison of homologous genomic regions linked to the <it>Cr </it>color pattern gene in <it>H. erato </it>and the <it>Yb </it>color pattern gene in <it>H. melpomene</it>. The top bar represents approximately 280 kb of sequence covering three BACs (clone AEHM-41C10 accession number CR974474, clone AEHM-7G12 accession number CT955980, and clone AEHM-11J7 accession number CU367882) for <it>H. melpomene</it>. The bottom bar represents approximately 350 kb of sequence covering three BACs (BBAM-12K4 accession number AC220074, BBAM-38A20 accession number AC193804, and BBAM-27D18 accession number AC199750) for <it>H. erato</it>. Genes showing high similarity to known proteins in other arthropods are represented with black squares and arrows indicating orientation. The lines between the two sequences unite the homologous genes, giving a visual representation of the overall synteny between the two species.</p>
               </text>
               <graphic file="1471-2164-9-345-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>High frequency of novel repetitive elements in the <it>Heliconius </it>genome</p>
            </st>
            <p>The <it>D- </it>and <it>Cr</it>-linked BAC sequences were AT-rich (65% and 66% AT content, respectively) and contained many repeats (Table <tblr tid="T1">1</tblr>). Although all three major classes of transposable elements (DNA transposons, LTR, and non-LTR retrotransposons) were present in the genomic sequences (Table <tblr tid="T1">1A</tblr>), the vast majority of repetitive sequences showed no significant BLAST similarity (<it>i.e</it>. no hits with e-value &lt; 0.001) to any of the insect genomes currently available in NCBI databases, nor to any arthropod transposable elements listed in RepBase <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. <it>Heliconius</it>-specific repetitive sequences corresponded to the nine core motifs identified with RepeatFinder (Table <tblr tid="T1">1B</tblr>; Additional file <supplr sid="S1">1</supplr>: Novel repetitive elements in <it>Heliconius </it>(raw sequences)). Of the nine motifs, six are present in both <it>H. erato </it>and <it>H. melpomene</it>, two are unique to <it>H. erato</it>, and one is unique to <it>H. melpomene </it>(Table <tblr tid="T1">1</tblr>; Additional file <supplr sid="S1">1</supplr>: Novel repetitive elements in <it>Heliconius </it>(raw sequences)).</p>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p>Sequences of the <it>Heliconius </it>novel repetitive elements. Core Motif sequences of the nine novel <it>Heliconius </it>repetitive elements identified with RepeatFinder in BAC sequences from <it>H. erato </it>(accession numbers: AC193804, AC216670) and <it>H. melpomene </it>(accession numbers: CR974474, CT955980).</p>
               </text>
               <file name="1471-2164-9-345-S1.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>(A, B) &#8211; <it>Heliconius </it>repetitive elements.</p>
               </caption>
               <tblbdy cols="19">
                  <r>
                     <c cspan="19" ca="left">
                        <p>A.</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Locus name (origin)</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>MARINER_HC</p>
                        <p>(<it>Hyalophora cecropia</it>)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>MARI_BM</p>
                        <p>(<it>Bombyx mori</it>)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>GYPSY70-I_AG</p>
                        <p>(<it>Anopheles gambiae</it>)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Sake_BM</p>
                        <p>(<it>Bombyx mori</it>)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>DOC5_DM</p>
                        <p>(<it>D. melanogaster</it>)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>G4_DM</p>
                        <p>(<it>D. melanogaster</it>)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>ZENON_BM</p>
                        <p>(<it>Bombyx mori</it>)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Element Family</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Mariner</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Mariner</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Gypsy</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Daphne</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Jockey</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Jockey</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>CR1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Element type</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>DNA transposon</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>DNA transposon</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>LTR retroposon</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>LTR retroposon</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Non-LTR retroposon</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Non-LTR retroposon</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Non-LTR retroposon</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Described length</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>1255</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>1310</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>4858</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>5140</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>2791</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>3856</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>2599</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="15">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p># regions masked</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean length of masked region (Std.Dev.)</p>
                     </c>
                     <c ca="center">
                        <p>85.33 (51.03)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>159.33 (28.11)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2062</p>
                     </c>
                     <c ca="center">
                        <p>253 (173.95)</p>
                     </c>
                     <c ca="center">
                        <p>73</p>
                     </c>
                     <c ca="center">
                        <p>440 (275.77)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1243</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>520.8 (618.74)</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Minimum length of masked region</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>127</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>245</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>94</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Maximum length of masked region</p>
                     </c>
                     <c ca="center">
                        <p>143</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>178</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>143</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>635</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1574</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total nucleotides masked</p>
                     </c>
                     <c ca="center">
                        <p>256</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>478</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2062</p>
                     </c>
                     <c ca="center">
                        <p>256</p>
                     </c>
                     <c ca="center">
                        <p>73</p>
                     </c>
                     <c ca="center">
                        <p>880</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1243</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2604</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Proportion of BACs masked</p>
                     </c>
                     <c ca="center">
                        <p>0.07%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>0.14%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1.05%</p>
                     </c>
                     <c ca="center">
                        <p>0.07%</p>
                     </c>
                     <c ca="center">
                        <p>0.04%</p>
                     </c>
                     <c ca="center">
                        <p>0.25%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>0.35%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>0.74%</p>
                     </c>
                     <c ca="center">
                        <p>0.02%</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="15">
                        <hr/>
                     </c>
                     <c cspan="4">
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="19" ca="left">
                        <p>B.</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="19">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Core Motif</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>1</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>2</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>3</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>4</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>5</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>6</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>7</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>8</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>9</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                     <c ca="center">
                        <p>era</p>
                     </c>
                     <c ca="center">
                        <p>mel</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="19">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Length of core motif</p>
                     </c>
                     <c ca="center">
                        <p>469</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>460</p>
                     </c>
                     <c ca="center">
                        <p>598</p>
                     </c>
                     <c ca="center">
                        <p>428</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>384</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>314</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>259</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>266</p>
                     </c>
                     <c ca="center">
                        <p>NA*</p>
                     </c>
                     <c ca="center">
                        <p>147</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>306</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p># regions masked</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>310</p>
                     </c>
                     <c ca="center">
                        <p>129</p>
                     </c>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean length of masked region (Std.Dev.)</p>
                     </c>
                     <c ca="center">
                        <p>225 (121.71)</p>
                     </c>
                     <c ca="center">
                        <p>70.5 (16.26)</p>
                     </c>
                     <c ca="center">
                        <p>415 (118.3)</p>
                     </c>
                     <c ca="center">
                        <p>406.27 (302.12)</p>
                     </c>
                     <c ca="center">
                        <p>166.69 (127.26)</p>
                     </c>
                     <c ca="center">
                        <p>119 (NA)</p>
                     </c>
                     <c ca="center">
                        <p>341.8 (100.7)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>189.36 (90.64)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>125.59 (92.66)</p>
                     </c>
                     <c ca="center">
                        <p>76.75 (38.87)</p>
                     </c>
                     <c ca="center">
                        <p>155.25 (70.85)</p>
                     </c>
                     <c ca="center">
                        <p>129.95 (64.26)</p>
                     </c>
                     <c ca="center">
                        <p>89.1 (27.99)</p>
                     </c>
                     <c ca="center">
                        <p>80.11 (21.98)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>244.1 (96.74)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Minimum length of masked region</p>
                     </c>
                     <c ca="center">
                        <p>69</p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>147</p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>37</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>162</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="center">
                        <p>55</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>40</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>55</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Maximum length of masked region</p>
                     </c>
                     <c ca="center">
                        <p>439</p>
                     </c>
                     <c ca="center">
                        <p>82</p>
                     </c>
                     <c ca="center">
                        <p>470</p>
                     </c>
                     <c ca="center">
                        <p>740</p>
                     </c>
                     <c ca="center">
                        <p>449</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>394</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>314</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>272</p>
                     </c>
                     <c ca="center">
                        <p>135</p>
                     </c>
                     <c ca="center">
                        <p>287</p>
                     </c>
                     <c ca="center">
                        <p>280</p>
                     </c>
                     <c ca="center">
                        <p>148</p>
                     </c>
                     <c ca="center">
                        <p>116</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>314</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total nucleotide masked</p>
                     </c>
                     <c ca="center">
                        <p>3603</p>
                     </c>
                     <c ca="center">
                        <p>141</p>
                     </c>
                     <c ca="center">
                        <p>2905</p>
                     </c>
                     <c ca="center">
                        <p>4469</p>
                     </c>
                     <c ca="center">
                        <p>4834</p>
                     </c>
                     <c ca="center">
                        <p>119</p>
                     </c>
                     <c ca="center">
                        <p>1709</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2083</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2135</p>
                     </c>
                     <c ca="center">
                        <p>307</p>
                     </c>
                     <c ca="center">
                        <p>48126</p>
                     </c>
                     <c ca="center">
                        <p>16763</p>
                     </c>
                     <c ca="center">
                        <p>3564</p>
                     </c>
                     <c ca="center">
                        <p>721</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2441</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Amount of BACs masked</p>
                     </c>
                     <c ca="center">
                        <p>1.0%</p>
                     </c>
                     <c ca="center">
                        <p>0.1%</p>
                     </c>
                     <c ca="center">
                        <p>0.8%</p>
                     </c>
                     <c ca="center">
                        <p>2.3%</p>
                     </c>
                     <c ca="center">
                        <p>1.4%</p>
                     </c>
                     <c ca="center">
                        <p>0.1%</p>
                     </c>
                     <c ca="center">
                        <p>0.5%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>0.6%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>0.6%</p>
                     </c>
                     <c ca="center">
                        <p>0.2%</p>
                     </c>
                     <c ca="center">
                        <p>13.6%</p>
                     </c>
                     <c ca="center">
                        <p>8.5%</p>
                     </c>
                     <c ca="center">
                        <p>1.0%</p>
                     </c>
                     <c ca="center">
                        <p>0.4%</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>1.2%</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Summary of identification and repeat masking of repetitive elements in BAC sequences from <it>H. erato </it>(accession numbers: AC193804, AC216670) and <it>H. melpomene </it>(accession numbers: CR974474, CT955980). A) Repetitive elements identified via similarity to previously described sequences. B) Novel repetitive elements unique to <it>Heliconius </it>identified <it>de novo </it>using the RepeatMasker software (additional file <supplr sid="S1">1</supplr>: Novel repetitive elements in <it>Heliconius </it>(row sequences)). Each <it>core motif </it>is a summary sequence representing a different and unique set of related repeat sequences. Motifs four and five were not observed in <it>H. melpomene</it>; motif nine was not observed in <it>H. erato</it>. Core motifs 1, 2, 3, 4, 5, 6, 7 and 8 were identified by RepeatFinder only in <it>H. erato </it>but masked by RepeatMasker in <it>H. melpomene</it>. (* RepeatFinder identified fragments of this motif in <it>H. melpomene</it>, but the groups did not overlap to give a core motif comparable to that in <it>H. erato</it>).</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Gene density in <it>Heliconius </it>BAC sequences</p>
            </st>
            <p>Gene density appeared to be relatively low across both the <it>Cr- </it>and the <it>D</it>-linked genomic regions. Although there was a moderate number of predicted open reading frames (ORFs) over 60 amino acids long (Figures <figr fid="F2">2</figr> and <figr fid="F3">3</figr>), few showed any similarity to known proteins or lepidopteran ESTs, including our own collection of nearly 20,000 <it>Heliconius </it>ESTs (Table <tblr tid="T2">2</tblr>). For example, across the <it>D</it>-linked BAC we identified 75 hypothetical proteins using the Kaikogaas annotation tool and our own BLAST analysis. Over 90% of these, however, were shorter than 150 amino acids, and only one of those showed similarity to any known or predicted protein. Across the entire ~190 kb region near the <it>D </it>locus in <it>H. erato </it>there were only two hypothetical proteins that showed significant homology to a known protein or contained a known structural element. One was similar to a sequence in our EST collection (HEC00815), while the other showed strong homology to a lepidopteran methionine-rich larval storage protein (Table <tblr tid="T2">2</tblr>). Although the absolute number of predicted proteins was smaller across the <it>Cr</it>-linked BAC, more than twice as many were longer than 150 amino acids, and all but one of these showed strong homology to known proteins or to <it>Heliconius </it>ESTs (Table <tblr tid="T2">2</tblr>, Figure <figr fid="F2">2</figr>). Overall, the gene density we observed in <it>H. erato </it>is similar to that observed in the repeat-rich heterochromatin domains of <it>Drosophila melanogaster</it>, which average 2.9 genes per 100 kb (versus 12.6 genes per 100 kb in euchromatin) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>BAC annotation summary.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>Predicted gene</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>BAC Accession number</p>
                     </c>
                     <c ca="center">
                        <p>Organism</p>
                     </c>
                     <c ca="center">
                        <p>Identities</p>
                     </c>
                     <c ca="center">
                        <p>E value</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                     <c cspan="3">
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>H. erato</p>
                     </c>
                     <c ca="center">
                        <p>H. melpomene</p>
                     </c>
                     <c ca="center">
                        <p>B. mori</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Putative reverse transcriptase (RVT)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2529(818322&#8211;834985;7e-13)*</p>
                     </c>
                     <c ca="center">
                        <p>Medicago truncatula</p>
                     </c>
                     <c ca="center">
                        <p>48%</p>
                     </c>
                     <c ca="center">
                        <p>e-140</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>HEC01402 (DT665615)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2795(2232547&#8211;2233374;9e-26)*</p>
                     </c>
                     <c ca="center">
                        <p>Heliconius erato</p>
                     </c>
                     <c ca="center">
                        <p>92%</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Putative reverse transcriptase (RVT)</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2136(919260&#8211;923481;1e-10)*</p>
                     </c>
                     <c ca="center">
                        <p>Aedes aegypti</p>
                     </c>
                     <c ca="center">
                        <p>29%</p>
                     </c>
                     <c ca="center">
                        <p>5e-76</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Galactokinase</p>
                     </c>
                     <c ca="center">
                        <p>AC193804/AC220074</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>2829(2954764&#8211;2958794;5e-43)*</p>
                     </c>
                     <c ca="center">
                        <p>Xenopus laevis</p>
                     </c>
                     <c ca="center">
                        <p>42%</p>
                     </c>
                     <c ca="center">
                        <p>1e-61</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Rab geranylgeranyl transferase b subunit (&#946;ggt-II)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804/AC220074</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2829(2933018&#8211;2938173;e-129)*</p>
                     </c>
                     <c ca="center">
                        <p>Danio rerio</p>
                     </c>
                     <c ca="center">
                        <p>68%</p>
                     </c>
                     <c ca="center">
                        <p>1e-98</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Glucose dehydrogenase (GDeh)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804/AC220074</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2829(2929161&#8211;2931568;e-111)*</p>
                     </c>
                     <c ca="center">
                        <p>Aedes aegypti</p>
                     </c>
                     <c ca="center">
                        <p>36%</p>
                     </c>
                     <c ca="center">
                        <p>3e-91</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Forkhead box J1 (F-head)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804/AC220074</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2829(2921212&#8211;2923967;e-123)*</p>
                     </c>
                     <c ca="center">
                        <p>Tribolium castaneum</p>
                     </c>
                     <c ca="center">
                        <p>43%</p>
                     </c>
                     <c ca="center">
                        <p>3e-57</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Acylamino-acid-releasing enzyme (AARE)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804/AC220074</p>
                     </c>
                     <c ca="center">
                        <p>CR974474</p>
                     </c>
                     <c ca="center">
                        <p>2829(2895678&#8211;2909890;e-125)*</p>
                     </c>
                     <c ca="center">
                        <p>Tribolium castaneum</p>
                     </c>
                     <c ca="center">
                        <p>39%</p>
                     </c>
                     <c ca="center">
                        <p>2e-84</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Putative reverse transcriptase (RVT)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>3058(6735155&#8211;6740613;e-100)*</p>
                     </c>
                     <c ca="center">
                        <p>Drosophila simulans</p>
                     </c>
                     <c ca="center">
                        <p>42%</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Acylamino-acid-releasing enzyme (AARE)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804</p>
                     </c>
                     <c ca="center">
                        <p>CR974474/CT955980</p>
                     </c>
                     <c ca="center">
                        <p>2829(2910836&#8211;2912522;3e-65)*</p>
                     </c>
                     <c ca="center">
                        <p>Tribolium castaneum</p>
                     </c>
                     <c ca="center">
                        <p>50%</p>
                     </c>
                     <c ca="center">
                        <p>2e-48</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Trehalase (Treh1)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804</p>
                     </c>
                     <c ca="center">
                        <p>CT955980</p>
                     </c>
                     <c ca="center">
                        <p>2829(2866386&#8211;2868122;0.0)*</p>
                     </c>
                     <c ca="center">
                        <p>Bombyx mori</p>
                     </c>
                     <c ca="center">
                        <p>50%</p>
                     </c>
                     <c ca="center">
                        <p>e-158</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Trehalase (Treh2)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804</p>
                     </c>
                     <c ca="center">
                        <p>CT955980</p>
                     </c>
                     <c ca="center">
                        <p>2829(2861182&#8211;2862921;0.0)*</p>
                     </c>
                     <c ca="center">
                        <p>Bombyx mori</p>
                     </c>
                     <c ca="center">
                        <p>60%</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>HEC03006 (DT668569)</p>
                     </c>
                     <c ca="center">
                        <p>AC193804</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>3058(4660174&#8211;4667549;2e-08)</p>
                     </c>
                     <c ca="center">
                        <p>Heliconius erato</p>
                     </c>
                     <c ca="center">
                        <p>96%</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Methionine-rich storage protein 2 (MRSP)</p>
                     </c>
                     <c ca="center">
                        <p>AC216670</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>3026(2396448&#8211;2401216;0.0)*</p>
                     </c>
                     <c ca="center">
                        <p>Manduca sexta</p>
                     </c>
                     <c ca="center">
                        <p>66%</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>HEC00815 (DT661873)</p>
                     </c>
                     <c ca="center">
                        <p>AC216670</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>NA</p>
                     </c>
                     <c ca="center">
                        <p>Heliconius erato</p>
                     </c>
                     <c ca="center">
                        <p>96%</p>
                     </c>
                     <c ca="center">
                        <p>0.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B9</p>
                     </c>
                     <c ca="center">
                        <p>AC199750</p>
                     </c>
                     <c ca="center">
                        <p>CT955980</p>
                     </c>
                     <c ca="center">
                        <p>2829(2830788&#8211;2833491;3e-79)*</p>
                     </c>
                     <c ca="center">
                        <p>Danio rerio</p>
                     </c>
                     <c ca="center">
                        <p>39%</p>
                     </c>
                     <c ca="center">
                        <p>3e-33</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Unkempt (Unk)</p>
                     </c>
                     <c ca="center">
                        <p>AC199750</p>
                     </c>
                     <c ca="center">
                        <p>CU367882</p>
                     </c>
                     <c ca="center">
                        <p>2829(2748736&#8211;2755694;e-110)*</p>
                     </c>
                     <c ca="center">
                        <p>Aedes aegypti</p>
                     </c>
                     <c ca="center">
                        <p>76%</p>
                     </c>
                     <c ca="center">
                        <p>e-102</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Putative DNA helicase recQ (Helic)</p>
                     </c>
                     <c ca="center">
                        <p>AC199750</p>
                     </c>
                     <c ca="center">
                        <p>CU367882</p>
                     </c>
                     <c ca="center">
                        <p>2829(2870771&#8211;2872660;0.0)*</p>
                     </c>
                     <c ca="center">
                        <p>Apis mellifera</p>
                     </c>
                     <c ca="center">
                        <p>51%</p>
                     </c>
                     <c ca="center">
                        <p>e-173</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Beta fructosidase FruA (&#946;fruct)</p>
                     </c>
                     <c ca="center">
                        <p>AC199750</p>
                     </c>
                     <c ca="center">
                        <p>CU367882</p>
                     </c>
                     <c ca="center">
                        <p>2829(2704806&#8211;2706326;e-176)*</p>
                     </c>
                     <c ca="center">
                        <p>Apis mellifera</p>
                     </c>
                     <c ca="center">
                        <p>36%</p>
                     </c>
                     <c ca="center">
                        <p>7e-74</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Annotation summary for four <it>H. erato </it>BACs (accession numbers: AC193804, AC216670, AC220074, AC199750). Predicted gene function and BLASTp data (species hit, % identity, and E value) are reported, as are accession numbers of the <it>H. erato, H. melpomene</it>, and <it>B. mori </it>genomic sequences containing the predicted genes. (* scaffold ID, position and E value as seen in the silkworm genome database: SilkDB)</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Fine-scale microsynteny between <it>H. erato </it>and <it>H. melpomene </it>genomic sequences</p>
            </st>
            <p>VISTA analysis <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> (70% identity, 30 bp window &#8211; see Methods) showed 35% conservation between the 164 kb <it>H. erato Cr</it>-linked BAC clone (BBAM-38A20) and the homologous sequence from <it>H. melpomene </it>(Figure <figr fid="F3">3</figr>). All of the predicted genes showed strong similarity (80&#8211;95% identity) between the two species and perfect overall synteny (Figures <figr fid="F3">3</figr> and <figr fid="F4">4</figr>). Furthermore, a significant portion of the 57 kb of presumed non-coding sequence (<it>i.e</it>. sequence that did not show notable open reading frames) was highly conserved between the two species (Figure <figr fid="F3">3</figr>). Despite the overall conservation between <it>H. erato </it>and <it>H. melpomene </it>sequences, two ESTs did differ between the species. First, HEC03006 was found only in <it>H. erato</it>, and corresponded to a large indel sequence. Second, HEC01402 was found in the <it>H. melpomene </it>BAC sequence but not in the <it>H. erato </it>sequence, although it shared some similarity with an exon of the <it>Forkhead </it>gene in the <it>H. erato </it>BAC sequence. For HEC01402, it is likely that the <it>H. erato </it>BAC sequence did not extend far enough to cover the homologous genomic region containing the gene in <it>H. melpomene</it>.</p>
         </sec>
         <sec>
            <st>
               <p>Conservation of gene order between <it>H. erato </it>and <it>B. mori</it></p>
            </st>
            <p>We found evidence for fine-scale synteny between <it>H. erato </it>and <it>B. mori </it>in the 420 kb genomic region linked to the <it>Cr </it>color pattern gene (Figure <figr fid="F5">5</figr>). No evidence for microsynteny was observed between <it>D</it>-linked <it>H. erato </it>genomic regions and <it>B. mori</it>, due to the lack of conserved genes in the <it>D</it>-linked clone. The <it>B. mori </it>scaffold sequence (nscaf2829), downloaded from SilkDB <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, contained all of the major genes annotated on the <it>Cr</it>-linked BAC clones (Table <tblr tid="T2">2</tblr>, Figure <figr fid="F5">5</figr>). All genes were unambiguously identified: <it>&#946;ggt-II </it>(nscaf2829, position 2933018&#8211;2938173); <it>Glucose dehydrogenase </it>(nscaf2829, position 2929161&#8211;2931568); <it>Forkhead Box </it>(nscaf2829, position 2921212&#8211;2923967); <it>Trehalase1 </it>(nscaf2829, position 2866386&#8211;2868122); <it>Trehalase2 </it>(nscaf2829, position 2861182&#8211;2862921); <it>B9 </it>(nscaf2829, position 2830788&#8211;2833491); <it>Unkempt </it>(nscaf2829, position 2748736&#8211;2755694); <it>Beta fructosidase FruA </it>(nscaf2829, position 2704806&#8211;2706326). With the exception of the <it>DNA helicase </it>(nscaf2829, position 2870771&#8211;2872660), which appears to have been translocated, gene order and intergenic distances in <it>Heliconius </it>and <it>B. mori </it>are highly conserved. For example, <it>Glucose dehydrogenase </it>and <it>Forkhead Box </it>were separated by 5300 bp in <it>H. erato </it>and 5193 bp in <it>B. mori</it>, while <it>Trehalase1 </it>and <it>Trehalase2 </it>were separated by 3110 bp in <it>H. erato </it>and 3464 bp in <it>B. mori</it>. All seven genes showed 70&#8211;85% nucleotide sequence similarity between species (Figure <figr fid="F5">5</figr>). In addition to the major genes, there were many other genomic regions with nucleotide sequence similarity higher than 85% between <it>Heliconius </it>and <it>B. mori </it>that did not show BLAST similarities to any known proteins.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Conservation of gene order and distances between <it>H. erato </it>and <it>B. mori</it></p>
               </caption>
               <text>
                  <p><b>Conservation of gene order and distances between <it>H. erato </it>and <it>B. mori</it></b>. VISTA analysis shows sequence conservation between coding regions in the <it>Cr</it>-linked <it>H. erato </it>BAC clones (BBAM-38A20, accession number AC193804; BBAM-27D18, accession number AC199750 and BBAM-12K4, accession number AC220074) and the <it>B. mori </it>scaffold sequence 2829 (position: 2706326&#8211;2993165). Genes showing high similarity to known proteins in other arthropods are represented with black squares and arrows indicating orientation (when orientation differs, arrows are displayed for both species). The lines between the two sequences connect the homologous genes, giving a visual representation of the overall synteny between the two species.</p>
               </text>
               <graphic file="1471-2164-9-345-5"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>High levels of fine-scale genomic conservation between <it>Heliconius </it>species</p>
            </st>
            <p>We have previously demonstrated that the <it>Cr </it>locus in <it>H. erato </it>and the <it>Yb </it>gene of <it>H. melpomene </it>map to homologous areas of the genome <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. BAC genome sequence data for a region tightly linked to the <it>Yb </it>gene were obtained from <it>H. melpomene </it>using methodology similar to that described here. A single gene marker developed from the <it>H. melpomene </it>sequence mapped close to the <it>Cr </it>locus in <it>H. erato</it>. Here we provide the first genomic sequence evidence that, across a broad region around this gene, gene order and gene content are conserved. Across a 420 kb overlapping region all putative proteins showing high similarity to known proteins were in the same order in the two co-mimics (Figures <figr fid="F3">3</figr> and <figr fid="F4">4</figr>). This further supports the hypothesis that a homologous gene, or set of genes, is responsible for color pattern variation in the two species. Indeed, with the exception of two large ORFs with strong sequence similarity to a reverse transcriptase (Table <tblr tid="T2">2</tblr>, Figure <figr fid="F3">3</figr>), gene order in the <it>Cr</it>-linked <it>H. erato </it>region and the <it>N</it>/<it>Yb</it>/<it>Sb</it>-linked <it>H. melpomene </it>region was nearly perfectly preserved. Furthermore, many of the smaller ORFs, as well as some non-coding sequence, were highly conserved in both relative order and sequence. Generally, the <it>H. erato </it>and <it>H. melpomene </it>genomes appear to be structurally very similar. Because of this, linkage analyses and positional cloning efforts in each individual species should be highly informative for the co-mimetic species, and probably for the genus as a whole.</p>
         </sec>
         <sec>
            <st>
               <p>The difference in <it>H. erato </it>and <it>H. melpomene </it>genome sizes</p>
            </st>
            <p>One of the most obvious differences between the <it>H. erato </it>and <it>H. melpomene </it>genomic sequences was the larger physical distance between homologous anchor points in <it>H. erato </it>relative to <it>H. melpomene</it>. This observation was probably related to the fact that the genome of <it>H. erato </it>is about 30% larger than that of <it>H. melpomene </it><abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. In this respect, it was notable that the difference in genome sizes between the two species was roughly proportional to the size of a number of sequence blocks that are absent in <it>H. melpomene </it>relative to <it>H. erato </it>in our genomic sequences (Figure <figr fid="F3">3</figr>). Many of these blocks were comprised of <it>Heliconius</it>-specific repetitive sequences, or showed strong similarity to known mobile genetic elements. These indel blocks appeared to be primarily noncoding sequences because none of them contained protein-coding sequences, with the exception of one EST (HEC03006) and a <it>reverse transcriptase</it>.</p>
            <p>Differences in genome size between closely related species are common and are usually associated with differences in the abundance of different classes of noncoding DNA <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. It has been suggested that the insertion and replication of repetitive elements, as well as indel biases, can lead to profound differences in genome size <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>. Our observations suggest that both of these effects might be relevant in <it>Heliconius</it>; however, the differences in the amount of repetitive DNA sequence in indels, at least over the small region that we examined, were not large enough to completely account for the difference in genome size between the two species.</p>
         </sec>
         <sec>
            <st>
               <p>Novel repetitive elements in <it>Heliconius</it></p>
            </st>
            <p>Using RepeatFinder <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> and RepeatMasker <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, we identified 16 different repeated elements in the <it>Heliconius </it>genomic sequences (Table <tblr tid="T1">1</tblr>). In total, repetitive sequences accounted for about 20% of the <it>H. melpomene </it>and about 26% of the <it>H. erato </it>genomic sequence. Seven of these repeated elements corresponded to previously described sequences (Table <tblr tid="T1">1A</tblr>). Because we were unable to detect any of the remaining nine repeats (all identified via RepeatFinder) in public sequence databases we assume these nine repeats represent novel repetitive elements unique to the <it>Heliconius </it>genus (Table <tblr tid="T1">1B</tblr>, Additional file <supplr sid="S1">1</supplr>: Novel repetitive elements in <it>Heliconius </it>(row sequences)). The seven previously described elements were larger (~1&#8211;5 kb) relative to the nine novel elements (100&#8211;600 bp) and occurred much less frequently. Most instances of these novel repeats observed in the BAC sequences were intact, full-length, highly similar versions of the core motifs. However, a wide range of fragmentation and divergence relative to the core motifs was also observed among repetitive regions. Motif #7 in <it>H. erato </it>best exemplifies this pattern, as it was the most abundant repeat observed, with 310 BAC regions showing significant similarity to the 266 bp core motif. These regions ranged continuously in size from 37 to 286 bp and in divergence from 2% to 32%. All other motifs showed a qualitatively similar pattern where BAC regions corresponding to a novel element ranged from being highly similar to the core motif sequence to being fragmented and divergent. 
This pattern is consistent with a process of motif replication and insertion followed by mutational degradation, suggesting these novel repeat motifs likely represent some sort of transposable element such as short interspersed nuclear elements (SINEs) or miniature inverted-repeat transposable elements (MITEs) <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. Both SINEs and MITEs have been reported from Lepidoptera <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. More work is required to confidently determine the origin of these novel repetitive sequences in <it>Heliconius</it>. We are hopeful that future genomic sequences from other butterfly species will allow a better understanding of the phylogenetic distribution and evolutionary origins of some of the observed repeats.</p>
         </sec>
         <sec>
            <st>
               <p>Preliminary evidence for fine-scale synteny between <it>Heliconius </it>and <it>Bombyx</it></p>
            </st>
            <p>The observed fine-scale synteny within <it>Heliconius </it>complements a recent study showing that patterns of macrosynteny are strongly conserved across the Macrolepidoptera <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. This previous study demonstrated that a total of 70 markers mapped in <it>H. melpomene </it>and <it>B. mori </it>showed large-scale patterns of synteny across the genome (taking into account a number of putative chromosomal fusions that explain the difference in chromosome number between these species). Our data further suggest that synteny has been preserved between <it>Heliconius </it>and <it>B. mori </it>on a much finer scale. Specifically, we show here that seven predicted genes in the <it>Cr </it>region have a similar order in the homologous <it>B. mori </it>genomic sequence (Figure <figr fid="F5">5</figr>). Although this is a very small sampling of the genome as a whole, it is still notable that gene order, as well as intergenic distances, has been preserved over such a long time scale.</p>
         </sec>
         <sec>
            <st>
               <p>Chromosome walking towards <it>Heliconius </it>color pattern genes</p>
            </st>
            <p>The goal of this study was to identify and characterize regions of the genome linked to wing pattern polymorphism in <it>Heliconius </it>butterflies. We did not necessarily expect these initial BAC sequences to contain the color pattern genes themselves; however, these sequences provide important genomic "anchors" for ongoing positional cloning work. Fine-scale mapping experiments imply that we are very near the <it>D </it>and <it>Cr </it>color pattern loci. A microsatellite marker at the 5' end of the <it>D</it>-linked BAC showed 7 recombinants across 444 individuals, suggesting that this end is about 1.9 cM from the gene. There were two fewer recombinants for a marker developed from exon sequence of the <it>Methionine Rich Storage Protein </it>(<it>MRSP</it>) gene at the 3' end of the BAC. This marker is about 150 kb from our 5' microsatellite marker, suggesting that <it>D </it>is a further 1 cM in the 3' direction. We appear to have landed even closer to the <it>Cr </it>gene. In this case, a marker developed from the <it>&#946;ggt-II </it>gene near the 5' end of the BBAM-38A20 BAC clone showed no recombinants across 430 individuals. A similar result was obtained with a marker developed from the <it>Putative DNA helicase Rec-Q </it>in the middle of the BBAM-27D18 BAC clone, suggesting that the zero recombinant interval might be somewhat large. For this reason, given an expected relationship of physical to recombination distance of 275 kb/cM <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, an expectation consistent with our initial mapping results, we are optimistic that we can identify the genomic recombinant interval containing both genes within the next one (<it>Cr</it>) to three (<it>D</it>) BAC steps.</p>
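            <p>The reasoning that links map distance to the number of remaining walking steps can be sketched as a simple unit conversion. This is an illustration only: the 275 kb/cM ratio comes from the text, while the function names, the 153 kb insert size borrowed from the library description in the Methods, and the assumption that successive BACs do not overlap are simplifications introduced here.</p>

```python
import math

KB_PER_CM = 275.0  # physical-to-recombination ratio quoted in the text


def expected_physical_kb(map_distance_cm, kb_per_cm=KB_PER_CM):
    """Convert a genetic map distance (cM) into an expected physical distance (kb)."""
    return map_distance_cm * kb_per_cm


def minimum_bac_steps(distance_kb, mean_insert_kb):
    """Lower bound on chromosome-walking steps.

    Assumption (hypothetical simplification): each new BAC extends the walk by
    its full insert length, i.e. overlap between neighbouring clones is ignored.
    Real walks need overlap, so the true number of steps can be higher.
    """
    return math.ceil(distance_kb / mean_insert_kb)


if __name__ == "__main__":
    distance_kb = expected_physical_kb(1.0)  # about 1 cM remaining toward D
    print(distance_kb, "kb,", minimum_bac_steps(distance_kb, 153.0), "steps at minimum")
```

            <p>Because overlapping clones are required to extend a contig, a lower bound of two steps for roughly 1 cM is compatible with the estimate of up to three steps given above.</p>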
            <p>In terms of identifying color pattern genes, there are obvious practical benefits of a fine-scale preservation of gene order between species. Foremost, conservation should greatly facilitate the identification of functional genes and the discovery of some regulatory elements associated with these genes <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. Indeed, there were numerous conserved regions that were not simply protein coding regions (Figure <figr fid="F3">3</figr>). Even though this analysis covers only a small portion of the <it>Heliconius </it>genome, it confirms, for the first time, at a fine scale level what has been seen in comparative mapping projects concerning gene order conservation between different species in the genus <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B33">33</abbr></abbrgrp>.</p>
            <p>A comparative approach will be particularly important for pinpointing the regions responsible for pattern variation in <it>Heliconius</it>. Pattern formation in <it>Heliconius </it>probably involves discrete changes in conserved protein coding or regulatory regions <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B12">12</abbr></abbrgrp>. There is little precedent for what to expect; however, variation in pattern formation could be controlled by a number of <it>cis</it>-regulatory elements of a single gene, clusters of duplicated genes with divergent function, or clusters of non-paralogous but functionally related genes.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The mimetic wing patterns of <it>Heliconius </it>stand out as one of the best examples of an adaptive radiation. We are using a strategy that couples growing genomic resources with high-resolution linkage analysis in order to gain a fuller appreciation of the genetic basis of this radiation. We have identified regions of the <it>Heliconius </it>genome tightly linked to genes that modulate pattern variation and, for one of these regions, we have demonstrated the fine-scale preservation of gene order between distantly-related <it>Heliconius </it>species and across Lepidoptera (<it>Heliconius </it>and <it>B. mori</it>). This conservation is significant because it will greatly facilitate efforts for gene identification through parallel and complementary efforts in different species. It is our hope that further fine-scale mapping, complemented by targeted genomic sequencing, will allow us to identify the genes that underlie wing pattern variation and diversity in the genus <it>Heliconius</it>.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Cross strategy</p>
            </st>
            <p>We generated large F2 mapping families by crossing two different geographic races of <it>H. erato </it>to the same stock of <it>H. himera </it>(Figure <figr fid="F1">1</figr>). We followed segregating variation at both the <it>D </it>and the <it>Cr </it>loci in crosses between <it>H. erato cyrbia </it>and <it>H. himera </it>and the <it>D </it>and <it>Sd </it>loci using crosses between <it>H. erato notabilis </it>and <it>H. himera </it>(Figure <figr fid="F1">1</figr>). All crosses were carried out in the <it>Heliconius </it>insectary at the University of Puerto Rico from stocks originally gathered from the wild: <it>H. himera </it>was collected in Vilcabamba, Ecuador (79.13 W, 4.6 S), <it>H. erato notabilis </it>was collected from an area near Puyo, Ecuador (78.0 W, 1.5 S), and <it>H. erato cyrbia </it>was collected near Guayquichuma Glen, Ecuador (79.6 W, 3.9 S). After eclosion butterflies were euthanized, their wings removed for later morphological analysis, and their bodies frozen at -80 &#176;C for later molecular analysis.</p>
         </sec>
         <sec>
            <st>
               <p>Identifying AFLP markers linked to color pattern genes</p>
            </st>
            <p>Genomic DNA was extracted following procedures described in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and approximately 500 ng of DNA was used as template for our AFLP reactions. Genomic DNA restriction digestions and the ligation of oligonucleotide adapters were performed using the Core Reagent Kit (Invitrogen Life Technologies) following the manufacturer's instructions. For all the other steps of the AFLP analysis (e.g. serial dilutions and PCR amplification protocols), we followed the modifications of the original protocol described by Vos et al. <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> as outlined in Papa et al. <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
            <p>To efficiently identify primer combinations that contained loci tightly linked to color pattern genes, we used a modification of the "bulk segregant analysis" method <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Specifically, we screened 23 AFLP primer combinations across an initial panel of 48 individuals arranged by color pattern genotype. Each reaction was run on a 10% polyacrylamide gel and visualized using fluorescently labeled (IRDye 700 or 800) EcoCN primers on a NEN<sup>&#174; </sup>Global Edition IR2 DNA Analyzer (LI-COR<sup>&#174; </sup>Biosciences, Lincoln, NE). In our screening gels, individuals with distinctive color-pattern genotypes were grouped and these groups were run side-by-side. For example, in the <it>H. erato notabilis </it>&#215; <it>H. himera </it>cross we structured the gel with the following four genotypic groups: 1) <it>D</it><sup><it>him</it></sup><it>D</it><sup><it>not</it></sup><it>, Sd</it><sup><it>him</it></sup><it>Sd</it><sup><it>him</it></sup>; 2) <it>D</it><sup><it>him</it></sup><it>D</it><sup><it>not</it></sup><it>, Sd</it><sup><it>not</it></sup><it>Sd</it><sup><it>not</it></sup>; 3) <it>D</it><sup><it>him</it></sup><it>D</it><sup><it>him</it></sup><it>, Sd</it><sup><it>him</it></sup><it>Sd</it><sup><it>not</it></sup>; 4) <it>D</it><sup><it>not</it></sup><it>D</it><sup><it>not</it></sup><it>, Sd</it><sup><it>him</it></sup><it>Sd</it><sup><it>not </it></sup>(Figure <figr fid="F1">1</figr>). In this way, we easily identified marker loci linked to the color pattern genes, and those primer combinations showing tightly linked markers were further assayed on all 120 individuals in the brood.</p>
            <p>Because of the nature of our outbred F2 cross design <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp> we observed AFLP loci in all possible phases and segregation patterns. For each AFLP primer combination, we scored four different AFLP marker types: (1) monomorphic, (2) female-informative (FI) (AFLP band present in the mother and absent in the father), (3) male-informative markers (MI) (AFLP band present in the father but absent in the mother), and (4) markers that were present in both parents but segregated in offspring (BI). Both MI and FI markers were expected to segregate in a 1:1 ratio, whereas the BI markers, which were heterozygous in the parents, segregate in a 3:1 ratio. There is no crossing over during the oogenesis in Lepidoptera <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp> and FI markers on the same chromosome are inherited as a linkage block <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Thus, only the MI and BI provided information on recombination distance between AFLP polymorphisms and color pattern genes. Nonetheless, the FI markers were extremely useful because they allowed us to unambiguously identify chromosomal linkage groups and estimate background error rate of the AFLP technique <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>.</p>
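            <p>The expected segregation ratios described above (1:1 for MI and FI markers, 3:1 for BI markers) can be checked against band counts with a goodness-of-fit statistic. The sketch below uses a Pearson chi-square test, which is an assumption made for illustration and is not stated to be the test used in the study; the band counts in the example are hypothetical.</p>

```python
# Critical value of the chi-square distribution for 1 degree of freedom at
# the 0.05 significance level.
CHI2_CRIT_DF1_P05 = 3.841


def chi_square(observed, ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_total = sum(ratio)
    expected = [total * part / ratio_total for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, ratio, critical=CHI2_CRIT_DF1_P05):
    """True when the counts do not deviate significantly from the ratio."""
    return not chi_square(observed, ratio) > critical


if __name__ == "__main__":
    # Hypothetical counts (band present, band absent) in a brood of 120.
    mi_marker = [62, 58]   # expected 1:1
    bi_marker = [88, 32]   # expected 3:1
    print("MI fits 1:1:", fits_ratio(mi_marker, [1, 1]))
    print("BI fits 3:1:", fits_ratio(bi_marker, [3, 1]))
```

            <p>A marker whose counts depart strongly from the expected ratio would be a candidate for scoring error or for reassignment to a different marker class.</p>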
            <p>Genotypes for the color pattern genes were inferred from wing pattern phenotypes. Based on the amount of red, white and yellow in the forewing and hindwing, nine genotypes were observed among both groups of F2 progeny. For the <it>D </it>gene, shared by all three grand-parental types, homozygotes for the <it>H. himera </it>allele (<it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>hi</it></sup>) show red only on the hindwing bar, whereas <it>D </it>homozygotes for <it>H. erato cyrbia </it>or <it>H. erato notabilis </it>alleles (<it>D</it><sup><it>cyr </it></sup><it>D</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>not</it></sup><it>D</it><sup><it>not</it></sup>) have red scales on the forewings, and <it>D </it>heterozygotes between <it>H. himera </it>and the two <it>H. erato </it>races (<it>D</it><sup><it>hi </it></sup><it>D</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>hi </it></sup><it>D</it><sup><it>not</it></sup>) have red pigmented scales on both forewings and hindwings. The <it>Cr </it>gene segregates only in the <it>H. erato cyrbia </it>&#215; <it>H. himera </it>cross, and has an epistatic effect on the <it>D </it>gene by modulating the amount of red. Homozygotes with <it>H. himera </it>alleles (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>hi</it></sup>) show a typical yellow forewing band when <it>D </it>is pure <it>H. himera </it>(<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>hi</it></sup>; <it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>hi</it></sup>) and a red/white forewing band (total amount of red pigments &lt; 50%) when <it>D </it>is either homozygous for <it>H. erato cyrbia </it>alleles (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>hi</it></sup>; <it>D</it><sup><it>cyr</it></sup><it>D</it><sup><it>cyr</it></sup>) or heterozygous (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>hi</it></sup>; <it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>cyr</it></sup>). 
When the <it>Cr </it>color pattern gene is heterozygous (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>cyr</it></sup>), a yellow forewing shadow band is present when the <it>D </it>gene is homozygous for <it>H. himera </it>alleles (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>hi</it></sup>), and an almost all red forewing band (red >50%) is present when <it>D </it>is homozygous for <it>H. erato cyrbia </it>alleles (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>cyr</it></sup><it>D</it><sup><it>cyr</it></sup>) or heterozygous (<it>Cr</it><sup><it>hi</it></sup><it>Cr</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>cyr</it></sup>). A white trailing dorsal edge and a yellow bar on the ventral hindwing identify the pure <it>H. erato cyrbia </it>form (<it>Cr</it><sup><it>cyr</it></sup><it>Cr</it><sup><it>cyr</it></sup>), with a totally black forewing when <it>D </it>is homozygous for <it>H. himera </it>alleles (<it>Cr</it><sup><it>cyr</it></sup><it>Cr</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>hi</it></sup>), or a totally red forewing when the <it>D </it>gene is pure for <it>H. erato cyrbia </it>alleles (<it>Cr</it><sup><it>cyr</it></sup><it>Cr</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>cyr</it></sup><it>D</it><sup><it>cyr</it></sup>) or heterozygous (<it>Cr</it><sup><it>cyr</it></sup><it>Cr</it><sup><it>cyr</it></sup>; <it>D</it><sup><it>hi</it></sup><it>D</it><sup><it>cyr</it></sup>).</p>
            <p>To score color patterns, one must also understand the activity of the <it>Sd </it>gene. <it>Sd </it>segregates in the <it>H. erato notabilis </it>&#215; <it>H. himera </it>cross, where it controls the shape of the melanic window in the middle forewing of <it>H. himera </it>and shows a clear interaction with the distal forewing patch of <it>H. erato notabilis</it>. In the pure <it>H. himera </it>(<it>Sd</it><sup><it>hi</it></sup><it>Sd</it><sup><it>hi</it></sup>) a typical forewing band is shown, while in the pure <it>H. erato notabilis </it>(<it>Sd</it><sup><it>not</it></sup><it>Sd</it><sup><it>not</it></sup>) a shortened basal forewing patch, as well as a smaller distal forewing patch, is evident. The heterozygotes (<it>Sd</it><sup><it>hi</it></sup><it>Sd</it><sup><it>not</it></sup>) present an intermediate form lacking a distal patch (as found in <it>H. erato notabilis</it>) and have a shortened basal patch showing melanin anterior to the costal vein (this area is normally light colored in <it>H. himera</it>).</p>
         </sec>
         <sec>
            <st>
               <p>Isolating and characterizing AFLP markers</p>
            </st>
            <p>We screened all primer combinations that generated BI or MI AFLP markers strongly associated with particular color pattern genes in our initial "bulked" sample across all 120 individuals from our mapping family. We excised and cloned those tightly linked AFLP markers larger than 160 base-pairs using a three-step strategy. First, the band was isolated from a polyacrylamide gel using a LI-COR<sup>&#174; </sup>Biosciences Odyssey<sup>&#174; </sup>Infrared Imaging System. We followed methods outlined in <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> and used a grid to position and excise specific fragments with a scalpel. We validated each excision by re-scanning the gel to confirm that we had removed the correct fragment. All gel fragments were placed in 15 ml of 1&#215; TE and frozen at -80&#176;C. Next, we re-amplified the AFLP fragment using the original selective primer combination and the original PCR conditions. We generated template for this reaction by freeze/thawing the excised band three times. In each freeze/thaw cycle, we collected the resulting supernatant after the band was frozen for one hour at -80&#176;C, heated at 55&#176;C for 15 min and centrifuged for 15 min at 15,000 rpm. We checked amplification success by running PCR products on polyacrylamide gels adjacent to the original AFLP reactions, which served as positive controls. Finally, we cloned the PCR product using Invitrogen's TOPO TA<sup>&#174; </sup>Cloning kit. PCR-amplified inserts from 10&#8211;15 positive clones were again checked for size on a polyacrylamide gel, and those of the correct size were sequenced using the DYEnamic ET Terminator Cycle Sequencing Kit (Amersham Biosciences). Sequencing reactions were run on a MegaBACE 500 (Amersham Biosciences) or a 3130 DNA PRISM (ABI) at the Sequencing Facilities of the University of Puerto Rico, Rio Piedras. Resulting sequences were aligned by eye and PCR primers were designed using OLIGO version 4.0, and tested in genomic extracts from a panel of <it>H. erato </it>individuals.</p>
         </sec>
         <sec>
            <st>
               <p>Probing the <it>H. erato </it>BAC library</p>
            </st>
            <p>Two <it>H. erato </it>BAC libraries, one partially restricted with <it>Eco</it>RI and one with <it>Bam</it>HI, were created from a line of <it>H. erato petiverana </it>collected in Gamboa, Panama and inbred for several generations. The <it>H. erato </it>BAC library was constructed by C. Zhang (TAMU) and M. R. Goldsmith (URI) following the procedure outlined in Wu et al. <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Both libraries contain 19,200 clones arrayed in 384-well plates and the average insert size for the <it>Eco</it>RI and <it>Bam</it>HI libraries was estimated to be 153 kb and 175 kb, respectively. Libraries were gridded onto nylon membranes using a strategy where each clone is spotted twice to facilitate the identification of "true" positives. AFLP probes were labeled with <sup>32</sup>P using the Prime-It II Random Primer Labeling Kit (Stratagene, CA, USA). The resulting radioactively labeled products were cleaned using a Sephadex purification column and hybridized to the filters overnight at 65&#176;C in Church Buffer (0.5 M NaHPO4 pH 7.2, 7% SDS, 1 mM EDTA, and 1% BSA) with rotation. The filters were washed twice with 2&#215; SSC + 0.1% SDS, then once or twice with 1&#215; SSC + 0.1% SDS. The washed filters were placed on film for 1&#8211;5 days at -80&#176;C, depending on signal strength.</p>
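            <p>The clone counts and mean insert sizes given above determine how much DNA each library holds, and, given a genome size, the expected fold coverage. The following sketch is illustrative only: the function names are invented here, and the genome size used in the example is a placeholder, not a value taken from this article.</p>

```python
def library_total_kb(n_clones, mean_insert_kb):
    """Total cloned DNA in a library, in kb."""
    return n_clones * mean_insert_kb


def fold_coverage(n_clones, mean_insert_kb, genome_size_kb):
    """Expected genome equivalents represented in the library."""
    return library_total_kb(n_clones, mean_insert_kb) / genome_size_kb


if __name__ == "__main__":
    # Values from the text: 19,200 clones per library; 153 kb (EcoRI) and
    # 175 kb (BamHI) mean insert sizes.
    for enzyme, insert_kb in [("EcoRI", 153), ("BamHI", 175)]:
        print(enzyme, library_total_kb(19200, insert_kb), "kb cloned")

    # PLACEHOLDER genome size (hypothetical, for illustration only).
    placeholder_genome_kb = 400000
    print("EcoRI fold coverage:", fold_coverage(19200, 153, placeholder_genome_kb))
```

            <p>Spotting each clone twice on the membranes does not change this coverage; it only makes a true positive appear as a pair of signals.</p>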
         </sec>
         <sec>
            <st>
               <p>BAC fingerprinting, sequencing, and annotation</p>
            </st>
            <p>All positive clones identified in our screen of the <it>H. erato </it>BAC library were PCR-confirmed using probe-specific primers. Clones were then grown overnight on agar plates, and a single colony was used to inoculate TB medium. Insert DNA was extracted using the Qiagen Maxi prep kit. The insert size of each BAC clone was estimated by summing all fragments after restriction enzyme digestion using <it>Eco</it>RI or <it>Bam</it>HI. The largest clone identified from our fingerprinting experiments was sequenced and assembled by the Baylor Genomics Center. Clones were first sheared to create 4&#8211;6 kb fragments and subcloned into pUC19. Approximately 8&#215; sequence coverage of each BAC was then generated in paired 600&#8211;800 bp reads. Data were assembled using PHRAP <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> and edited in a GAP4 <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> database.</p>
            <p>The BAC sequences were analyzed using a variety of sequence annotation programs. First we used Kaikogaas <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, an automated annotation package for gene prediction <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>, to identify possible genes. Kaikogaas integrates a variety of programs for gene prediction and structural analysis of genomic sequence including software for coding region and splice-site prediction, sequence homology analysis, protein localization site prediction, and protein classification and secondary structure prediction. All putative open reading frames were also compared directly to our database of <it>Heliconius </it>Expressed Sequence Tags (ESTs) located in ButterflyBase <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> using BLAST <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. Any EST that showed highly significant similarity (e-value of &#8804; 10<sup>-40</sup>) to a putative ORF was then directly compared to the full genomic sequence. In this way, we could better distinguish possible homology from similarity due to repetitive DNA elements (see below). To identify the <it>B. mori </it>sequences homologous to the <it>H. erato </it>BAC sequences, we performed a tBLASTn search against the whole-genome shotgun reads of <it>B. mori </it>using as a query the translated protein sequence predicted by Kaikogaas.</p>
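            <p>The EST screening step can be sketched as a filter on BLAST hits, treating smaller e-values as more significant so that a hit passes when its e-value does not exceed 10<sup>-40</sup>. The record layout, identifiers, and e-values below are hypothetical and serve only to illustrate the filter; they are not data from the study.</p>

```python
E_VALUE_CUTOFF = 1e-40  # significance cutoff quoted in the text


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep (orf_id, est_id, e_value) records whose e-value is at or below the cutoff."""
    return [hit for hit in hits if cutoff >= hit[2]]


if __name__ == "__main__":
    # Hypothetical ORF-versus-EST hits (identifiers and e-values invented).
    example_hits = [
        ("orf_1", "est_A", 1e-75),
        ("orf_2", "est_B", 3e-12),
        ("orf_3", "est_C", 1e-40),
    ]
    for orf_id, est_id, e_value in significant_hits(example_hits):
        print(orf_id, est_id, e_value)
```

            <p>Hits retained by such a filter would then be compared against the full genomic sequence, as described above, to separate likely homologs from matches caused by repetitive DNA.</p>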
         </sec>
         <sec>
            <st>
               <p>Comparative sequence analysis</p>
            </st>
            <p>A linkage strategy similar to ours was previously used to identify BACs near the <it>Yb </it>gene complex in <it>H. melpomene </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Sequencing and finishing of three contiguous <it>H. melpomene </it>BACs across this region was carried out by the Wellcome Trust Sanger Institute (accession numbers: CR974474, CT955980, CU367882). We used VISTA <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> to identify similarities between <it>H. erato</it>, <it>H. melpomene</it>, and <it>B. mori </it>genomic sequences. Pairwise genomic alignments were performed on the mVISTA server using the Avid alignment algorithm and the results were displayed together with the position of the annotations (ORFs, genes, mobile DNA, microsatellites). Each annotation or new genomic region identified from the comparative analysis that showed a strong similarity (&#8805; 80% conserved) was verified by aligning the sequences from both species using the LAGAN program <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> to get an accurate DNA sequence assembly.</p>
            <p>We also searched the <it>H. erato </it>and <it>H. melpomene </it>BAC sequences for novel repetitive sequence as well as previously described transposable elements. To identify novel repetitive sequences in the BACs, we used the RepeatFinder software <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B48">48</abbr></abbrgrp>. RepeatFinder is explicitly designed to find complex repeated motifs in large contiguous blocks of genomic DNA sequence. Its algorithm uses BLAST to iteratively query segments of the input sequence against the intact input sequence. It then combines subsequences with high BLAST similarity into groups, which are further refined by considering the variability in length and divergence among constituent subsequences. The final output is a list of groups of aligned, highly similar subsequences. It is important to note that because the algorithm uses local alignments (generated by BLAST), different groups may overlap in part or in whole. This overlap allows the groups to be clustered into just a few contigs representing a set of "core motifs" to which each group can be uniquely assigned. Thus, each core motif is a consensus of consensus sequences.</p>
            <p>We submitted the concatenated BAC sequences from each species to RepeatFinder using default parameters except for the following: Block Size = 2000, Minimum Repeat Size = 20, Maximum Repeat Size = 700. Larger values for the Maximum Repeat Size parameter were tried, but no groups larger than ~600 bp were ever identified. Core motifs were assembled from group consensus sequences >35 bp in length using the default parameters in CodonCode Aligner software (CodonCode Corp., Dedham, MA). For both species' sets of core motifs, all-vs-all BLASTn searches were performed 1) between species to identify motifs shared between species and 2) within species to verify that motifs within species were unrelated. BLASTn was also used to search two additional databases for sequences similar to the core motifs. First we searched a combined database of the completed genomic or whole genome shotgun sequences from all 20 insect genome projects currently available at NCBI. We also searched all arthropod transposable elements available in the Repbase library of transposable elements <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <p>We simultaneously identified and masked BAC regions corresponding to both the novel core motifs and known repetitive sequences using RepeatMasker <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. We combined core motif sequences with sequences of all arthropod transposable elements from Repbase into a single database. We queried this database with the concatenated BAC sequences using RepeatMasker and the CrossMatch search algorithm on the 'slow' setting to maximize sensitivity. The resulting data were summarized using custom scripts implemented in the R statistics package <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>.</p>
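            <p>As an illustration of the kind of summary that can be drawn from RepeatMasker output, the following sketch totals annotated base pairs per repeat class from lines in the standard whitespace-delimited .out layout. It is written in Python for illustration and is not the custom R code used in this study; the function name, sequence name, repeat names, and all values are hypothetical, and overlapping annotations are not resolved.

```python
from collections import defaultdict


def summarize_repeatmasker(lines):
    """Total annotated bp per repeat class/family from .out-style lines."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines, whose first field is not a score.
        if len(fields) >= 11 and fields[0].isdigit():
            begin, end = int(fields[5]), int(fields[6])  # query coordinates
            totals[fields[10]] += end - begin + 1        # class/family column
    return dict(totals)


# Hypothetical example records (not data from this study).
example = [
    "   SW   perc perc perc  query      position in query    matching   repeat",
    "score   div. del. ins.  sequence   begin end   (left)   repeat     class/family",
    "",
    " 1320 15.6 6.2 0.0 BAC_concat 1001 1500 (98500) + CoreMotif_1 Unknown 1 500 (0) 1",
    "  850 10.1 0.0 1.2 BAC_concat 3001 3200 (96800) C Mariner_Ex DNA/TcMar (10) 200 1 2",
    "  640 12.0 2.0 0.0 BAC_concat 7001 7300 (92700) + CoreMotif_1 Unknown 1 300 (200) 3",
]
print(summarize_repeatmasker(example))
```

            Summing per class rather than per element keeps the sketch short; a per-motif breakdown would key on the repeat name column instead.</p>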
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>RP assisted in project design, mapping, marker isolation, sequence analysis, figure preparation, and writing the manuscript. CMM and GH assisted in screening BAC clones. JRW and BAC assisted with sequence analysis and writing the manuscript. RC assisted in BAC sequencing and assembly of <it>Heliconius erato</it>. R-fC and NC assisted in BAC sequencing and assembly of <it>Heliconius melpomene</it>. DDK assisted in crossing and design of AFLP bulk strategy, AFLP mapping, figure preparation, and commenting on the manuscript. CDJ and LF provided <it>H. melpomene </it>data and commented on the manuscript. RDR assisted in marker isolation, sequence analysis, figure preparation, and writing the manuscript. WOM assisted with project design, crossing and mapping, sequence analysis, and writing the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The work was funded by U.S. National Science Foundation grants IOB 0344705 and DEB 0715096 to WOM. The <it>H. erato </it>BAC library was constructed by C. Wu, H. Zhang (TAMU), and M. R. Goldsmith (URI) under NSF Grant IBN-0208388. In addition, the Computational Biology Service Unit at Cornell University, which is partially funded by Microsoft Corporation, provide