<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-8-259</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Evolutionary genomics of plant genes encoding N-terminal-TM-C2 domain proteins and the similar FAM62 genes and synaptotagmin genes of metazoans</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Craxton</snm>
               <fnm>Molly</fnm>
               <insr iid="I1"/>
               <email>molly@mrc-lmb.cam.ac.uk</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, UK</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>259</fpage>
         <url>http://www.biomedcentral.com/1471-2164/8/259</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17672888</pubid>
               <pubid idtype="doi">10.1186/1471-2164-8-259</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>22</day>
               <month>11</month>
               <year>2006</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>31</day>
               <month>7</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>31</day>
               <month>7</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Craxton; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Synaptotagmin genes are found in animal genomes and are known to function in the nervous system. Genes with a similar domain architecture as well as sequence similarity to synaptotagmin C2 domains have also been found in plant genomes. The plant genes share an additional region of sequence similarity with a group of animal genes named <it>FAM62</it>. <it>FAM62 </it>genes also have a similar domain architecture. Little is known about the functions of the plant genes and animal <it>FAM62 </it>genes. Indeed, many members of the large and diverse <it>Syt </it>gene family await functional characterization. Understanding the evolutionary relationships among these genes will help to realize the full implications of functional studies and lead to improved genome annotation.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>I collected and compared plant <it>Syt</it>-like sequences from the primary nucleotide sequence databases at NCBI. The collection comprises six groups of plant genes conserved in embryophytes: <it>NTMC2Type1 </it>to <it>NTMC2Type6</it>. I collected and compared metazoan <it>FAM62 </it>sequences and identified some similar sequences from other eukaryotic lineages. I found evidence of RNA editing and alternative splicing. I compared the intron patterns of <it>Syt </it>genes. I also compared Rabphilin and Doc2 genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Genes encoding proteins with N-terminal-transmembrane-C2 domain architectures resembling synaptotagmins are widespread in eukaryotes. A collection of these genes is presented here. The collection provides a resource for studies of intron evolution. I have classified the collection into homologous gene families according to distinctive patterns of sequence conservation and intron position. The evolutionary histories of these gene families are traceable through the appearance of family members in different eukaryotic lineages. Assuming an intron-rich eukaryotic ancestor, the conserved intron patterns distinctive of individual gene families indicate independent origins of <it>Syt</it>, <it>FAM62 </it>and <it>NTMC2 </it>genes. Resemblances among these large, multi-domain proteins are due not only to shared ancestry (homology) but also to convergent evolution (analogy). During the evolution of these gene families, duplications and other gene rearrangements affecting domain composition have occurred along with sequence divergence, leading to complex family relationships with accordingly complex functional implications. The functional homologies and analogies among these genes remain to be established empirically.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Synaptotagmins (Syts) share a common structure: an N-terminal transmembrane (TM) sequence followed by a variable-length linker and two tandem, distinctly conserved C2 domains, C2A and C2B. Syt1 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, identified as a protein component of synaptic vesicles, is known to be required for nervous system function, acting crucially in the fast, synchronous component of calcium-regulated synaptic vesicle exocytosis <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Genomic analysis of <it>Syt </it>genes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp> indicates that animal genomes encode diverse sets of <it>Syt </it>genes but always maintain a <it>Syt1 </it>orthologue. Although it is likely that <it>Syt1 </it>orthologues function similarly <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>, the functions of the other <it>Syt </it>genes, in different species, remain to be established. The complexity of this task increases with the number of <it>Syt </it>genes, and this number increases with organism complexity. The first study of the full set of <it>Syt </it>genes in a model organism <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> indicated that only <it>Syt1 </it>is expressed on synaptic vesicles. The other <it>Syt </it>genes were found to be expressed in different and distinct places. Many studies using different mammalian <it>Syt </it>genes indicate tissue distributions which are primarily neural e.g. [<abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> and references therein]. Naturally occurring, cell type-specific expression patterns have, however, rarely been described e.g. [<abbrgrp><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp> and references therein]. 
The discovery of genes in plants which are similar to <it>Syt </it>genes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B14">14</abbr></abbrgrp> further complicates functional predictions. While the plant genes and another group of animal genes (<it>FAM62</it>) share similarity with <it>Syt </it>genes, little is known about their functions. A preliminary biochemical analysis of proteins from the human <it>FAM62 </it>gene family has just been published <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> but growing speculation about the plant genes <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp> necessitates a more detailed description of their similarities and differences which could usefully inform future functional studies. I have made use of the abundance of recently deposited nucleotide sequences from a wide range of organisms, to carry out a comparative genomics analysis of these genes, in order to shed light on their evolutionary relationships.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Collection of plant gene sequences</p>
            </st>
            <p>In order to undertake a comparative analysis of the plant <it>Syt</it>-like genes, I collected and compared full-length homologues from an evolutionary range of plants. In order to perform an unbiased search for as many homologues of these relatively unknown genes as possible, I looked at all of the primary nucleotide sequence data in the NCBI sequence databases <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. This information is fragmentary, little of it being in the form of complete sequences, either of transcripts or genomes. By far the most abundant source of new plant sequences is ESTs, but these represent particularly small fragments and their sequences are not determined to high accuracy. I therefore needed to gather sets of overlapping ESTs to find full-length gene sequences. In order to focus the search on the detection of genuinely homologous sequences, I used nucleotide sequence probes of plant sequences already identified. Only those database sequences closely related to the probe sequence would be identified in a given search. These matching sequences were added to the collection and joined to any overlapping sequences already present in the collection. Reiterated searches served to expand the collection and extend the length of gene fragments. Had I used amino acid sequence probes to search for homologues of these genes, I would have detected a wider range of fragments with amino acid similarity, but these would not necessarily be homologous. Overlapping nucleotide sequences would be required in any case, to piece together whole genes from the identified EST fragments, so the simplest strategy to gather full-length relatives of these genes was to use nucleotide probes. I avoided gathering processed sequences in the sequence databases: these include genes predicted from genome annotation pipelines, as well as the vast majority of amino acid sequences which are predicted from nucleotide sequences. 
These sequences may not be accurate and could mislead subsequent analyses if used without verification.</p>
            <p>I therefore carried out reiterated rounds of blastn searching of nucleotide sequences at NCBI <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. In the first few rounds, I used probes representing the plant gene coding sequences I had already identified (genes 85 to 117) <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. After each round, I collected all of the statistically significant hits with high scoring segments longer than 30 nucleotides and assembled these sequences into a gap4 database <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Repeated searching with different probes, followed by gap4 assembly of only previously uncollected hits, allowed me to gradually but efficiently build a comprehensive collection. Each probe detected a unique spectrum of homologous plant sequences. Probes from a given species could be used to find similar sequences from related species. Probes covering more conserved regions could be used to find sequences from a wider range of relatives. Sequences from closely related species could be used to bridge non-overlapping contigs from a single species. In the later stages of the collection process, I carefully separated the contigs so that in most cases, each represents a set of overlapping sequences from one species only. As a final step, to ensure that the collection was as comprehensive as it could be at this time, I searched the nucleotide sequences at NCBI using tblastn with amino acid sequence probes and confirmed that the top scoring hits had already been collected.</p>
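            <p>The reiterated search described above can be sketched roughly as follows. This is an illustration only, not the procedure actually run: the <it>search </it>callable is a hypothetical stand-in for a blastn run, sequence identifiers double as probes, and the e-value cut-off and round limit are assumptions. Only the 30-nucleotide length threshold and the rule that previously uncollected hits seed the next round come from the text.</p>
            <p>
```python
from typing import Callable, Iterable

# Hypothetical hit record: (sequence_id, alignment_length, e_value).
Hit = tuple[str, int, float]


def iterative_collection(
    seed_probes: Iterable[str],
    search: Callable[[str], list[Hit]],
    min_length: int = 30,
    max_evalue: float = 1e-5,
    max_rounds: int = 10,
) -> set[str]:
    """Collect sequence ids by searching repeatedly until no new hits appear.

    `search` is a placeholder for a blastn run, not a real API. The length
    cut-off follows the text; the e-value cut-off and the round limit are
    illustrative assumptions.
    """
    collected: set[str] = set()
    probes = list(seed_probes)
    for _ in range(max_rounds):
        new_ids: list[str] = []
        for probe in probes:
            for seq_id, length, evalue in search(probe):
                significant = max_evalue >= evalue and length > min_length
                if significant and seq_id not in collected:
                    collected.add(seq_id)
                    new_ids.append(seq_id)
        if not new_ids:
            break  # the collection has stopped growing
        probes = new_ids  # only previously uncollected hits seed the next round
    return collected


# Invented toy "database": which hits each probe would return.
toy_hits = {
    "geneA": [("est1", 120, 1e-30), ("est2", 25, 1e-3)],
    "est1": [("est3", 80, 1e-20), ("est1", 120, 1e-50)],
    "est3": [("est1", 80, 1e-20)],
}


def toy_search(probe: str) -> list[Hit]:
    return toy_hits.get(probe, [])


result = iterative_collection(["geneA"], toy_search)
print(sorted(result))
```
            </p>
            <p>In this toy run the short, weakly matching fragment is rejected, while a fragment that overlaps only a first-round hit is picked up in the second round. In practice each round would also involve assembling the hits and inspecting the resulting contigs, which the sketch leaves out.</p>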
            <p>As well as examining transcript sequences, I also collected genomic sequences where available. I particularly wanted to examine the genome of <it>Physcomitrella patens </it>which is currently being sequenced <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. I had previously identified <it>Syt</it>-like genes in the genome sequences of <it>Arabidopsis thaliana </it>and <it>Oryza sativa </it>but both of these represent relatively recently evolved angiosperms whereas the moss genome represents an ancient bryophyte. I used the trace archive at NCBI <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> as well as resources at PHYSCObase <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> where transcript sequences are also available. I confirmed the genomic and transcript sequences from several <it>Physcomitrella patens </it>gene loci and deposited these sequences in the public databases [EMBL:<ext-link ext-link-type="embl" ext-link-id="AM140045">AM140045</ext-link>, EMBL: <ext-link ext-link-type="embl" ext-link-id="AM140046">AM140046</ext-link>, EMBL: <ext-link ext-link-type="embl" ext-link-id="AM140047">AM140047</ext-link>, EMBL: <ext-link ext-link-type="embl" ext-link-id="AM140048">AM140048</ext-link>, EMBL: <ext-link ext-link-type="embl" ext-link-id="AM140049">AM140049</ext-link>, EMBL: <ext-link ext-link-type="embl" ext-link-id="AM140050">AM140050</ext-link>]. In contrast to animal <it>Syt </it>genes, which appear to increase in number along with organism complexity <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, I found that the haploid genome of <it>Physcomitrella patens </it>has even more of these plant genes (19 or more) than either <it>Oryza sativa </it>(13) or <it>Arabidopsis thaliana </it>(11). Additional file <supplr sid="S1">1</supplr> lists full details of each gene identified. Additional file <supplr sid="S2">2</supplr> lists alphabetically, in rough phylogenetic order, all of the plant species in which genes in this collection have been identified. 
Genes were identified in a wide evolutionary range of land plants, from bryophytes to rosids.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Plant <it>NTMC2 </it>genes.</p>
               </text>
               <file name="1471-2164-8-259-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>Plant species.</p>
               </text>
               <file name="1471-2164-8-259-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Analysis of full-length plant genes</p>
            </st>
            <p>Database searching identified six distinct groups of plant genes. Since all of the genes encode relatively long proteins, most of the collection comprises gene fragments which cannot yet be extended to full-length. Only where a large number of overlapping sequences were available was it possible to derive full-length gene sequences from EST contigs. Consequently, the full-length sequences represent the relatively abundantly transcribed or the shorter genes. Genomic sequences were useful for identifying full-length sequences, irrespective of transcript abundance, as well as for providing the intron-exon structure of the gene. Full-length amino acid sequences were compared using Multalin <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. The previously used nomenclature (<it>SytA</it>, <it>SytB</it>, <it>SytC </it>etc.) following <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> is somewhat arbitrary and is inadequate for a consistent and meaningful description of these plant genes. I propose the following naming convention for these plant N-terminal-TM-C2 domain genes: <it>NTMC2Type1.1</it>, <it>NTMC2Type1.2</it>, <it>NTMC2Type6 </it>and so on. Multiple alignments of full-length sequences from each group are presented in figures <figr fid="F1">1</figr>, <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, <figr fid="F5">5</figr>, <figr fid="F6">6</figr>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>plant <it>NTMC2Type1 </it>genes</p>
               </caption>
               <text>
                  <p><b>plant <it>NTMC2Type1 </it>genes</b>. Amino acid sequences of full-length gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. Alternatively spliced regions are boxed. Full details are in additional file <supplr sid="S1">1</supplr>.</p>
               </text>
               <graphic file="1471-2164-8-259-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>plant <it>NTMC2Type2 </it>genes</p>
               </caption>
               <text>
                  <p><b>plant <it>NTMC2Type2 </it>genes</b>. Amino acid sequences of full-length gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. RNA edited regions are boxed. Full details are in additional file <supplr sid="S1">1</supplr>.</p>
               </text>
               <graphic file="1471-2164-8-259-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>plant <it>NTMC2Type3 </it>genes</p>
               </caption>
               <text>
                  <p><b>plant <it>NTMC2Type3 </it>genes</b>. Amino acid sequences of full-length gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. Full details are in additional file <supplr sid="S1">1</supplr>.</p>
               </text>
               <graphic file="1471-2164-8-259-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>plant <it>NTMC2Type4 </it>genes</p>
               </caption>
               <text>
                  <p><b>plant <it>NTMC2Type4 </it>genes</b>. Amino acid sequences of full-length gene products are aligned. TM regions are boxed. Intron positions and phases are marked. C2 domains are indicated. Alternatively spliced regions are boxed. Full details are in additional file <supplr sid="S1">1</supplr>.</p>
               </text>
               <graphic file="1471-2164-8-259-4"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>plant <it>NTMC2Type5 </it>genes</p>
               </caption>
               <text>
                  <p><b>plant <it>NTMC2Type5 </it>genes</b>. Amino acid sequences of full-length gene products are aligned. TM regions are boxed. Intron positions and phases are marked. C2 domains are indicated. Full details are in additional file <supplr sid="S1">1</supplr>.</p>
               </text>
               <graphic file="1471-2164-8-259-5"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>plant <it>NTMC2Type6 </it>genes</p>
               </caption>
               <text>
                  <p><b>plant <it>NTMC2Type6 </it>genes</b>. Amino acid sequences of full-length gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. Full details are in additional file <supplr sid="S1">1</supplr>.</p>
               </text>
               <graphic file="1471-2164-8-259-6"/>
            </fig>
            <p>Figures <figr fid="F1">1</figr>, <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, <figr fid="F5">5</figr>, <figr fid="F6">6</figr> show the overall domain pattern common to all of the genes: the N-terminal region, TM region, linker, C2 domain region and C-terminal region. Strongly conserved intron patterns, as well as distinctive patterns of sequence conservation, distinguish the six types of <it>NTMC2 </it>genes. The six groups are not entirely homogeneous. <it>Physcomitrella patens NTMC2Type2.3</it>, for example, while sharing most of its sequence with the other members of the <it>NTMC2Type2 </it>group, has different N and C termini and lacks the second C2 domain. The <it>NTMC2Type2 </it>group is also notable in that some of its members are RNA edited (see figure <figr fid="F2">2</figr> and full details in additional file <supplr sid="S1">1</supplr>). In some members of this group, the genomic sequence of the second coding exon lacks one nucleotide at its 3' terminus, resulting in a faulty, frameshifted gene. However, these genes are still able to produce functional transcripts with the missing guanosine restored. Transcripts for both <it>Arabidopsis thaliana NTMC2Type2 </it>genes and the <it>Oryza sativa NTMC2Type2.2 </it>gene are edited in this way. Transcript sequences have not yet been deposited in the sequence databases for the <it>Physcomitrella patens NTMC2Type2.1 </it>and the <it>Medicago truncatula NTMC2Type2.2 </it>genes, but I have assumed that they are similarly edited. The genomic loci of the <it>Physcomitrella patens NTMC2Type2.2 </it>and <it>NTMC2Type2.3 </it>genes and the <it>Oryza sativa NTMC2Type2.1 </it>gene do not lack the equivalent nucleotide and are not frameshifted. The genomic locus of the <it>Medicago truncatula NTMC2Type2.1 </it>gene lacks the equivalent exon-intron boundary altogether and is not frameshifted. 
The first coding exon of the <it>Medicago truncatula NTMC2Type2.1 </it>gene is equivalent to a fusion of the first three coding exons of the <it>NTMC2Type2 </it>genes mentioned above, with the corresponding two introns missing. The frameshift error thus appears to be associated with a particular intron. Other examples of divergent members of a group are <it>Physcomitrella patens NTMC2Type4.3</it>, which diverges at its C terminus and <it>Physcomitrella patens NTMC2Type5.2 </it>and <it>NTMC2Type5.3</it>, which have a different intron pattern. Group 6, as a whole, is not well conserved C-terminal of the C2 domain.</p>
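            <p>The effect of the missing nucleotide on the reading frame can be illustrated with a small sketch. The exon sequences below are invented toy strings, not <it>NTMC2Type2 </it>sequence, and the sketch says nothing about the editing mechanism; it only shows that an exon one nucleotide short shifts every downstream codon, and that restoring a single guanosine at the 3' end of that exon puts the downstream codons, including the stop codon, back in frame.</p>
            <p>
```python
def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Invented toy exons for illustration only.
exon1 = "ATGGCT"
exon2_genomic = "GAAGA"  # one nucleotide short at its 3' terminus
exon3 = "CTTTGA"

# Transcript as it would read if copied directly from the genomic exons.
unedited = exon1 + exon2_genomic + exon3

# Transcript with the missing guanosine restored at the end of exon 2.
edited = exon1 + exon2_genomic + "G" + exon3

print(codons(unedited))  # downstream codons are shifted, stop codon is lost
print(codons(edited))    # downstream codons and the TGA stop are in frame
```
            </p>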
         </sec>
         <sec>
            <st>
               <p>Collection of animal <it>FAM62 </it>genes</p>
            </st>
            <p>I had previously identified genes in metazoans and non-metazoans which encode N-terminal-TM-C2 domain proteins sharing similarity with those of plants <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. In the meantime, with the annotation of the human genome, the three members of this gene family in <it>Homo sapiens </it>have been named <it>FAM62A, FAM62B </it>and <it>FAM62C </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. I sought to identify homologues of these genes in other organisms by tblastn searching genomic sequences, thereby identifying full-length genes and their intron-exon structures. In contrast to the current status of primary nucleotide sequences from plants, many more animal genomic sequences are available to search. One reason for this is that animal genomes are relatively small in comparison to plant genomes and are therefore relatively less expensive to sequence. After identifying <it>FAM62 </it>gene homologues in genomic sequences, I searched transcript sequences using blastn with nucleotide probes, to confirm the predicted gene structures. I identified <it>FAM62 </it>homologues in a range of metazoan genomes. Details of each gene are listed in additional file <supplr sid="S3">3</supplr>.</p>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p><it>FAM62 </it>genes.</p>
               </text>
               <file name="1471-2164-8-259-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Analysis of full-length <it>FAM62 </it>genes</p>
            </st>
            <p>Full-length amino acid sequences were compared using Multalin <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Figure <figr fid="F7">7</figr> shows a multiple alignment of the three <it>FAM62 </it>gene products from <it>Homo sapiens</it>. All three share a common gene structure, but while <it>FAM62B </it>and <it>FAM62C </it>each encode three C2 domains, <it>FAM62A </it>contains a repeat of the portion of the gene encoding the first two C2 domains, resulting in a total of five C2 domains. Figure <figr fid="F8">8</figr> shows a multiple alignment of four <it>FAM62 </it>gene products from <it>Danio rerio</it>. The genome of <it>Danio rerio </it>encodes at least four <it>FAM62 </it>genes. All four share a common gene structure, but while <it>FAM62B </it>and <it>FAM62C </it>each encode three C2 domains, the <it>FAM62A </it>homologues contain additional repeats of the module which encodes the first two C2 domains. This results in a total of five C2 domains in <it>FAM62A1 </it>and nine C2 domains in <it>FAM62A2</it>.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Three <it>Homo sapiens FAM62 </it>genes</p>
               </caption>
               <text>
                  <p><b>Three <it>Homo sapiens FAM62 </it>genes</b>. Amino acid sequences of the gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. An alternatively spliced region in the second C2 domain is boxed.</p>
               </text>
               <graphic file="1471-2164-8-259-7"/>
            </fig>
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Four <it>Danio rerio FAM62 </it>genes</p>
               </caption>
               <text>
                  <p><b>Four <it>Danio rerio FAM62 </it>genes</b>. Amino acid sequences of the gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. Repeated modules are underlined.</p>
               </text>
               <graphic file="1471-2164-8-259-8"/>
            </fig>
            <p>Figure <figr fid="F9">9</figr> shows a multiple alignment of <it>FAM62 </it>gene products from a range of metazoans. The genomes of <it>Drosophila melanogaster</it>, <it>Anopheles gambiae</it>, <it>Apis mellifera</it>, <it>Strongylocentrotus purpuratus</it>, <it>Ciona intestinalis </it>and <it>Caenorhabditis elegans </it>each appear to encode one <it>FAM62</it>. The genome of <it>Tribolium castaneum </it>has an unusual and compact <it>FAM62 </it>locus. It is approximately 12 kilobases long and contains three closely spaced <it>FAM62 </it>copies in tandem. Only the first copy retains the intron pattern common to other <it>FAM62 </it>genes and is shown in figure <figr fid="F9">9</figr>. The other two copies have diverged from the first and from each other, both in terms of amino acid sequence and intron position (see figure <figr fid="F10">10</figr> and further details in additional file <supplr sid="S3">3</supplr>). A duplicated, alternative exon in the region of the first C2 domain of the insect <it>FAM62 </it>genes is shown boxed in figures <figr fid="F9">9</figr> and <figr fid="F10">10</figr>. This is reminiscent of alternative readings of the C2B region of certain <it>Syt1 </it>genes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. In the <it>Syt1 </it>cases, the insect <it>Syt1 </it>genes employ RNA editing to alter this region, while <it>Caenorhabditis elegans </it>and <it>Aplysia californica </it>encode duplicate alternative exons like the insect <it>FAM62 </it>genes here. The duplicated alternative exons are absent from the second and third <it>FAM62 </it>copies in <it>Tribolium castaneum </it>(figure <figr fid="F10">10</figr>). Vertebrate genomes encode at least three <it>FAM62 </it>genes. The <it>FAM62B </it>genes of vertebrates appear to be most similar to the single <it>FAM62 </it>genes of other organisms and are therefore included in figure <figr fid="F9">9</figr>. 
Alternatively spliced regions of <it>Homo sapiens </it>and <it>Mus musculus FAM62B </it>are shown boxed as well as some alternatively spliced regions of <it>Ciona intestinalis FAM62</it>.</p>
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>Metazoan <it>FAM62 </it>genes</p>
               </caption>
               <text>
                  <p><b>Metazoan <it>FAM62 </it>genes</b>. Amino acid sequences of the gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. Alternatively expressed regions are boxed.</p>
               </text>
               <graphic file="1471-2164-8-259-9"/>
            </fig>
            <fig id="F10">
               <title>
                  <p>Figure 10</p>
               </title>
               <caption>
                  <p>Three <it>Tribolium castaneum FAM62 </it>genes</p>
               </caption>
               <text>
                  <p><b>Three <it>Tribolium castaneum FAM62 </it>genes</b>. Amino acid sequences of the gene products are aligned. The N-terminal TM region is boxed. Intron positions and phases are marked. C2 domains are indicated. Alternatively expressed regions are boxed.</p>
               </text>
               <graphic file="1471-2164-8-259-10"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Analysis of the structure of Syt genes</p>
            </st>
            <p>Collection and analysis of the plant <it>NTMC2 </it>genes and animal <it>FAM62 </it>genes revealed intron patterns which are highly conserved within the different groups, implying a long evolutionary history for the whole length of each gene. I have previously looked at the intron patterns of <it>Syt </it>genes and found strong conservation of particular intron positions <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. To make clear the differences between the plant and animal N-terminal-TM-C2 domain genes and <it>Syt </it>genes, which are also N-terminal-TM-C2 domain genes, I analyzed the intron positions within the coding regions of <it>Syt </it>genes from a wide range of metazoans. Details of <it>Syt </it>genes shown here but not previously reported <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> are in additional file <supplr sid="S4">4</supplr>.</p>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>New <it>Syt </it>sequences, Rabphilin and Doc2 sequences.</p>
               </text>
               <file name="1471-2164-8-259-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Figure <figr fid="F11">11</figr> shows an overview of the intron patterns in <it>Syt </it>genes. Intron positions and their phases are shown relative to TM, C2A and C2B domains. The conserved introns between the C2A and C2B domains stand out clearly. I have included <it>Syt17 </it>(also known as B/K <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>) homologues here. Although <it>Syt17 </it>homologues lack the N-terminal TM domain and were therefore excluded from my previous analysis <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, their intron structure is indeed characteristic of <it>Syt </it>genes and different from other <it>Syt</it>-like genes, such as those encoding Doc2 and Rabphilin proteins (figure <figr fid="F12">12</figr>, details in additional file <supplr sid="S4">4</supplr>). The HUGO Gene Nomenclature Committee <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> have agreed to name the <it>Homo sapiens </it>gene locus <it>SYT17 </it>so I follow this nomenclature here. The finding of a <it>Syt9 </it>homologue in <it>Strongylocentrotus purpuratus </it>expands beyond vertebrates a group of <it>Syt </it>genes <it>(Syt3, Syt6, Syt9 </it>and <it>Syt10) </it>previously seen only in vertebrates. I have identified additional <it>Syt </it>genes in genomes examined previously. The <it>Ciona intestinalis Syt</it>&#945; (following the nomenclature of <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>) is a previously unidentified member of a group present in <it>Caenorhabditis elegans</it>, <it>Drosophila melanogaster</it>, <it>Anopheles gambiae </it>and <it>Ciona intestinalis </it>but not present in <it>Strongylocentrotus purpuratus</it>, <it>Danio rerio </it>or <it>Homo sapiens</it>. The <it>Danio rerio </it>genome sequence is still being completed and has yielded substantially more information since my last analysis <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
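            <p>For readers unfamiliar with intron phase, the following sketch shows how it is conventionally computed: a phase 0 intron lies between two codons, a phase 1 intron after the first nucleotide of a codon, and a phase 2 intron after the second. The exon lengths are invented for illustration and the function names are hypothetical; the colour mapping follows the legend of figure 11.</p>
            <p>
```python
# Colours used for each phase in the legend of figure 11.
PHASE_COLOUR = {0: "black", 1: "red", 2: "blue"}


def intron_phase(coding_nt_upstream: int) -> int:
    """Phase of an intron, given the number of coding nucleotides before it."""
    if 0 > coding_nt_upstream:
        raise ValueError("nucleotide count cannot be negative")
    return coding_nt_upstream % 3


def phases_from_exon_lengths(exon_lengths: list[int]) -> list[int]:
    """Phase of each intron, given the coding lengths of consecutive exons."""
    phases = []
    upstream = 0
    for length in exon_lengths[:-1]:  # no intron follows the last exon
        upstream += length
        phases.append(intron_phase(upstream))
    return phases


# Invented coding-exon lengths (in nucleotides) for a toy gene.
toy_exons = [120, 46, 89, 60]
toy_phases = phases_from_exon_lengths(toy_exons)
print(toy_phases)
print([PHASE_COLOUR[p] for p in toy_phases])
```
            </p>
            <p>Because phase depends only on the cumulative coding length modulo three, an intron conserved in both position and phase across distant species is a more specific marker of shared gene structure than position alone.</p>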
            <fig id="F11">
               <title>
                  <p>Figure 11</p>
               </title>
               <caption>
                  <p>Intron pattern in <it>Syt </it>coding regions</p>
               </caption>
               <text>
                  <p><b>Intron pattern in <it>Syt </it>coding regions</b>. This figure shows an overview of the structures of <it>Syt </it>genes. Intron positions relative to TM, C2A and C2B domains, and their phases are indicated. Phase 0 introns are indicated by black dots. Phase 1 introns are indicated by red dots. Phase 2 introns are indicated by blue dots. Question marks indicate unknown regions where the genomic sequence is incomplete. The positions of additional alternative exons [11,4] are indicated by asterisks. Groups of likely orthologues are indicated in shades of blue. Groups of likely paralogues are indicated in shades of red.</p>
               </text>
               <graphic file="1471-2164-8-259-11"/>
            </fig>
            <fig id="F12">
               <title>
                  <p>Figure 12</p>
               </title>
               <caption>
                  <p>Rabphilin and Doc2 genes</p>
               </caption>
               <text>
                  <p><b>Rabphilin and Doc2 genes</b>. Amino acid sequences of the gene products are aligned. Intron positions and phases are marked. The Rabphilin effector domain is boxed and C2 domains are indicated.</p>
               </text>
               <graphic file="1471-2164-8-259-12"/>
            </fig>
            <p>In figure <figr fid="F11">11</figr>, I have arranged the <it>Syt </it>genes into groups of likely orthologues and paralogues. Genes from different species that are more similar to each other than to other genes from the same species can be classed as orthologues and, thus defined, are taken to be related by vertical descent from a common ancestor <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. The functional implication of such a relationship is that orthologues may fulfil similar, perhaps equivalent, roles in different species. As mentioned in the Background section of this paper, this may be broadly true for <it>Syt1 </it>genes, which appear to be present in all animals. The intron pattern distinctive of <it>Syt1 </it>genes is highly similar to the intron patterns of the <it>Syt2</it>, <it>Syt5 </it>and <it>Syt8 </it>genes. These genes appear only in the evolutionarily more modern vertebrate lineages, so it is likely that they have arisen via <it>Syt1 </it>duplication during the evolution of vertebrate lineages and could therefore be classed as paralogues relative to <it>Syt1</it>. The functional implication of such a relationship is that paralogues may fulfil a subset of the roles of the parent orthologue through a process of subfunctionalization, or acquire new roles through a process of neofunctionalization <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. The <it>Syt11 </it>genes appear similarly related to the <it>Syt4 </it>group and the <it>Syt14 </it>genes similarly related to the <it>Syt16 </it>group. The <it>Syt6</it>, <it>Syt10 </it>and <it>Syt3 </it>genes also appear similarly related to the <it>Syt9 </it>group. Until a more complete picture emerges from the accurate identification of complete genome complements of <it>Syt </it>genes and <it>Syt</it>-like genes from many more eukaryotic lineages, it will not be possible to classify these genes more accurately as orthologues and paralogues.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>I have examined groups of genes in plants and animals which encode N-terminal TMs followed by a linker and one or more C2 domains. The <it>NTMC2 </it>genes and the <it>FAM62 </it>genes share sequence similarity in the linker region between the N-terminus and the first C2 domain. This region has recently been identified as a conserved domain of unknown function named SMP <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The <it>NTMC2 </it>genes have one or two C2 domains and the <it>FAM62 </it>genes have three or more C2 domains. The plant genes and the animal genes each have modular gene structures with conserved intron positions. Figure <figr fid="F13">13</figr> shows a summary of the structures of the <it>FAM62 </it>genes and the <it>NTMC2 </it>genes.</p>
         <fig id="F13">
            <title>
               <p>Figure 13</p>
            </title>
            <caption>
               <p>Structures of 6 groups of plant genes and the similar FAM62 genes of metazoans</p>
            </caption>
            <text>
               <p><b>Structures of 6 groups of plant genes and the similar FAM62 genes of metazoans</b>. TM regions are represented by red boxes and C2 domains by blue boxes. Intron positions and phases are indicated. Those within square brackets are not always present.</p>
            </text>
            <graphic file="1471-2164-8-259-13"/>
         </fig>
         <p><it>FAM62</it>-like genes are identifiable in yeasts and fungi, but their more divergent sequences and general lack of introns set them apart from the group of metazoan <it>FAM62 </it>genes and I have not analysed them here. I have identified similar genes in other non-metazoans, such as <it>Trypanosoma brucei</it>, <it>Ostreococcus tauri </it>and <it>Cyanidioschyzon merolae</it>, but these too are quite divergent and lack introns (details in additional file <supplr sid="S5">5</supplr>). All of the full-length nucleotide sequences in this paper are listed in additional file <supplr sid="S6">6</supplr>. All of the full-length amino acid sequences in this paper are listed in additional file <supplr sid="S7">7</supplr>.</p>
         <suppl id="S5">
            <title>
               <p>Additional file 5</p>
            </title>
            <text>
               <p>Other non-metazoan genes.</p>
            </text>
            <file name="1471-2164-8-259-S5.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S6">
            <title>
               <p>Additional file 6</p>
            </title>
            <text>
               <p>All full-length nucleotide sequences.</p>
            </text>
            <file name="1471-2164-8-259-S6.txt">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S7">
            <title>
               <p>Additional file 7</p>
            </title>
            <text>
               <p>All full-length amino acid sequences.</p>
            </text>
            <file name="1471-2164-8-259-S7.txt">
               <p>Click here for file</p>
            </file>
         </suppl>
         <p>The <it>NTMC2Type1</it>, <it>NTMC2Type2 </it>and <it>NTMC2Type3 </it>genes are <it>Syt</it>-like, in that they have an N-terminal TM and two separately conserved C2 domains. Their conserved intron patterns distinguish them from <it>Syt </it>genes, which have been found only in metazoans and have their own distinctive intron patterns. The <it>NTMC2Type1</it>, <it>NTMC2Type2 </it>and <it>NTMC2Type4 </it>genes are highly similar up to the first C2 domain, indicating a possible gene fusion or fission.</p>
         <p>A gene fission event is apparent in the genes encoding Doc2 and Rabphilin proteins (figure <figr fid="F12">12</figr>, details in additional file <supplr sid="S4">4</supplr>). Rabphilin and Doc2 are related proteins, each with two tandem C-terminal C2 domains which share amino acid sequence similarity with Syt C2 domains. They have partly shared gene structures. The genes encoding the Doc2 proteins comprise the C-terminal half of the genes encoding Rabphilin and thus lack the N-terminal Rabphilin effector domain. Whereas genes encoding Rabphilin are widely distributed among metazoans, genes encoding Doc2 appear to have arisen in the vertebrate lineage. <it>Ciona intestinalis </it>has one Rabphilin gene and no Doc2 genes. <it>Mus musculus </it>has one Rabphilin gene and three Doc2 genes. Figure <figr fid="F12">12</figr> illustrates these sequences and their common gene structure. The conserved intron positions help to clarify the relationship between the Doc2 genes and the Rabphilin genes. The intron patterns within the C2 domain regions of these genes appear dissimilar to those of any of the other groups of C2 domains analysed here, further demonstrating that genes which share similarity at the amino acid level can be divided into genuinely homologous families on the basis of their gene structures.</p>
         <p>The difficulty of applying a consistent and meaningful gene nomenclature is highlighted by this work. In the past, gene naming was usually the result of slow and painstaking research. Genes were given names indicating a phenotype or functional aspect of an expressed product. Now in the genome era, vast numbers of genes are appearing at great speed. To make sense of all this new information, evolutionary genomics <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> aims to dissect the complex relationships between genes in different life forms over evolutionary time scales, thereby improving genome annotation. Genes can express multiple functional products and be regulated differently in different contexts. This means that it cannot be straightforward to predict the functional consequences of variations at particular genomic loci, in different species or even different individuals. Functional annotation of genomes is therefore not a straightforward task.</p>
         <p>There is already confusion with <it>Syt </it>nomenclature (see, for example, <it>SYT5, Syt5</it>, <it>SYT9 </it>and <it>Syt9 </it>in the Gene and Pubmed databases at NCBI). Equivalent genomic loci in different species can be given different names through separate genome annotation pipelines, and the community of researchers engaged in functional studies of the gene products continues to supply yet more names relating to the particular functions they have studied (for example, see <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>). In this paper I have named the <it>Syt</it>&#945; genes, which lack human homologues, in line with <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. I have named those with human homologues according to the HUGO gene nomenclature committee approved human gene names <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Three <it>Syt </it>genes in <it>Caenorhabditis elegans </it>remain unclassified at present and I have simply numbered them (1) to (3) for now. The Wormbase <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> nomenclature for <it>Caenorhabditis elegans Syt </it>genes, <it>snt-1 </it>to <it>snt-6</it>, does not yet take account of their evolutionary relationships (apart from <it>snt-1 </it>being numbered consistently with its relationship to other <it>Syt1 </it>genes). Flybase <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> <it>Syt </it>gene names are currently restricted to three of the seven <it>Syt </it>genes in <it>Drosophila melanogaster</it>: <it>Syt1, 4 </it>and <it>7 </it>(yet see <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, where four <it>Syt </it>genes were identified in <it>Drosophila melanogaster</it>, but only two of these match Flybase <it>Syt </it>genes, likely due to inaccuracies in the source databases used). While the <it>Homo sapiens </it>and <it>Mus musculus </it>genes encoding Rabphilin have now been named <it>RPH3A </it>and <it>Rph3a</it>, respectively, the genes encoding Doc2 proteins have not yet acquired genome nomenclature committee approved names. I named the <it>FAM62 </it>genes in this paper according to the HUGO gene nomenclature committee approved names, but these names have no functional meaning. I suggest a nomenclature for the plant genes which describes their domain composition. This may have some functional relevance.</p>
         <p>For the future annotation of genomes with homologues of the genes discussed here, it would be useful to incorporate these gene predictions into the sequence databases such that they are obviously visible and appropriately connected. This should be possible via the recently introduced Third Party Annotation (TPA) facility at the NCBI and EMBL nucleotide sequence databases. Genome annotation needs to be updated continuously and the information from separate genome projects integrated. A possible wiki solution to the problem of updating genome annotation has recently been proposed <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>A comparative genomics analysis of genes with N-terminal-TM-C2 domain architectures helps us to understand how these genes have evolved. Although it is not possible to draw firm conclusions about the total gene complement of organisms from incomplete genome sequences, such information is needed for sound inferences about the origin and diversification of gene families. The examination of a wide variety of fragmentary sequences does, however, provide much information that is useful for understanding the evolution of both genes and their functional products. Large-scale, structure-based comparisons of protein sequences inform functional perspectives on the evolution of protein repertoires, e.g. [<abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp> and references therein]. A structural analysis of eukaryotic C2 domain proteins <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> has considered the evolution of this particular domain. For more gene-oriented perspectives, see e.g. <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp> and for a consideration of non-coding sequence evolution, see e.g. <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>.</p>
         <p>The collection of genes used here includes evolutionarily widely dispersed genes with distinctive intron-exon patterns. It includes several gene families with long evolutionary histories. The origins of these gene families are not yet clear but appear to be several. Genome sequences from more lineages of simple, deep-branching eukaryotes may, in future, reveal the earlier histories of these gene families. The collection demonstrates different modes of gene evolution: the C2 domain duplication of <it>FAM62A </it>genes, the whole gene duplication of the <it>Tribolium castaneum FAM62 </it>genes and <it>Mus musculus </it>Doc2 genes, the alternative exons of the C2-1 domain encoded by insect <it>FAM62 </it>genes, the gene fusion/fission of <it>NTMC2Type2</it>/<it>NTMC2Type4 </it>and Rabphilin/Doc2 genes, and the expansion and diversification of the <it>Syt </it>gene family. Intron gains and losses are also demonstrated. Intron movements in the duplicated <it>Tribolium castaneum FAM62 </it>genes and intron movement with functional consequences in the <it>NTMC2Type2 </it>genes are interesting examples. The mechanisms of intron gain and loss and the causes of intron evolution are matters of considerable debate <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B43">43</abbr></abbrgrp>. This gene collection provides some useful information for this area of investigation.</p>
         <p>Different gene products in this collection share a domain architecture that implies membrane proteins tethered by TM domains which, via their C2 domains, interact with lipids, other membranes and other proteins, sometimes in a calcium-regulated manner. Functional studies on many of these genes have yet to be undertaken. It remains to be seen exactly what levels of functional equivalence exist even between different members of the same gene family, for example, the <it>Syt </it>gene family. An empirical approach to investigating the functions of plant <it>NTMC2 </it>genes and animal <it>FAM62 </it>genes would therefore seem wiser than attempting to make functional predictions based on their shared structural domains, which are not homologous. Improved understanding of the evolutionary relationships among these genes will help to guide and interpret future functional studies as well as informing the effort to annotate genome sequences. I hope that innovations in gene and genome annotation will in future allow the easy integration of new results from functional studies and that new functional studies can likewise be informed by evolutionary considerations based on good annotation. Complex eukaryotic genes are difficult to predict accurately from genome sequences and need to be verified by comparison with transcript sequences. This is especially important when subtle gene regulation by alternative splicing and RNA editing is involved. Ideally, in time, it will be possible to integrate all sources of data into a comprehensible resource.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Cloning and sequencing of <it>Physcomitrella patens </it>genes</p>
            </st>
            <p><it>Physcomitrella patens </it>genomic DNA was a gift from Didier Schaefer. I used this as a template for PCR reactions. I amplified genomic regions using Pfu turbo polymerase with phosphorylated primers and cloned the products into Sma digested pBSIIKS-. After sequencing, overlapping clones were selected and digested with restriction enzymes in such a way as to ligate the genomic locus into one piece. The sequence of each genomic clone was deposited in the public sequence databases [EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410046">AM410046</ext-link>, EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410049">AM410049</ext-link>, EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410050">AM410050</ext-link>]. cDNA clones, also gifts from Didier Schaefer, were obtained from the M. Hasebe collection <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> at PHYSCObase <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> and sequenced completely. These sequences were deposited in the public sequence databases [EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410045">AM410045</ext-link>, EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410047">AM410047</ext-link>, EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410048">AM410048</ext-link>].</p>
         </sec>
         <sec>
            <st>
               <p>Confirmation of RNA editing of <it>Arabidopsis thaliana NTMC2Type2.2</it></p>
            </st>
            <p>A full-length cDNA clone of <it>Arabidopsis thaliana NTMC2Type2.2 </it>was a gift from Boris Voigt. I confirmed the coding sequence and deposited this in the public sequence databases [EMBL:<ext-link ext-link-type="embl" ext-link-id="AM410051">AM410051</ext-link>].</p>
         </sec>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>I wish to thank Didier Schaefer and Boris Voigt for their gifts of plant DNAs.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Phospholipid binding by a synaptic vesicle protein homologous to the regulatory region of protein kinase C</p>
            </title>
            <aug>
               <au>
                  <snm>Perin</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Fried</snm>
                  <fnm>VA</fnm>
               </au>
               <au>
                  <snm>Mignery</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>S&#252;dhof</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1990</pubdate>
            <volume>345</volume>
            <fpage>260</fpage>
            <lpage>263</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2333096</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Synaptotagmin I: a major Ca<sup>2+ </sup>sensor for transmitter release at a central synapse</p>
            </title>
            <aug>
               <au>
                  <snm>Geppert</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Goda</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hammer</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Rosahl</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>CF</fnm>
               </au>
               <au>
                  <snm>S&#252;dhof</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1994</pubdate>
            <volume>79</volume>
            <fpage>717</fpage>
            <lpage>727</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7954835</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Genomic analysis of synaptotagmin genes</p>
            </title>
            <aug>
               <au>
                  <snm>Craxton</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2001</pubdate>
            <volume>77</volume>
            <fpage>43</fpage>
            <lpage>49</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11543631</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Synaptotagmin gene content of the sequenced genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Craxton</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>43</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">471550</pubid>
                  <pubid idtype="pmpid" link="fulltext">15238157</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Structural and functional conservation of synaptotagmin (p65) in Drosophila and humans</p>
            </title>
            <aug>
               <au>
                  <snm>Perin</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Johnston</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Ozcelik</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Franke</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>S&#252;dhof</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1991</pubdate>
            <volume>266</volume>
            <fpage>615</fpage>
            <lpage>622</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1840599</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Synaptic transmission persists in synaptotagmin mutants of <it>Drosophila</it></p>
            </title>
            <aug>
               <au>
                  <snm>DiAntonio</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Parfitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Schwarz</snm>
                  <fnm>TL</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1993</pubdate>
            <volume>73</volume>
            <fpage>1281</fpage>
            <lpage>1290</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8100740</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Expression of synaptotagmin in <it>Drosophila </it>reveals transport and localization of synaptic vesicles to the synapse</p>
            </title>
            <aug>
               <au>
                  <snm>Littleton</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Bellen</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Perin</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>1993</pubdate>
            <volume>118</volume>
            <fpage>1077</fpage>
            <lpage>1088</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8269841</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Synaptic function is impaired but not eliminated in <it>C. elegans </it>mutants lacking synaptotagmin</p>
            </title>
            <aug>
               <au>
                  <snm>Nonet</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Grundahl</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Rand</snm>
                  <fnm>JB</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1993</pubdate>
            <volume>73</volume>
            <fpage>1291</fpage>
            <lpage>1305</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8391930</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Synaptotagmins are trafficked to distinct subcellular domains including the postsynaptic compartment</p>
            </title>
            <aug>
               <au>
                  <snm>Adolfsen</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Saraswati</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yoshihara</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Littleton</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>J Cell Biol</source>
            <pubdate>2004</pubdate>
            <volume>166</volume>
            <fpage>249</fpage>
            <lpage>260</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15263020</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Role of synaptotagmin in Ca<sup>2+ </sup>-triggered exocytosis</p>
            </title>
            <aug>
               <au>
                  <snm>Tucker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>ER</fnm>
               </au>
            </aug>
            <source>Biochem J</source>
            <pubdate>2002</pubdate>
            <volume>366</volume>
            <fpage>1</fpage>
            <lpage>13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1222778</pubid>
                  <pubid idtype="pmpid" link="fulltext">12047220</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Alternative splicing of synaptotagmins involving transmembrane exon skipping</p>
            </title>
            <aug>
               <au>
                  <snm>Craxton</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Goedert</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1999</pubdate>
            <volume>460</volume>
            <fpage>417</fpage>
            <lpage>422</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10556508</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Synaptotagmin I and 1B4 are identical: implications for Synaptotagmin distribution in the primate brain</p>
            </title>
            <aug>
               <au>
                  <snm>Chowdhury</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Travis</snm>
                  <fnm>GH</fnm>
               </au>
               <au>
                  <snm>Sutcliffe</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Burton</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Neurosci Lett</source>
            <pubdate>1995</pubdate>
            <volume>190</volume>
            <fpage>9</fpage>
            <lpage>12</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7624059</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Regulation of Synaptotagmin Gene Expression during Ascidian Embryogenesis</p>
            </title>
            <aug>
               <au>
                  <snm>Katsuyama</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Matsumoto</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Okada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ohtsuka</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Okado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Okamura</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Dev Biol</source>
            <pubdate>2002</pubdate>
            <volume>244</volume>
            <fpage>293</fpage>
            <lpage>304</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11944938</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Molecular cloning, expression, and characterization of a novel class of synaptotagmin (SytXIV) conserved from <it>Drosophila </it>to humans</p>
            </title>
            <aug>
               <au>
                  <snm>Fukuda</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Biochem</source>
            <pubdate>2003</pubdate>
            <volume>133</volume>
            <fpage>641</fpage>
            <lpage>649</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12801916</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>E-Syts, a family of membranous Ca<sup>2+</sup>-sensor proteins with multiple C2 domains</p>
            </title>
            <aug>
               <au>
                  <snm>Min</snm>
                  <fnm>S-W</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>W-P</fnm>
               </au>
               <au>
                  <snm>S&#252;dhof</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2007</pubdate>
            <volume>104</volume>
            <fpage>3823</fpage>
            <lpage>3828</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17360437</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Polar transport of auxin: carrier-mediated flux across the plasma membrane or neurotransmitter-like secretion?</p>
            </title>
            <aug>
               <au>
                  <snm>Balu&#353;ka</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>&#352;amaj</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Menzel</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Trends Cell Biol</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>282</fpage>
            <lpage>285</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12791291</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Plant synapses: actin-based domains for cell-to-cell communication</p>
            </title>
            <aug>
               <au>
                  <snm>Balu&#353;ka</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Volkmann</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Menzel</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Trends Plant Sci</source>
            <pubdate>2005</pubdate>
            <volume>10</volume>
            <fpage>106</fpage>
            <lpage>111</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15749467</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Plant neurobiology: an integrated view of plant signaling</p>
            </title>
            <aug>
               <au>
                  <snm>Brenner</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Stahlberg</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Mancuso</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vivanco</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Balu&#353;ka</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Van Volkenburgh</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Trends Plant Sci</source>
            <pubdate>2006</pubdate>
            <volume>11</volume>
            <fpage>413</fpage>
            <lpage>418</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16843034</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>National Center for Biotechnology Information</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/BLAST</url>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Managing sequence projects in the GAP4 environment</p>
            </title>
            <aug>
               <au>
                  <snm>Staden</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Judge</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Bonfield</snm>
                  <fnm>JK</fnm>
               </au>
            </aug>
            <source>Introduction to Bioinformatics</source>
            <publisher>Humana Press</publisher>
            <editor>Krawetz SA, Womble DD</editor>
            <pubdate>2003</pubdate>
            <fpage>393</fpage>
            <lpage>410</lpage>
         </bibl>
         <bibl id="B21">
            <title>
               <p>DOE Joint Genome Institute</p>
            </title>
            <url>http://www.jgi.doe.gov/sequencing/why/CSP2005/physcomitrella.html</url>
         </bibl>
         <bibl id="B22">
            <title>
               <p>NCBI Trace Archive</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/Traces</url>
         </bibl>
         <bibl id="B23">
            <title>
               <p>PHYSCObase</p>
            </title>
            <url>http://moss.nibb.ac.jp/</url>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Multiple sequence alignment with hierarchical clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Corpet</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1988</pubdate>
            <volume>16</volume>
            <fpage>10881</fpage>
            <lpage>10890</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">338945</pubid>
                  <pubid idtype="pmpid" link="fulltext">2849754</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>HUGO Gene Nomenclature Committee</p>
            </title>
            <url>http://www.genenames.org</url>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Molecular determinants and guided evolution of species-specific RNA editing</p>
            </title>
            <aug>
               <au>
                  <snm>Reenan</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>434</volume>
            <fpage>409</fpage>
            <lpage>413</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15772668</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Identification and characterization of a novel C2B splice variant of synaptotagmin I</p>
            </title>
            <aug>
               <au>
                  <snm>Nakhost</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Houeland</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Blandford</snm>
                  <fnm>VE</fnm>
               </au>
               <au>
                  <snm>Castellucci</snm>
                  <fnm>VF</fnm>
               </au>
               <au>
                  <snm>Sossin</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>J Neurochem</source>
            <pubdate>2004</pubdate>
            <volume>89</volume>
            <fpage>354</fpage>
            <lpage>363</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15056279</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Identification of a novel protein containing two C2 domains selectively expressed in the rat brain and kidney</p>
            </title>
            <aug>
               <au>
                  <snm>Kwon</snm>
                  <fnm>OJ</fnm>
               </au>
               <au>
                  <snm>Gainer</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wray</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chin</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1996</pubdate>
            <volume>378</volume>
            <fpage>135</fpage>
            <lpage>139</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8549819</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Orthologs, Paralogs, and Evolutionary Genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>2005</pubdate>
            <volume>39</volume>
            <fpage>309</fpage>
            <lpage>338</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16285863</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Diverse membrane-associated proteins contain a novel SMP domain</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Hong</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>FASEB J</source>
            <pubdate>2006</pubdate>
            <volume>20</volume>
            <fpage>202</fpage>
            <lpage>206</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16449791</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>WormBase</p>
            </title>
            <url>http://www.wormbase.org</url>
         </bibl>
         <bibl id="B32">
            <title>
               <p>FlyBase</p>
            </title>
            <url>http://flybase.bio.indiana.edu/</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Patterns of sequence conservation in presynaptic neural genes</p>
            </title>
            <aug>
               <au>
                  <snm>Hadley</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Valladares</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Hannenhalli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ungar</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bu&#263;an</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>R105</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1794582</pubid>
                  <pubid idtype="pmpid" link="fulltext">17096848</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Genome re-annotation: a wiki solution</p>
            </title>
            <aug>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>102</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1839116</pubid>
                  <pubid idtype="pmpid" link="fulltext">17274839</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>The structure of the protein universe and genome evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Karev</snm>
                  <fnm>GP</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>218</fpage>
            <lpage>223</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12432406</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Modules, multidomain proteins and organismic complexity</p>
            </title>
            <aug>
               <au>
                  <snm>Tordai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nagy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Farkas</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>B&#225;nyai</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Patthy</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>FEBS J</source>
            <pubdate>2005</pubdate>
            <volume>272</volume>
            <fpage>5064</fpage>
            <lpage>5078</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16176277</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Comparative genomics and structural biology of the molecular innovations of eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>409</fpage>
            <lpage>419</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">16679012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Functional Recycling of C2 domains Throughout Evolution: A Comparative Study of Synaptotagmin, Protein Kinase C and Phospholipase C by Sequence, Structural and Modelling Approaches</p>
            </title>
            <aug>
               <au>
                  <snm>Jim&#233;nez</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Contreras-Moreira</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Sgouros</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Meunier</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Bates</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Schiavo</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>333</volume>
            <fpage>621</fpage>
            <lpage>639</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14556749</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The evolution of spliceosomal introns: patterns, puzzles and progress</p>
            </title>
            <aug>
               <au>
                  <snm>Roy</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>211</fpage>
            <lpage>221</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16485020</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>The ancient Virus World and evolution of cells</p>
            </title>
            <aug>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Senkevich</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Dolja</snm>
                  <fnm>VV</fnm>
               </au>
            </aug>
            <source>Biol Direct</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <fpage>29</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1594570</pubid>
                  <pubid idtype="pmpid" link="fulltext">16984643</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>The evolutionary significance of cis-regulatory mutations</p>
            </title>
            <aug>
               <au>
                  <snm>Wray</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>206</fpage>
            <lpage>216</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17304246</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>A new paradigm for developmental biology</p>
            </title>
            <aug>
               <au>
                  <snm>Mattick</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>J Exp Biol</source>
            <pubdate>2007</pubdate>
            <volume>210</volume>
            <fpage>1526</fpage>
            <lpage>1547</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17449818</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Origins and Evolution of Spliceosomal Introns</p>
            </title>
            <aug>
               <au>
                  <snm>Rodr&#237;guez-Trelles</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Tarr&#237;o</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ayala</snm>
                  <fnm>FJ</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>2006</pubdate>
            <volume>40</volume>
            <fpage>47</fpage>
            <lpage>76</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17094737</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Comparative genomics of <it>Physcomitrella patens </it>gametophytic transcriptome and <it>Arabidopsis thaliana</it>: Implications for land plant evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Nishiyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fujita</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Shin-I</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Seki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nishide</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Uchiyama</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kamiya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Shinozaki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kohara</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hasebe</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>8007</fpage>
            <lpage>8012</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">164703</pubid>
                  <pubid idtype="pmpid" link="fulltext">12808149</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
