<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>1471-2156-6-26</ui>
	<ji>1471-2156</ji>
	<fm>
		<dochead>Research article</dochead>
		<bibl>
			<title>
				<p>Conserved genomic organisation of Group B Sox genes in insects.</p>
			</title>
			<aug>
				<au id="A1">
					<snm>McKimmie</snm>
					<fnm>Carol</fnm>
					<insr iid="I1"/>
					<email>cm10019@mole.bio.cam.ac.uk</email>
				</au>
				<au id="A2">
					<snm>Woerfel</snm>
					<fnm>Gertrud</fnm>
					<insr iid="I1"/>
					<email>gw236@mole.bio.cam.ac.uk</email>
				</au>
				<au id="A3" ca="yes">
					<snm>Russell</snm>
					<fnm>Steven</fnm>
					<insr iid="I1"/>
					<email>s.russell@gen.cam.ac.uk</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK</p>
				</ins>
			</insg>
			<source>BMC Genetics</source>
			<issn>1471-2156</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>1</issue>
			<fpage>26</fpage>
			<url>http://www.biomedcentral.com/1471-2156/6/26</url>
			<xrefbib>
				<pubidlist>
					<pubid idtype="pmpid">15943880</pubid>
					<pubid idtype="doi">10.1186/1471-2156-6-26</pubid>
				</pubidlist>
			</xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>27</day>
					<month>1</month>
					<year>2005</year>
				</date>
			</rec>
			<acc>
				<date>
					<day>19</day>
					<month>5</month>
					<year>2005</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>19</day>
					<month>5</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>McKimmie et al; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p><it>Sox </it>domain containing genes are important metazoan transcriptional regulators implicated in a wide range of developmental processes. The vertebrate B subgroup contains the <it>Sox1</it>, <it>Sox2 and Sox3 </it>genes that have early functions in neural development. Previous studies show that <it>Drosophila </it>Group B genes have been functionally conserved since they play essential roles in early neural specification and mutations in the <it>Drosophila Dichaete </it>and <it>SoxN </it>genes can be rescued with mammalian <it>Sox </it>genes. Despite their importance, the extent and organisation of the Group B family in <it>Drosophila </it>has not been fully characterised, an important step in using <it>Drosophila </it>to examine conserved aspects of Group B <it>Sox </it>gene function.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>We have used directed cDNA sequencing, along with the output from publicly available genome sequencing projects, to examine the structure of Group B <it>Sox </it>domain genes in <it>Drosophila melanogaster</it>, <it>Drosophila pseudoobscura, Anopheles gambiae </it>and <it>Apis mellifera</it>. All of the insect genomes contain four genes encoding Group B proteins, two of which are intronless, as is the case with vertebrate group B genes. As has been previously reported and unusually for Group B genes, two of the insect group B genes, <it>Sox21a </it>and <it>Sox21b</it>, contain introns within their DNA-binding domains. We find that the highly unusual multi-exon structure of the <it>Sox21b </it>gene is common to the insects. In addition, we find that three of the group B <it>Sox </it>genes are organised in a linked cluster in the insect genomes. By <it>in situ </it>hybridisation we show that the pattern of expression of each of the four group B genes during embryogenesis is conserved between <it>D. melanogaster </it>and <it>D. pseudoobscura</it>.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>The DNA-binding domain sequences and genomic organisation of the group B genes have been conserved over 300 My of evolution since the last common ancestor of the Hymenoptera and the Diptera. Our analysis suggests insects have two Group B1 genes, <it>SoxN </it>and <it>Dichaete</it>, and two Group B2 genes. The genomic organisation of <it>Dichaete </it>and another two Group B genes in a cluster suggests they may be under concerted regulatory control. Our analysis suggests a simple model for the evolution of group B Sox genes in insects that differs from the proposed evolution of vertebrate Group B genes.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>The family of Sox-domain containing proteins encompasses a group of metazoan transcriptional regulators first identified by their similarity with the mammalian testis-determining factor SRY. Membership of the Sox family is conferred by the presence of an HMG1-type DNA-binding domain sharing greater than 60% amino-acid sequence identity to that of SRY <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Mammalian genome sequencing projects indicate that in humans and mice there are twenty <it>Sox </it>genes <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, divided into eight subgroups (A-H) on the basis of sequence identity within and outwith the HMG-domain. Aside from mammals, <it>Sox </it>genes have been identified in all metazoans examined to date, including birds, fish, amphibians, basal chordates, insects and nematodes <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
			<p>The B subgroup is of particular interest since members of this group are most closely related to SRY and appear to be functionally conserved during evolution. Sequence analysis and functional studies suggest that, in vertebrates, the five members of the B subgroup can be subdivided into two further groups: B1 (<it>Sox1</it>, <it>Sox2 </it>and <it>Sox3</it>) <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> and B2 (<it>Sox14 </it>and <it>Sox21</it>) <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. It has been suggested from studies in the chick that the three group B1 proteins act as gene activators whereas the B2 proteins act as gene repressors <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. In terms of genomic organization, all five of the group B genes are devoid of introns. <it>Sox3 </it>is located on the mammalian <it>X </it>chromosome and is believed to be the ancestor of <it>Sry </it><abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. In humans, the remaining four autosomal group B genes are arranged in two pairs, each comprising one B1 gene and one B2 gene: <it>Sox2 </it>and <it>Sox14 </it>map together on chromosome 3 <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp> and <it>Sox1 </it>and <it>Sox21 </it>map together on chromosome 13 <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B11">11</abbr></abbrgrp>. This organization is conserved, at least in part, in other vertebrates with <it>Sox2-Sox14 </it>mapping together in the chick and the monotreme, <it>O. anatinus</it>, and <it>Sox1</it>-<it>Sox21 </it>mapping together in the chick <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. There is, however, no linkage of Group B <it>Sox </it>genes in the mouse genome <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. 
A model for the evolution of group B genes and <it>Sry </it>from a single ancestor has been proposed, in which pairs of B1 and B2 genes arose by a tandem duplication followed by a chromosomal duplication <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
			<p>The fruitfly, <it>Drosophila melanogaster</it>, has proved to be a tractable system for studying conserved aspects of eukaryotic gene function and, with the production of other insect genome sequences, a useful baseline for evolutionary studies of gene organisation <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Whole-genome sequence is now available for three insects, <it>Drosophila melanogaster, Drosophila pseudoobscura </it>(which diverged from <it>melanogaster </it>some 46 million years ago) and <it>Anopheles gambiae</it>, which diverged from <it>melanogaster </it>approximately 250 million years ago <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. Sequencing and assembly of a further ten <it>Drosophila </it>species are currently underway <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, promising an unparalleled data source for evolutionary studies. In addition to the Diptera, sequencing of the hymenopteran <it>Apis mellifera </it>(honey bee, ~280 million years from <it>Drosophila</it>) is now well underway, allowing fragments of a fourth insect genome to be assessed. In functional terms, <it>Drosophila </it>is a useful model for studying <it>Sox </it>gene function due to its genetic tractability. For example, we have previously shown that, in the case of the <it>Drosophila </it>group B gene <it>Dichaete</it>, there is functional conservation between insect and mammalian genes <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. In addition, we, and others, have demonstrated a degree of <it>in vivo </it>functional redundancy between <it>Dichaete </it>and <it>SoxN </it><abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp> as had been proposed for the mammalian group B genes <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. 
Of particular interest is the fact that the expression patterns and functional studies of group B genes suggest that they participate in the earliest events of CNS differentiation in all organisms that have been studied to date including <it>Drosophila</it>, <it>Xenopus</it>, chick, mouse, ascidians and hemichordates <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>.</p>
			<p>To further explore the relationship between group B <it>Sox </it>genes we examined the extent and organization of the family in insects. Our studies show that group B <it>Sox </it>gene organisation is similar in four different insects. We find conservation in the sequence and genome organization of the group B genes in <it>D. melanogaster</it>, <it>D. pseudoobscura</it>, <it>A. gambiae </it>and <it>A. mellifera</it>. In contrast to mammals and in agreement with a previous report <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, we find that two group B2 genes contain introns and are organized as a single genomic cluster along with the intronless <it>Dichaete </it>gene. Our studies indicate a potentially different evolutionary path for members of the group B family in insects and vertebrates.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<p>To explore the structure of the group B <it>Sox </it>genes in insects we first accurately determined the extent and structure of the family in <it>Drosophila melanogaster</it>. The group B genes, <it>Dichaete </it>and <it>SoxNeuro </it>(<it>SoxN</it>) have already been well described in the literature <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. Two other group B gene fragments have been identified <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, <it>Sox21a </it>and <it>Sox21b</it>, but their structure and genomic organisation have not been reported. Using a combination of database searching and DNA sequencing we characterised both of these genes in detail. We find no evidence for any other group B genes in Release 3.2 or Release 4 of the <it>Drosophila </it>genome sequence, indicating that there are a total of four in the <it>D. melanogaster </it>genome.</p>
			<sec>
				<st>
					<p>Sox21a</p>
				</st>
				<p>Blast searches of the <it>Drosophila </it>genome identified a group B HMG-domain interrupted by a 1655-bp intron in the 70D region of chromosome arm <it>3L</it>. Using primers designed against each of these predicted exons we amplified a fragment of 1238 bp from the LD cDNA library produced by the BDGP <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. For reasons that are as yet unclear, we have been unable to recover a clone from this or any of the several other cDNA libraries that we have screened. The fragment amplified from the library was sequenced in its entirety and found to contain a long open reading frame encoding a 389 amino acid Sox domain protein. The predicted polypeptide initiates with a methionine and probably contains the entire coding sequence for the gene. When aligned with the genome sequence we predict a gene with two exons spanning 2.8 kb. Blast searches with the predicted protein find over 90% identity with a range of group B Sox proteins in the HMG DNA-binding domain. The best scores are with the DNA-binding domains of the vertebrate Sox21 and Sox14 proteins; however, there is little significant similarity outside of the DNA-binding domain. The <it>Sox21a </it>gene has previously been reported as <it>SoxB2-3 </it>(<it>CG7345</it>) and it has been suggested that it may represent a pseudogene <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. As we show below, RT-PCR and <it>in situ </it>hybridisation studies indicate that <it>Sox21a </it>is expressed in both <it>D. melanogaster </it>and <it>D. pseudoobscura </it>indicating that it is not a pseudogene.</p>
			</sec>
			<sec>
				<st>
					<p>Sox21b</p>
				</st>
				<p>Along with <it>Sox21a</it>, Blast searches indicated a second interrupted Sox domain in the same region of the genome. In this case, database searches found a potential cDNA clone from the BDGP (GH07353), which was obtained and sequenced in its entirety. The sequence of the clone revealed a long open reading frame, initiating with a methionine, encoding a predicted polypeptide of 571 amino acids. Alignment with the genome sequence indicates that the gene spans 19 kb of genomic DNA and is composed of 7 exons, the first of which is non-coding. The DNA-binding domain contains two introns; the first, 6388 bp in size, is in the middle of the DNA-binding domain and the second, of 59 bp, is in the same position and frame as the <it>Sox21a </it>intron described above. Blast searches with the predicted amino acid sequence find greater than 90% amino acid identity with the DNA-binding domains of group B Sox proteins, the highest scores being with <it>Dichaete</it>. The sequence indicates that <it>Sox21b </it>corresponds to the <it>SoxB2-2 </it>gene fragment previously reported <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and whole mount <it>in situ </it>hybridisation with probes derived from genomic DNA and the cDNA clone confirms the pattern of expression previously reported (Figure <figr fid="F2">2</figr>). Thus, both <it>Sox21a </it>and <it>Sox21b </it>are expressed group B <it>Sox </it>genes that have their DNA binding domains interrupted by introns.</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Full alignments of the insect Group B protein sequences</p>
					</caption>
					<text>
						<p>Full alignments of the insect Group B protein sequences. <b>A) </b>SoxN. <b>B) </b>Dichaete. <b>C) </b>Sox21a, the position of the conserved intron is indicated with an arrow. <b>D) </b>Sox21b, the location of the exons in the <it>D. melanogaster </it>sequence is indicated in italics above the alignment. Black arrowheads above the alignment indicate positions of introns conserved in all four species. The grey arrowheads indicate intron positions conserved in the Diptera. The black arrow above the alignment indicates the <it>Drosophila</it>-specific intron and the grey arrows below the alignment indicate the <it>Apis</it>-specific introns.</p>
					</text>
					<graphic file="1471-2156-6-26-1"/>
				</fig>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Embryonic expression of group B genes in <it>D. melanogaster </it>and <it>D. pseudoobscura</it></p>
					</caption>
					<text>
						<p>Embryonic expression of group B genes in <it>D. melanogaster </it>and <it>D. pseudoobscura</it>. A-H, <it>D. melanogaster</it>. A'-H', <it>D. pseudoobscura</it>; anterior is to the left in all cases. <b>A-A') </b>Lateral view of stage 5 embryos showing expression of <it>Dichaete </it>in a central domain and the cephalic neuroectoderm. <b>B-B') </b>Lateral views of stage 8 embryos showing extensive <it>Dichaete </it>expression in the developing CNS. <b>C-C') </b>Ventral view of stage 5 embryos showing <it>SoxN </it>expression is restricted from the presumptive mesoderm. <b>D-D') </b>Dorsal view of stage 8 embryos showing <it>SoxN </it>expression in the CNS. <b>E-E') </b>Lateral view of stage 9 embryos showing <it>Sox21a </it>expression in the anlage of the foregut and hindgut. <b>F-F') </b>Ventral views of stage 14 embryos showing <it>Sox21a </it>expression restricted to specific cells in the midline. <b>G-G') </b>Lateral views of stage 13 embryos showing <it>Sox21b </it>expression in abdominal epidermal stripes. <b>H-H') </b>Ventral view of stage 14 embryos showing <it>Sox21b </it>expression in abdominal epidermal stripes.</p>
					</text>
					<graphic file="1471-2156-6-26-2"/>
				</fig>
				<p>To verify the gene predictions and gain some insight into their possible biological functions we determined the developmental expression of each of the four group B genes by RT-PCR using RNA templates isolated from different stages of the <it>Drosophila </it>lifecycle. In the case of <it>Sox21a </it>and <it>Sox21b</it>, we used primers to the Sox-domain encoding exons spanning a predicted intron. All RT-PCR reactions included a reverse transcriptase minus reaction and the amplified products were verified by sequencing. The results of this analysis are presented in Table <tblr tid="T1">1</tblr> and can be summarised as follows. The expression profiles of <it>Dichaete </it>and <it>SoxN </it>are very similar: both are expressed during embryonic, larval and pupal stages of development, with the level of expression reducing during the later stages of pupal development. Both genes are expressed in adult male as well as female flies, with bodies showing stronger expression than heads. <it>Sox21a </it>is expressed throughout development and in adults it is stronger in heads than in bodies. In contrast to the other group B genes, <it>Sox21b </it>has a more complex expression pattern during development. It is strongly expressed in embryos but is below detectable levels for much of larval and pupal life. After eclosion it is weakly expressed in male heads but not bodies. Thus, in common with mammalian group B genes, all four <it>D. melanogaster </it>group B <it>Sox </it>genes are expressed during embryogenesis and at other stages throughout development.</p>
				<tbl id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Representation of <it>Sox </it>expression during <it>Drosophila </it>development assayed by RT-PCR.</p>
					</caption>
					<tblbdy cols="5">
						<r>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>
									<b>
										<it>Dichaete</it>
									</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>
										<it>SoxNeuro</it>
									</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>
										<it>Sox21a</it>
									</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>
										<it>Sox21b</it>
									</b>
								</p>
							</c>
						</r>
						<r>
							<c cspan="5">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Embryo</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>1<sup>st </sup>instar</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>2<sup>nd </sup>instar</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Early 3<sup>rd </sup>instar</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Late 3<sup>rd </sup>instar</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Prepupa</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>12 h pupa</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>36 h pupa</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Heads</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Bodies</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Male</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>(+)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Female</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>+</p>
							</c>
							<c ca="center">
								<p>-</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>+ = expression, - = no expression, (+) = weak expression.</p>
					</tblfn>
				</tbl>
				<sec>
					<st>
						<p>Group B genes in other insects</p>
					</st>
					<p>Our findings show that <it>Drosophila melanogaster </it>has four group B <it>Sox </it>genes compared to the five found in vertebrates and, unlike vertebrates, two of the genes contain introns. To investigate whether this particular organization is unique to <it>D. melanogaster </it>we searched the available genome sequence of other insects to find potential <it>Sox </it>domain genes. Using the <it>Dichaete </it>DNA-binding domain as a query, we searched the <it>Drosophila pseudoobscura, Anopheles gambiae </it>and <it>Apis mellifera </it>genome and EST sequence databases using Blast-P and Blast-N (see materials and methods for EST and genome scaffold accessions). In all three cases we found evidence for four Group B genes and were able to build gene models, from the genome sequence alone or with the addition of EST data where available. The initial characterization of the insect group B genes, based on the HMG-domain sequence, suggests that there is a single orthologue of each <it>Drosophila </it>gene in the other three species.</p>
				</sec>
			</sec>
			<sec>
				<st>
					<p>SoxN</p>
				</st>
				<p>The alignment presented in Figure <figr fid="F1">1a</figr> shows the similarity between the insect SoxN proteins and mouse Sox1. As previously reported, conservation between vertebrate and invertebrate Sox proteins is mostly restricted to the DNA-binding domains <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. Between the insect proteins there are more extensive regions of homology outwith the DNA-binding domain. The <it>Drosophila </it>SoxN sequences show over 90% sequence identity over their entire length and, as expected from the phylogenies based on rDNA and protein coding sequences, the other insect sequences are more diverged <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B30">30</abbr></abbrgrp>. The <it>A. gambiae </it>sequence is overall 64% identical to the <it>melanogaster </it>sequence, with particularly well conserved regions in the N-terminal 50 amino acids and more patchy conservation C-terminal to the DNA-binding domain. The <it>A. mellifera </it>sequence is further diverged (52% identity with <it>Drosophila</it>). Conserved regions outside the DNA-binding domains among all four sequences are restricted to a C-terminal stretch of amino acids that may represent conserved functional motifs important in transcriptional regulation.</p>
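				<p>As an illustration of how pairwise identity figures of this kind can be derived from an alignment, the following minimal Python sketch counts identical residues over gap-free alignment columns. The function name, the choice of denominator (columns in which neither sequence has a gap) and the toy sequences are assumptions made for illustration only; they are not the method or the data used in this study.</p>

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs are rows of the same pairwise alignment, so they must be
    of equal length. The denominator convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        # Skip columns that contain a gap in either sequence.
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real Sox protein sequence.
toy_a = "MKV-LT"
toy_b = "MRVALT"
print(percent_identity(toy_a, toy_b))
```

				<p>Other conventions, for example dividing by the full alignment length or by the length of the shorter sequence, give different values, so the convention used should be stated alongside any reported figure.</p>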
			</sec>
			<sec>
				<st>
					<p>Dichaete</p>
				</st>
				<p>The situation with Dichaete is similar to that observed with SoxN, and the figures for amino acid identity are virtually identical (Figure <figr fid="F1">1b</figr>). Outside of the DNA-binding domain the Dichaete sequences show even less similarity between the <it>Drosophila </it>species and the other two insects, with conservation among all four restricted to limited regions C-terminal to the DNA-binding domain. Interestingly, we have shown that the C-terminal region of <it>D. melanogaster </it>Dichaete contains sequences required for activity in a context-specific manner <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and C-terminal regions of the mouse and chicken Sox2 protein are believed to be involved in aspects of correct Sox2 function <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>Sox21a</p>
				</st>
				<p>This gene is the least conserved among the four species and, outside of the DNA-binding domain, the predicted proteins show little similarity with vertebrate group B2 proteins (Figure <figr fid="F1">1c</figr>). There is extensive homology between the two <it>Drosophila </it>species; however, the <it>Anopheles </it>and <it>Apis </it>sequences are very diverged outside of the DNA binding domain. As with <it>D. melanogaster</it>, there are no EST sequences available that support the structure of <it>Sox21a </it>in the other insects.</p>
			</sec>
			<sec>
				<st>
					<p>Sox21b</p>
				</st>
				<p>The predicted <it>Drosophila Sox21b </it>proteins are again very similar, over 88% identical over their length. The other insect sequences are less well conserved, although the <it>Anopheles </it>sequence has a block of conservation C-terminal to the DNA-binding domain, including a glutamic acid-rich domain (Figure <figr fid="F1">1d</figr>). The predicted <it>Apis </it>sequence is less well conserved; we note, however, that all four proteins are identical at the extreme C-terminus. With both the <it>Anopheles </it>and <it>Apis </it>proteins we cannot confidently predict the N-terminal exons and are unable to find any regions with amino acid similarity to the first 2 coding exons of the <it>Drosophila </it>sequences in the <it>Anopheles </it>or <it>Apis </it>genomic sequence between the end of <it>Dichaete </it>and the <it>Sox21b </it>Sox-domain encoding exons. Our current models are, however, supported by the available EST sequences for both species although the EST sequences are not full-length. Therefore, the definitive structure of these two insect <it>Sox21b </it>genes will require further investigation. Nevertheless, it is clear from the available sequence that orthologues of <it>Sox21b </it>are present in other insects.</p>
				<p>To confirm the identification of four group B genes in both <it>D. melanogaster </it>and <it>D. pseudoobscura</it>, we performed whole-mount <it>in situ </it>hybridization to embryos of both species using exon-specific probes generated by PCR from genomic DNA. In all four cases we find very similar patterns of expression during embryogenesis. In the case of <it>Dichaete</it>, we find blastoderm expression including a broad central domain and a region of expression in the cephalic neuroectoderm (Figure <figr fid="F2">2A</figr> and <figr fid="F2">2A'</figr>). After gastrulation there is extensive expression in the developing CNS (Figure <figr fid="F2">2B</figr> and <figr fid="F2">2B'</figr>) including the midline (not shown). With <it>SoxN </it>we find conserved blastoderm expression, including an identical restriction from the ventral region of the embryo, followed by extensive expression throughout the developing CNS (Figure <figr fid="F2">2C</figr> to <figr fid="F2">2D'</figr>). With <it>Sox21a</it>, we identified conserved expression in the anlage of the foregut and hindgut at stage 12 (Figure <figr fid="F2">2E</figr> and <figr fid="F2">2E'</figr>) with later expression in specific cells of the midline after stage 14 (Figure <figr fid="F2">2F</figr> and <figr fid="F2">2F'</figr>). <it>Sox21b </it>shows conserved expression in abdominal epidermal stripes from stage 13 (Figure <figr fid="F2">2G</figr> to <figr fid="F2">2H'</figr>). These observations indicate that all four group B genes have conserved expression patterns during embryogenesis.</p>
				<sec>
					<st>
						<p>Genomic organisation of group B genes in <it>Drosophila</it>: the Dichaete complex</p>
					</st>
					<p>In some vertebrates the two classes of group B genes, B1 and B2, are linked on the same chromosome. In contrast, in <it>Drosophila </it>a single gene, <it>SoxN</it>, maps to the second chromosome and the remaining three all map to chromosome 3. We examined the organisation of the group B genes in the other insect genomes and found that the situation was very similar to that observed in <it>Drosophila</it>. In <it>melanogaster</it>, <it>SoxN </it>is intronless and sits alone in the middle of an 80 kb island with no flanking genes for 35 kb proximal and 45 kb distal, an unusual organisation for a <it>Drosophila </it>gene. We have previously shown that <it>Dichaete </it>is controlled by extensive 3' regulatory sequences, suggesting that perhaps the paucity of genes flanking <it>SoxN </it>may also indicate the presence of extensive regulatory sequences. In support of this, we find several clusters of predicted transcription factor binding sites from 35 kb upstream to 20 kb downstream of <it>SoxN </it>when we use stringent search criteria with the <it>Cis</it>Analyst analysis software <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp> (data not shown). Similar searches with <it>Dichaete </it>find previously identified regulatory sequences, suggesting that <it>SoxN </it>may indeed be subject to complex regulation. Comparative analysis of the <it>melanogaster </it>and <it>pseudoobscura </it>genomes with the <it>Vista </it>genome alignment viewer <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp> indicates that the genomic organization is very similar in the two species. The Ensembl annotation of the <it>Anopheles </it>genome indicates that the region around <it>SoxN </it>is also sparsely populated, with only 2 short stretches of EST homology in the 150 kb flanking <it>SoxN</it>. Therefore, it is possible that <it>SoxN </it>is subject to complex regulatory control in <it>Anopheles</it>. 
There is currently insufficient contiguous genomic sequence from <it>Apis </it>to assess the organization of the <it>SoxN </it>region.</p>
					<p>In the 70D region of <it>Drosophila melanogaster </it>chromosome arm <it>3L </it>the remaining three group B genes are clustered within a 77 kb region (Fig <figr fid="F3">3</figr>). As we have previously reported, <it>Dichaete </it>is an intronless gene controlled by at least 30 kb of regulatory sequence 3' to the transcription unit <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The start of <it>Sox21b </it>lies 16 kb distal to these regulatory sequences, and the start of <it>Sox21a </it>a further 28 kb distal to this. The region ends with the <it>Fat Body Protein 1 </it>(<it>Fbp1</it>) gene 6 kb downstream of <it>Sox21a</it>. The genomic organization of <it>Sox21b </it>is highly unusual for a group B <it>Sox </it>gene. It is split into seven exons, the first of which is non-coding, and exons 3, 4 and 5 contain the DNA-binding domain. All of the predicted splice junctions have consensus GT-AG sequences. <it>Sox21a </it>comprises 2 exons, each containing a portion of the DNA-binding domain. As we note above, the intron, which has consensus splice junction sequences, is in the same position as the second DNA-binding domain intron of <it>Sox21b </it>(intron 4).</p>
					<fig id="F3">
						<title>
							<p>Figure 3</p>
						</title>
						<caption>
							<p>The genomic organisation of the insect <it>Dichaete </it>regions</p>
						</caption>
						<text>
							<p>The genomic organisation of the insect <it>Dichaete </it>regions. Exons are represented by shaded boxes and introns by the linking lines. A scale bar of 2 kb is indicated. The <it>melanogaster </it>and <it>pseudoobscura </it>sequences are to scale. The larger distance between <it>Dichaete </it>and <it>Sox21b </it>in <it>Anopheles </it>and <it>Apis </it>is indicated by a break in the line; the remainder of the diagram is to scale.</p>
						</text>
						<graphic file="1471-2156-6-26-3"/>
					</fig>
					<p>In the case of <it>D. pseudoobscura</it>, the homology extends from upstream of the <it>Dichaete </it>coding region to at least the <it>Fat body protein 1 </it>gene downstream of <it>Sox21a</it>. The organization of the three <it>Sox </it>genes is virtually identical in the two species and we could construct gene models including all of the <it>Sox21a </it>and <it>Sox21b </it>exons. There is absolute conservation of intron position between the two <it>Drosophila </it>species. Furthermore, the sizes of the introns are also similar, although nucleotide similarity, ranging from 40 &#8211; 75%, is lower than in coding sequences. As with <it>melanogaster</it>, we find no evidence for additional genes in the intergenic region between <it>Dichaete </it>and <it>Sox21b</it>. We used the OWEN sequence alignment programme to plot the conservation of the <it>Dichaete </it>&#8211; <it>Sox21b </it>intergenic region between the two species (Figure <figr fid="F4">4</figr>). We see a high degree of sequence conservation throughout the entire region. Since we know that at least 30 kb of this region contains essential <it>Dichaete </it>regulatory sequences in <it>melanogaster</it>, we predict that regulation in the region will be similar in both species, a suggestion supported by the <it>in situ </it>hybridization data presented above (Figure <figr fid="F2">2</figr>).</p>
					<fig id="F4">
						<title>
							<p>Figure 4</p>
						</title>
						<caption>
							<p>OWEN alignment of the region between <it>Dichaete </it>and <it>Sox21b </it>in <it>D. melanogaster </it>and <it>D. pseudoobscura </it>showing extensive sequence similarity throughout the 45 kb region</p>
						</caption>
						<text>
							<p>OWEN alignment of the region between <it>Dichaete </it>and <it>Sox21b </it>in <it>D. melanogaster </it>and <it>D. pseudoobscura </it>showing extensive sequence similarity throughout the 45 kb region.</p>
						</text>
						<graphic file="1471-2156-6-26-4"/>
					</fig>
					<p>The organization of the <it>Dichaete </it>region in the <it>Anopheles </it>genome is very similar to that in the <it>Drosophila </it>species, with three genes found in a 190 kb region of chromosome arm <it>3L</it>. <it>Dichaete </it>is intronless and <it>Sox21b </it>is located approximately 110 kb downstream of it. There are no other predicted genes in the region. The <it>Sox21b </it>gene has a structure similar, but not identical, to those of the <it>Drosophilids</it>. We have been unable to find a 5' non-coding exon and, as we note above, the second intron found in the DNA-binding domains of the <it>Drosophila Sox21b </it>genes is absent in <it>Anopheles</it>, with exons 4 and 5 fused. The other introns are, however, conserved in position (Figure <figr fid="F1">1d</figr>). In the <it>Anopheles Sox21a </it>gene, the single intron position is conserved with the <it>Drosophila </it>species; however, the intron is considerably larger and contains an insertion of a Q-class retrotransposon in the sequenced strain <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. We find no evidence for an <it>Fbp1 </it>orthologue in the vicinity, the nearest similar sequence being some 5 Mb away on the same chromosome arm.</p>
					<p>The available sequence in the region is more fragmentary in the case of <it>Apis</it>. Here we find an intronless <it>Dichaete </it>gene and can define two sets of exons corresponding to the split DNA binding domains of <it>Sox21a </it>and <it>Sox21b</it>. Overall, the organization is similar to the other insects: as in <it>Anopheles</it>, the intergenic region between <it>Dichaete </it>and <it>Sox21b </it>is large (~90 kb); however, unlike the other insects, the distance between <it>Sox21b </it>and <it>Sox21a </it>is also large (~80 kb). In the case of <it>Sox21b </it>we have used EST sequence to support the gene model we have derived. The EST confirms the first four exons and we predict the terminal exon on the basis of homology with the other species, particularly the terminal 30 amino acids. As with <it>Anopheles</it>, the <it>Apis Sox21b </it>gene has a single DNA-binding domain intron in the same position as the first <it>Drosophila </it>DNA-binding domain intron. The intron immediately downstream of the DNA-binding domain is also conserved in all four insects; however, the remaining two intron positions differ between <it>Apis </it>and the other insects. Although the <it>Apis </it>assembly is preliminary in this region, with several gaps still present in the sequence, the fact that the gene models are very similar to the other insects and that the <it>Dichaete </it>and <it>Sox21b </it>predictions are supported by EST data suggests that the gene models we propose are likely to be accurate for the majority of the coding sequence.</p>
					<p>We compared the <it>Dichaete </it>to <it>Sox21b </it>intergenic regions of <it>Anopheles </it>and <it>Apis </it>to the <it>melanogaster </it>sequence with the OWEN alignment tool and failed to detect any significant stretches of similarity, even at relatively low stringency. This suggests that if gene regulatory sequences are conserved between these diverged insects, they may be difficult to detect or may have undergone extensive rearrangement.</p>
				</sec>
				<sec>
					<st>
						<p>Evolutionary perspective on insect group B genes</p>
					</st>
					<p>To attempt a classification of the insect group B <it>Sox </it>genes, we performed a multiple sequence alignment with the DNA-binding domains of the predicted proteins along with representative group B-like sequences from other organisms (Figure <figr fid="F5">5</figr>). The aligned ClustalX output suggests that the insect Sox domains may be subdivided into 3 classes. The first clearly groups the SoxN proteins from each of the insects with the mammalian Sox1, 2 and 3 proteins. Along with these we find representative sequences from nematodes (<it>C. elegans</it>, <it>S. ratti</it>, and <it>W. bancrofti</it>), hemichordate acorn worms (<it>S. kowalevski </it>and <it>P. flava</it>) and the sea squirt (<it>H. roretzi</it>). Thus, together these are likely to represent a single class, orthologous to vertebrate group B1 proteins. The second class, the Sox21a proteins, has sequences similar to the mammalian group B2 proteins, Sox14 and 21, and may represent an insect group B2 protein. The third class, containing <it>Dichaete </it>and <it>Sox21b</it>, is clearly differentiated from all other group B proteins by the presence of a Leucine/Isoleucine residue at position 18, an Isoleucine residue at position 23 and a divergent set of C-terminal amino acids. These two insect proteins may represent an insect-specific group B family. This suggests that a single group B1 protein, represented by SoxN-Sox3-like sequences, was present in a common ancestor before the divergence of vertebrates and invertebrates. Similarly, the close association of the insect Sox21a proteins with nematode and vertebrate Sox14 proteins suggests that these were also present in a common ancestor. The alignments clearly highlight the distinction between the Dichaete-Sox21b pair and other group B proteins, emphasizing a distinct evolutionary history for these proteins in the insects.</p>
					<fig id="F5">
						<title>
							<p>Figure 5</p>
						</title>
						<caption>
							<p>Group B Sox-domain alignment</p>
						</caption>
						<text>
							<p>Group B Sox-domain alignment. Clustal X alignment of DNA-binding domain sequences from the insect proteins and representative group B proteins from other species. The insect sequences are highlighted in grey. Accession numbers of protein sequences are as follows: SOX15 Human, O60248; SOX15 Mouse, P43267; Dichaete <it>melanogaster</it>, Q24533; Dichaete <it>pseudoobscura</it>, TR; Dichaete <it>Anopheles</it>, TR; Sox21b <it>melanogaster</it>, Q9VUD3; Sox21b <it>pseudoobscura</it>, TR; Sox21b <it>Anopheles</it>, TR; Sox21b <it>Apis</it>, TR; Dichaete <it>Apis</it>, TR; SOX1 Human, O00570; SOX1 Mouse, P53783; SOX1 Chicken, O57401; SOX3 Human, P41225; SOX3 Mouse, P53784; SOX3 Chicken, P48433; SOX3 <it>Xenopus</it>, P55863; SOX3 Medaka, Q9PT76; SOX2 Chicken, P48430; SOX2 <it>Xenopus</it>, O42569; SOX2 Human, P48431; SOX2 Mouse, P48432; SOX2 Sheep, P54231; SOX1 <it>S. kowalevski</it>, Q7YTD4; SOXB1 <it>P. flava</it>, (Taguchi <it>et al. </it>2002); SOXB1 Sea Urchin, Q9Y0D7; SoxN <it>Apis</it>, TR; SoxN <it>melanogaster</it>, Q9U1H5; SoxN <it>pseudoobscura</it>, TR; SoxN <it>Anopheles</it>, TR; SoxB <it>S. ratti</it>1, BI323817; SoxB <it>S. ratti</it>2, BI323817; SoxB <it>W. bancrofti</it>, CD455919; SOX2 <it>C. elegans</it>, Q21305; SOXB1 <it>H. roretzi</it>, Q86SB8; SOX19 Zebra Fish, P47792; SOX21 Zebra Fish, Q9YH21; SOX14 Chicken, Q9W7R6; SOX14 Human, O95416; SOX14 Mouse, Q04892; SOX14 Platypus, Q8MIP4; SOX21 Human, Q9Y651; SOX21 Mouse, Q811W0; SOX21 Chicken, Q9W7R5; SOXB2 <it>P. flava</it>, (Taguchi <it>et al. </it>2002); Sox21a <it>melanogaster</it>, Q9VUD1; Sox21a <it>pseudoobscura</it>, TR; Sox21a <it>Anopheles</it>, TR; Sox21a <it>Apis</it>, TR; SOXB2 Sea Urchin, Q9Y0D8; SoxB <it>C. virginica</it>, CD648628; SOX3 <it>C. elegans</it>, Q20201; SoxB <it>T. spiralis</it>, BG302262; SoxB <it>M. hapla</it>, BU095063; SRY Human, Q05066; SRY Sea Lion, AAR10360; SRY Mouse, Q05738. TR = This report.</p>
						</text>
						<graphic file="1471-2156-6-26-5"/>
					</fig>
					<p>Taken together, the analysis presented here shows that the genomic organization and sequence of group B <it>Sox </it>genes have been conserved during insect evolution. Particularly striking is the clustering of three genes in a small region of the genome. The structure of these genes and their relationship with vertebrate Group B genes suggest that <it>SoxN </it>and <it>Sox21a </it>are homologous to vertebrate group B1 and B2 genes respectively, whereas <it>Dichaete </it>and <it>Sox21b </it>may represent insect-specific group B genes.</p>
				</sec>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>The sequence alignments of the HMG DNA-binding domains from insect and mammalian group B Sox proteins suggest that the insect proteins may be separated into three distinct groups. The first, containing SoxN, aligns with the vertebrate Sox1, 2 and 3 proteins and most likely represents an orthologue of the vertebrate group B1 class. This conclusion, based on sequence, is supported by the functional analysis of group B1 proteins in vertebrates and <it>Drosophila</it>. In both cases, group B1 genes are expressed from the earliest stages of CNS development and are implicated in regulating early neural specification <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp>. In addition, we have evidence that mammalian <it>Sox1 </it>genes can rescue <it>SoxN </it>phenotypes in the <it>Drosophila </it>CNS, supporting the view that these proteins are functionally conserved (P. Overton and S.R., unpublished observations). The group B sequences isolated from the basal chordates, acorn worm and sea squirt, have also been shown to be expressed early in the specification of the CNS <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp>. Thus, it appears that all metazoans studied to date have at least one group B gene with expression marking neural lineages early in development. Further studies of primitive invertebrates will determine whether group B <it>Sox </it>expression is a universal marker for CNS development.</p>
			<p>In a previously published phylogenetic study it was suggested that <it>Dichaete </it>be classified as a Group B2 protein <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. However, while the analysis clearly differentiates between the group B proteins and other fly Sox proteins, it could not unambiguously resolve the relationship between each of the group B proteins. In terms of function and expression, the <it>Dichaete </it>gene behaves very much like a group B1 gene: it is expressed early during CNS development and is required for neural differentiation <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B42">42</abbr></abbrgrp>. We have previously shown that the mouse <it>Sox2 </it>gene efficiently rescues <it>Dichaete </it>phenotypes, further supporting a functional similarity between <it>Dichaete </it>and vertebrate group B1 genes <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B42">42</abbr></abbrgrp>. In contrast to the conclusion based on functional studies, the sequence analysis suggests that insect Dichaete DNA-binding domain sequences are markedly different from other group B1 proteins and are more similar to group B2 proteins. The conservation of the insect sequences indicates that a <it>Dichaete-</it>like sequence was present at least 300 My ago, when <it>Apis </it>and the Diptera last shared a common ancestor <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. We believe that the functional evidence is more convincing than the arguments based on sequence alignments and therefore suggest that Dichaete represents a group B1 function that has diverged from the canonical group B1 sequence, presumably due to selection for insect-specific functions. For example, Dichaete is required for early segmentation in the <it>Drosophila </it>embryo, a highly derived function, and it may be that sequence changes in the HMG-domain have been selected for such a function while still allowing a role in CNS-specification. 
As with <it>Drosophila</it>, both <it>Anopheles </it>and <it>Apis </it>are long germ insects that share some aspects of early development such as the early appearance of striped domains of <it>even skipped </it>expression <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. Thus it is possible that insect <it>Dichaete </it>genes have a common role in early patterning events. It will be of considerable interest to examine the complement of group B <it>Sox </it>genes in Coleoptera, Homoptera or Orthoptera to see if the HMG domain sequence and gene organisation are the same as in the insects so far sequenced. To begin to investigate this, we used the Dichaete DNA-binding domain to search the available sequence of the silk moth <it>Bombyx mori </it><abbrgrp><abbr bid="B45">45</abbr></abbrgrp> and found a single Group B gene that was clearly an orthologue of the <it>Dichaete </it>genes discussed here, containing the diagnostic Leucine and Isoleucine residues described above.</p>
			<p>As with vertebrate group B1 genes, <it>SoxN </it>and <it>Dichaete </it>are expressed in broadly overlapping domains and act partially redundantly in CNS specification <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. The close similarity between the expression and function of <it>SoxN </it>and <it>Dichaete </it>in the CNS raises the possibility that they arose from a common ancestor by a duplication event and may thus share some common regulatory sequences. However, when we compared the sequences 5' or 3' to <it>SoxN </it>with the <it>Dichaete </it>3' sequence we could not detect any sequence similarity, indicating that any conservation in regulatory sequences is not visible at a large scale. This is not entirely surprising since we cannot detect any sequence similarity between the <it>Dichaete </it>regulatory sequences from <it>Drosophila </it>and <it>Anopheles</it>, while our analysis indicates that the divergence of <it>SoxN </it>and <it>Dichaete </it>predates the <it>Drosophila</it>-<it>Anopheles </it>divergence.</p>
			<p>Based on the sequence alignment of insect Sox21a DNA-binding domains with those of vertebrate Sox14 proteins, it is possible that Sox21a may be an orthologue of the group B2 class. It has been suggested that in chicken Sox14 and Sox21 act as antagonists of group B1 function in a subset of the developing CNS <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The function of <it>Sox21a </it>in <it>Drosophila </it>is not known at present; however, <it>Sox21a </it>is expressed late in the development of the embryonic CNS midline, a site of <it>SoxN </it>and <it>Dichaete </it>expression, indicating there is the potential for the type of antagonistic interaction proposed for vertebrates. The Sox21b DNA-binding domain sequence indicates that it is closely related to Dichaete. Both these proteins have a set of unique residues in their DNA-binding domains that are not found in any other group B proteins identified to date. The <it>Sox21b </it>gene is conserved between the insects and its close similarity to <it>Dichaete </it>suggests that both genes arose from a common origin in the ancestor of the arthropods after their divergence from the nematodes, since there is no closely related sequence in <it>C. elegans </it>or its relatives. In terms of expression, <it>Sox21b </it>is expressed in the large hindgut along with Dichaete, supporting the possibility that it may also antagonise the activity of Dichaete. In this respect, then, <it>Sox21b </it>may represent a group B2 function. It is therefore possible that insects contain two group B1 class activities, involved in early CNS development, and two B2 class genes. Again we emphasise that the functional assignment of the insect genes may contrast with the data derived from sequence analysis, which predicts a single group B1 gene and three group B2 genes. We suggest that the separation of group B Sox domains into a B1 class and B2 class based solely on sequence does not reflect meaningful functional differences in insects. 
We have initiated a functional analysis of <it>Sox21a </it>and <it>Sox21b </it>in the hope that we can clarify this issue.</p>
			<p>The genome organisation of the Dichaete cluster is unusual: not only are three genes clustered together in the genome but two of them, <it>Sox21a </it>and <it>Sox21b</it>, have introns within the HMG-domain. The single <it>Sox21a </it>intron is conserved in all four insects, suggesting that it is ancestral to the insects. <it>Sox21b </it>is more complex: there are six introns in <it>melanogaster </it>and <it>pseudoobscura</it>, four of which are conserved in <it>Anopheles </it>and two in <it>Apis</it>. In the <it>Drosophila </it>species, there are two introns in the DNA-binding domain, the first of which is present in all four insects. The second intron, in an identical location to the <it>Sox21a </it>intron, is only found in the two <it>Drosophila </it>species. A simple model of a single intron loss is therefore unlikely to account for this since neither <it>Apis </it>nor <it>Anopheles </it>has the intron. It is possible that <it>Apis </it>and <it>Anopheles </it>lost the intron independently or, alternatively, that the common ancestor of the <it>Drosophila </it>species gained the intron, perhaps via a gene conversion event with <it>Sox21a</it>. Interestingly, the two group B genes from <it>C. elegans </it>also contain introns in the DNA-binding domain, in identical positions in both genes, but they are in different positions to the <it>Sox21a </it>and <it>Sox21b </it>introns. This suggests that the common ancestor of insects and nematodes did not contain DNA-binding domain introns and that these have been acquired independently in both lineages.</p>
			<p>The conservation of genome structure within the insect <it>Dichaete </it>cluster suggests that there may be functional constraints on the organisation. We suggest that this is likely to be a reflection of shared regulatory sequence since the region between <it>Dichaete </it>and <it>Sox21b </it>in <it>melanogaster </it>contains extensive regulatory sequences essential for correct <it>Dichaete </it>expression. We note that both <it>Sox21a </it>and <it>Sox21b </it>have expression domains that overlap with <it>Dichaete</it>, in the midline for <it>Sox21a </it>and the hindgut for <it>Sox21b</it>. These expression domains may therefore be controlled by common regulatory sequences, and the need to maintain coordinated regulation of the three genes may have maintained the integrity of the cluster in the insects. The conservation in expression between <it>D. melanogaster </it>and <it>D. pseudoobscura </it>is consistent with this view; it will be of interest to examine the expression of all of the <it>Sox </it>genes in <it>Anopheles </it>to further explore this hypothesis.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>Taking our observations together, we propose a simple model for the evolution of group B <it>Sox </it>genes (Figure <figr fid="F6">6</figr>). We base our model on the proposal of Kirby <it>et al. </it><abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, who suggest that a single group B gene underwent a duplication to generate two <it>Sox3</it>-like genes. We propose that these are represented by <it>SoxN </it>and <it>Dichaete </it>in the insects. A further tandem duplication of one of these genes generated linked group B1 and group B2 genes. We propose that this is represented by <it>Dichaete </it>duplicating to generate <it>Sox21a</it>. <it>Sox21a </it>would then acquire the sequence changes characteristic of the group B2 class of proteins. We suggest that these events predate the Protostome-Deuterostome divergence over 650 My ago <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> and provide the basal <it>Sox </it>gene complement of the Bilateria of three group B <it>Sox </it>genes. After the divergence of the lineages leading to vertebrates and invertebrates, <it>Dichaete </it>diverged from the canonical group B DNA-binding domain sequence and then underwent a further duplication event, at least predating the divergence of the holometabolous insects, to generate <it>Sox21b</it>. An analysis of the group B family in other insects and basal chordates will be required to definitively describe the ancestral situation.</p>
			<fig id="F6">
				<title>
					<p>Figure 6</p>
				</title>
				<caption>
					<p>A model for the evolution of Group B <it>Sox </it>genes in insects following the proposal of Kirby <it>et al </it>(2002) for vertebrates</p>
				</caption>
				<text>
					<p>A model for the evolution of Group B <it>Sox </it>genes in insects following the proposal of Kirby <it>et al </it>(2002) for vertebrates. In this view an ancestral group B gene is duplicated during an ancient genome duplication event to generate <it>Dichaete </it>and <it>SoxN</it>. A tandem duplication of Dichaete generates <it>Sox21a</it>; these events would be common to the ancestor of vertebrates and invertebrates. In insects, a further duplication of <it>Dichaete </it>gives rise to <it>Sox21b</it>.</p>
				</text>
				<graphic file="1471-2156-6-26-6"/>
			</fig>
		</sec>
		<sec>
			<st>
				<p>Methods</p>
			</st>
			<sec>
				<st>
					<p>Genome sequences</p>
				</st>
				<p>The following sources were used to obtain genome sequence: <it>D. melanogaster </it>(Release 3.2, <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>) from FlyBase <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> and the following scaffolds were used: AE003535 for the <it>Dichaete </it>region and AE003622 for the <it>SoxN </it>region. <it>D. pseudoobscura </it>(Freeze_1 assembly) was obtained from the Human Genome Sequencing Center, Baylor College of Medicine (HGSC-BCM <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>) and the following scaffolds used: Contig5946_Contig6670 for the <it>Dichaete </it>region and Contig1741_Contig5707 for the <it>SoxN </it>region. <it>Anopheles gambiae </it>genome sequence release 19.2a.1, compiled by the International <it>Anopheles </it>Genome Project <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, was obtained from the Ensembl server at the Wellcome Trust Sanger Institute <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. In the Ensembl annotation the <it>Sox </it>genes have the following accessions: <it>SoxN </it>(ENSANGG00000019842), <it>Dichaete </it>(ENSANGG00000010137), <it>Sox21a </it>(ENSANGG00000010002) and <it>Sox21b </it>(ENSANGG00000009947). <it>Anopheles </it>EST sequences representing <it>Dichaete </it>(TC44994) and <it>Sox21b </it>(TC45155) were obtained from The Institute for Genomic Research <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>. <it>Apis mellifera </it>Genome assembly Amel_1.1 was obtained from HGSC-BCM <abbrgrp><abbr bid="B54">54</abbr></abbrgrp> and the following scaffolds used: for the <it>Dichaete </it>region, Group8.12 (<it>Dichaete </it>and <it>Sox21b</it>) was found to overlap by 4.5 kb with GroupUn.570 containing <it>Sox21a </it>and the sequences were combined into a single contig. <it>SoxN </it>was contained within Group17.6. 
In addition, a search of the Honey Bee Brain EST project <abbrgrp><abbr bid="B55">55</abbr><abbr bid="B56">56</abbr></abbrgrp> uncovered two EST sequences corresponding to <it>Dichaete </it>(BB170009A10D01) and <it>Sox21b </it>(BB170011B20A11). These were used to verify the exon predictions from the genome sequence. Vertebrate group B sequences were obtained from UniProt. Nematode sequences were recovered by Blast searches of the EST collections at Nematode.net (Genome Sequencing Center, Washington University, St Louis, <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>).</p>
			</sec>
			<sec>
				<st>
					<p>Informatics tools</p>
				</st>
				<p>Homology searching was performed using the Blast algorithm <abbrgrp><abbr bid="B58">58</abbr></abbrgrp> at the Sanger, HGSC-BCM and Berkeley Drosophila Genome Project <abbrgrp><abbr bid="B59">59</abbr></abbrgrp> web sites. Genomic sequences were imported into Artemis v5 <abbrgrp><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr></abbrgrp> and annotated manually using the Blast output as a guide. Multiple sequence alignments were performed locally using ClustalX v1.8 <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> and graphically represented with BoxShade <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>. The alignment of intergenic regions was performed with OWEN <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>Molecular biology</p>
				</st>
				<p>A cDNA clone for <it>Sox21b </it>(GH07353) was obtained from the <it>Drosophila </it>gene collection <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and sequenced on both strands using an ABI prism kit in the Genetics Department sequencing core. PCR and RT-PCR amplifications were carried out with minor modifications to standard techniques <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>, using the following primer combinations:</p>
				<p><it>Melanogaster </it>primers for RT-PCR</p>
				<p>Dichaete F ACAATCCATTCCATCAACTACC</p>
				<p>Dichaete R TTGGTGTTCCCTCCTTACTC</p>
				<p>Sox21B F AGTCTCATGAACAGCGGAAG</p>
				<p>Sox21B R GGAGTTGCTCAGATACGACG</p>
				<p>SoxN F CAGCAGCAACAGCAACACTAC</p>
				<p>SoxN R TTTCATCGCCTCGCCACAAC</p>
				<p><it>Pseudoobscura </it>primers for <it>in situ </it>probes:</p>
				<p>Dp-Dichaete F CGAACTACGGATTCCACCT</p>
				<p>Dp-Dichaete R CATTCCGTTGGCCTGCAT</p>
				<p>Dp-SoxN F AGCTGAGTCACCATAACCAC</p>
				<p>Dp-SoxN R GTCATGTGATGGCTACCAA</p>
				<p>Dp-Sox21A Exon 1 F GAGCATCTCGACGCTACTAC</p>
				<p>Dp-Sox21A Exon 1 R GGAATTGGAGTGGCTATGAT</p>
				<p>Dp-Sox21A Exon 2 F CTAAGGACATGCAGTCACAG</p>
				<p>Dp-Sox21A Exon 2 R GACTTCACGCAGCCGTAGGAT</p>
				<p>Dp-Sox21B F CGTCTATCCACACACCTGTC</p>
				<p>Dp-Sox21B R GACGATGTCTGCTGCTGTT</p>
				<p>Whole-mount <it>in situ </it>hybridisation to <it>Drosophila </it>embryos was performed using minor modifications to a standard protocol <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>.</p>
				<p>All genetic nomenclature is according to FlyBase <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Authors' contributions</p>
			</st>
			<p>C.McK. performed the sequencing and mapping of the <it>Drosophila Sox21a </it>and <it>Sox21b </it>genes and the <it>in situ </it>hybridisation experiments. G.W. carried out the RT-PCR analysis. S.R. designed the experiments, carried out the genomic analysis and wrote the paper.</p>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>This work was supported by a UK Medical Research Council programme grant to S.R., M. Ashburner and D. Gubb. We are grateful to M. Ashburner for comments on the manuscript and to S. Marcellini and J. Roote for assistance with the <it>D. pseudoobscura </it>husbandry.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>A gene from the human sex determining region encodes a protein with homology to a conserved DNA binding motif.</p>
				</title>
				<aug>
					<au>
						<snm>Sinclair</snm>
						<fnm>AH</fnm>
					</au>
					<au>
						<snm>Berta</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Palmer</snm>
						<fnm>MS</fnm>
					</au>
					<au>
						<snm>Hawkins</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Griffiths</snm>
						<fnm>BL</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Foster</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Frischauf</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Goodfellow</snm>
						<fnm>PN</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1990</pubdate>
				<volume>346</volume>
				<fpage>240</fpage>
				<lpage>244</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/346240a0</pubid>
						<pubid idtype="pmpid" link="fulltext">1695712</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Twenty pairs of Sox: extent, homology, and nomenclature of the mouse and human Sox transcription factor gene families.</p>
				</title>
				<aug>
					<au>
						<snm>Schepers</snm>
						<fnm>GE</fnm>
					</au>
					<au>
						<snm>Teasdale</snm>
						<fnm>RD</fnm>
					</au>
					<au>
						<snm>Koopman</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>167</fpage>
				<lpage>170</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1016/S1534-5807(02)00223-X</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Phylogeny of the SOX family of developmental transcription factors based on sequence and structural indicators.</p>
				</title>
				<aug>
					<au>
						<snm>Bowles</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Schepers</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Koopman</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Dev Biol</source>
				<pubdate>2000</pubdate>
				<volume>227</volume>
				<fpage>239</fpage>
				<lpage>255</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/dbio.2000.9883</pubid>
						<pubid idtype="pmpid" link="fulltext">11071752</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>A comparison of the properties of SOX3 with SRY and two related genes SOX1 and SOX2.</p>
				</title>
				<aug>
					<au>
						<snm>Collignon</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Sockanathan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hacker</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Cohen-Tannoudji</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Norris</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Rastan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Stevanovic</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Goodfellow</snm>
						<fnm>PN</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>1996</pubdate>
				<volume>122</volume>
				<fpage>509</fpage>
				<lpage>520</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8625802</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>The isolation and high-resolution mapping of human SOX14 and SOX21; two members of the SOX gene family related to SOX1, SOX2 and SOX3.</p>
				</title>
				<aug>
					<au>
						<snm>Malas</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Duthie</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Deloukas</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Episkopou</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Mamm Genome</source>
				<pubdate>1999</pubdate>
				<volume>10</volume>
				<fpage>934</fpage>
				<lpage>937</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s003359901118</pubid>
						<pubid idtype="pmpid" link="fulltext">10441749</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Two distinct group B SOX genes for transcriptional activators and repressors: their expression during embryonic organogenesis of the chicken.</p>
				</title>
				<aug>
					<au>
						<snm>Uchikawa</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kamachi</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Kondoh</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Mech Dev</source>
				<pubdate>1999</pubdate>
				<volume>84</volume>
				<fpage>103</fpage>
				<lpage>120</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0925-4773(99)00083-0</pubid>
						<pubid idtype="pmpid">10473124</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>SOX3 is an X-linked gene related to SRY.</p>
				</title>
				<aug>
					<au>
						<snm>Stevanovic</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Collignon</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Goodfellow</snm>
						<fnm>PN</fnm>
					</au>
				</aug>
				<source>Hum Mol Genet</source>
				<pubdate>1993</pubdate>
				<volume>2</volume>
				<fpage>2013</fpage>
				<lpage>2018</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8111369</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>An SRY-related sequence on the marsupial X chromosome: implications for the evolution of the mammalian testis-determining gene.</p>
				</title>
				<aug>
					<au>
						<snm>Foster</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Graves</snm>
						<fnm>JA</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci U S A</source>
				<pubdate>1994</pubdate>
				<volume>91</volume>
				<fpage>1927</fpage>
				<lpage>1931</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">43277</pubid>
						<pubid idtype="pmpid" link="fulltext">8127908</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Characterisation and mapping of the human SOX14 gene.</p>
				</title>
				<aug>
					<au>
						<snm>Arsic</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Rajic</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Stanojcic</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Goodfellow</snm>
						<fnm>PN</fnm>
					</au>
					<au>
						<snm>Stevanovic</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Cytogenet Cell Genet</source>
				<pubdate>1998</pubdate>
				<volume>83</volume>
				<fpage>139</fpage>
				<lpage>146</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1159/000015149</pubid>
						<pubid idtype="pmpid" link="fulltext">9925951</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>The cDNA sequence and chromosomal location of the human SOX2 gene.</p>
				</title>
				<aug>
					<au>
						<snm>Stevanovic</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Zuffardi</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Collignon</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Goodfellow</snm>
						<fnm>PN</fnm>
					</au>
				</aug>
				<source>Mamm Genome</source>
				<pubdate>1994</pubdate>
				<volume>5</volume>
				<fpage>640</fpage>
				<lpage>642</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/BF00411460</pubid>
						<pubid idtype="pmpid">7849401</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Cloning and mapping of human SOX1: a highly conserved gene expressed in the developing brain.</p>
				</title>
				<aug>
					<au>
						<snm>Malas</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Duthie</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Mohri</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Episkopou</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Mamm Genome</source>
				<pubdate>1997</pubdate>
				<volume>8</volume>
				<fpage>866</fpage>
				<lpage>868</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s003359900597</pubid>
						<pubid idtype="pmpid" link="fulltext">9337405</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Chromosome assignment of eight SOX family genes in chicken.</p>
				</title>
				<aug>
					<au>
						<snm>Kuroiwa</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Uchikawa</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kamachi</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Kondoh</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Nishida-Umehara</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Masabanda</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Griffin</snm>
						<fnm>DK</fnm>
					</au>
					<au>
						<snm>Matsuda</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Cytogenet Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>98</volume>
				<fpage>189</fpage>
				<lpage>193</lpage>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Cloning and mapping of platypus SOX2 and SOX14: Insights into SOX group B evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Kirby</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Waters</snm>
						<fnm>PD</fnm>
					</au>
					<au>
						<snm>Delbridge</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Svartman</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Stewart</snm>
						<fnm>AN</fnm>
					</au>
					<au>
						<snm>Nagai</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Graves</snm>
						<fnm>JAM</fnm>
					</au>
				</aug>
				<source>Cytogenetic and Genome Research</source>
				<pubdate>2002</pubdate>
				<volume>98</volume>
				<fpage>96</fpage>
				<lpage>100</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1159/000068539</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Mouse Genome Informatics: http://www.informatics.jax.org/</p>
				</title>
			</bibl>
			<bibl id="B15">
				<title>
					<p>MGD: The mouse genome database.</p>
				</title>
				<aug>
					<au>
						<snm>Blake</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Richardson</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Bult</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Kadin</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Eppig</snm>
						<fnm>JT</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>193</fpage>
				<lpage>195</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165494</pubid>
						<pubid idtype="pmpid" link="fulltext">12519980</pubid>
						<pubid idtype="doi">10.1093/nar/gkg047</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome.</p>
				</title>
				<aug>
					<au>
						<snm>Bergman</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Pfeiffer</snm>
						<fnm>BD</fnm>
					</au>
					<au>
						<snm>Rincon-Limas</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Hoskins</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Gnirke</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Mungall</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Kronmiller</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Pacleb</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Stapleton</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Wan</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>George</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Jong</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Botas</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Genome Biology</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>RESEARCH0086</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">151188</pubid>
						<pubid idtype="pmpid" link="fulltext">12537575</pubid>
						<pubid idtype="doi">10.1186/gb-2002-3-12-research0086</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Progress and Prospects in Evolutionary Biology: The Drosophila Model.</p>
				</title>
				<aug>
					<au>
						<snm>Powell</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<publisher>Oxford, Oxford University Press.</publisher>
				<pubdate>1997</pubdate>
			</bibl>
			<bibl id="B18">
				<title>
					<p>An insect molecular clock dates the origin of the insects and accords with palaeontological and biogeographic landmarks.</p>
				</title>
				<aug>
					<au>
						<snm>Gaunt</snm>
						<fnm>MW</fnm>
					</au>
					<au>
						<snm>Miles</snm>
						<fnm>MA</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2002</pubdate>
				<volume>19</volume>
				<fpage>748</fpage>
				<lpage>761</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11961108</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Annotation of 12 Drosophila Genomes: http://rana.lbl.gov/drosophila/multipleflies.html</p>
				</title>
			</bibl>
			<bibl id="B20">
				<title>
					<p>The Drosophila Sox-domain protein Dichaete is required for the development of the central nervous system midline.</p>
				</title>
				<aug>
					<au>
						<snm>Sanchez-Soriano</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Russell</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>1998</pubdate>
				<volume>125</volume>
				<fpage>3989</fpage>
				<lpage>3996</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9735360</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Formation of neuroblasts in the embryonic central nervous system of Drosophila melanogaster is controlled by SoxNeuro.</p>
				</title>
				<aug>
					<au>
						<snm>Buescher</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hing</snm>
						<fnm>FS</fnm>
					</au>
					<au>
						<snm>Chia</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>2002</pubdate>
				<volume>129</volume>
				<fpage>4193</fpage>
				<lpage>4203</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12183372</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Evidence for differential and redundant function of the Sox genes Dichaete and SoxN during CNS development in Drosophila.</p>
				</title>
				<aug>
					<au>
						<snm>Overton</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Meadows</snm>
						<fnm>LA</fnm>
					</au>
					<au>
						<snm>Urban</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Russell</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>2002</pubdate>
				<volume>129</volume>
				<fpage>4219</fpage>
				<lpage>4228</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12183374</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Sox1 directly regulates the γ-crystallin genes and is essential for lens development in mice.</p>
				</title>
				<aug>
					<au>
						<snm>Nishiguchi</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Wood</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Kondoh</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Episkopou</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Genes &amp; Development</source>
				<pubdate>1998</pubdate>
				<volume>12</volume>
				<fpage>776</fpage>
				<lpage>781</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">316632</pubid>
						<pubid idtype="pmpid" link="fulltext">9512512</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Roles of Sox factors in neural determination: conserved signaling in evolution?</p>
				</title>
				<aug>
					<au>
						<snm>Sasai</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Int J Dev Biol</source>
				<pubdate>2001</pubdate>
				<volume>45</volume>
				<fpage>321</fpage>
				<lpage>326</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11291862</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Genome-wide analysis of Sox genes in Drosophila melanogaster.</p>
				</title>
				<aug>
					<au>
						<snm>Cremazy</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Berta</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Girard</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Mech Dev</source>
				<pubdate>2001</pubdate>
				<volume>109</volume>
				<fpage>371</fpage>
				<lpage>375</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0925-4773(01)00529-9</pubid>
						<pubid idtype="pmpid" link="fulltext">11731252</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>The Dichaete gene of Drosophila melanogaster encodes a Sox-domain protein required for embryonic segmentation.</p>
				</title>
				<aug>
					<au>
						<snm>Russell</snm>
						<fnm>SRH</fnm>
					</au>
					<au>
						<snm>Sanchez-Soriano</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Wright</snm>
						<fnm>CR</fnm>
					</au>
					<au>
						<snm>Ashburner</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>1996</pubdate>
				<volume>122</volume>
				<fpage>3669</fpage>
				<lpage>3676</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8951082</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>SoxNeuro, a new Drosophila Sox gene expressed in the developing central nervous system.</p>
				</title>
				<aug>
					<au>
						<snm>Cremazy</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Berta</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Girard</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Mech Dev</source>
				<pubdate>2000</pubdate>
				<volume>93</volume>
				<fpage>215</fpage>
				<lpage>219</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0925-4773(00)00268-9</pubid>
						<pubid idtype="pmpid" link="fulltext">10781960</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>The Drosophila fish-hook gene encodes an HMG domain protein essential for segmentation and CNS development.</p>
				</title>
				<aug>
					<au>
						<snm>Nambu</snm>
						<fnm>PA</fnm>
					</au>
					<au>
						<snm>Nambu</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>1996</pubdate>
				<volume>122</volume>
			</bibl>
			<bibl id="B29">
				<title>
					<p>The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes.</p>
				</title>
				<aug>
					<au>
						<snm>Stapleton</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Liao</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Brokstein</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Hong</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Carninci</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Shiraki</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hayashizaki</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Champe</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Pacleb</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wan</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Carlson</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>George</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>1294</fpage>
				<lpage>1300</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">186637</pubid>
						<pubid idtype="pmpid" link="fulltext">12176937</pubid>
						<pubid idtype="doi">10.1101/gr.269102</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Evolution and phylogeny of the Diptera: a molecular phylogenetic analysis using 28S rDNA sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Friedrich</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Tautz</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Syst Biol</source>
				<pubdate>1997</pubdate>
				<volume>46</volume>
				<fpage>674</fpage>
				<lpage>698</lpage>
				<xrefbib>
					<pubid idtype="pmpid">11975338</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Developmental-specific activity of the FGF-4 enhancer requires the synergistic action of Sox2 and Oct-3.</p>
				</title>
				<aug>
					<au>
						<snm>Yuan</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Corbi</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Basilico</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Dailey</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Genes &amp; Development</source>
				<pubdate>1995</pubdate>
				<volume>9</volume>
				<fpage>2635</fpage>
				<lpage>2645</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7590241</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>Cis-analyst: http://rana.lbl.gov/cis-analyst/</p>
				</title>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome.</p>
				</title>
				<aug>
					<au>
						<snm>Berman</snm>
						<fnm>BP</fnm>
					</au>
					<au>
						<snm>Nibu</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Pfeiffer</snm>
						<fnm>BD</fnm>
					</au>
					<au>
						<snm>Tomancak</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Levine</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>MB</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci U S A</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>757</fpage>
				<lpage>762</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">117378</pubid>
						<pubid idtype="pmpid" link="fulltext">11805330</pubid>
						<pubid idtype="doi">10.1073/pnas.231608898</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Vista Genome Browser: http://pipeline.lbl.gov/pseudo</p>
				</title>
			</bibl>
			<bibl id="B35">
				<title>
					<p>Strategies and Tools for Whole-Genome Alignments.</p>
				</title>
				<aug>
					<au>
						<snm>Couronne</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Poliakov</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Bray</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Ishkhanov</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Ryaboy</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Pachter</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Dubchak</snm>
						<fnm>I</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2003</pubdate>
				<volume>13</volume>
				<fpage>73</fpage>
				<lpage>80</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">430965</pubid>
						<pubid idtype="pmpid" link="fulltext">12529308</pubid>
						<pubid idtype="doi">10.1101/gr.762503</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Regulatory mutations of the Drosophila Sox gene Dichaete reveal new functions in embryonic brain and hindgut development.</p>
				</title>
				<aug>
					<au>
						<snm>Sanchez-Soriano</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Russell</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Dev Biol</source>
				<pubdate>2000</pubdate>
				<volume>220</volume>
				<fpage>307</fpage>
				<lpage>321</lpage>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Q: a new retrotransposon from the mosquito Anopheles gambiae.</p>
				</title>
				<aug>
					<au>
						<snm>Besansky</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Bedell</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Mukabayire</snm>
						<fnm>O</fnm>
					</au>
				</aug>
				<source>Insect Mol Biol</source>
				<pubdate>1994</pubdate>
				<volume>3</volume>
				<fpage>49</fpage>
				<lpage>56</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8069416</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Multipotent cell lineages in early mouse development depend on SOX2 function.</p>
				</title>
				<aug>
					<au>
						<snm>Avilion</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Nicolis</snm>
						<fnm>SK</fnm>
					</au>
					<au>
						<snm>Pevny</snm>
						<fnm>LH</fnm>
					</au>
					<au>
						<snm>Perez</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Vivian</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Lovell-Badge</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Genes &amp; Development</source>
				<pubdate>2003</pubdate>
				<volume>17</volume>
				<fpage>126</fpage>
				<lpage>140</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">195970</pubid>
						<pubid idtype="pmpid" link="fulltext">12514105</pubid>
						<pubid idtype="doi">10.1101/gad.224503</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>SOX2 functions to maintain neural progenitor identity.</p>
				</title>
				<aug>
					<au>
						<snm>Graham</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Khudyakov</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ellis</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Pevny</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Neuron</source>
				<pubdate>2003</pubdate>
				<volume>39</volume>
				<fpage>749</fpage>
				<lpage>765</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0896-6273(03)00497-5</pubid>
						<pubid idtype="pmpid" link="fulltext">12948443</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Group B Sox genes that contribute to specification of the vertebrate brain are expressed in the apical organ and ciliary bands of hemichordate larvae.</p>
				</title>
				<aug>
					<au>
						<snm>Taguchi</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tagawa</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Humphreys</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Satoh</snm>
						<fnm>N</fnm>
					</au>
				</aug>
				<source>Zool Sci</source>
				<pubdate>2002</pubdate>
				<volume>19</volume>
				<fpage>57</fpage>
				<lpage>66</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.2108/zsj.19.57</pubid>
						<pubid idtype="pmpid" link="fulltext">12025405</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Expression pattern and transcriptional control of SoxB1 in embryos of the ascidian Halocynthia roretzi.</p>
				</title>
				<aug>
					<au>
						<snm>Miya</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Nishida</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Zool Sci</source>
				<pubdate>2003</pubdate>
				<volume>20</volume>
				<fpage>59</fpage>
				<lpage>67</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.2108/zsj.20.59</pubid>
						<pubid idtype="pmpid" link="fulltext">12560602</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>The Sox-domain containing gene Dichaete/fish-hook acts in concert with vnd and ind to regulate cell fate in the Drosophila neuroectoderm.</p>
				</title>
				<aug>
					<au>
						<snm>Zhao</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Skeath</snm>
						<fnm>JB</fnm>
					</au>
				</aug>
				<source>Development</source>
				<pubdate>2002</pubdate>
				<volume>129</volume>
				<fpage>1165</fpage>
				<lpage>1174</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11874912</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Short, long and beyond: Molecular and embryological approaches to insect segmentation.</p>
				</title>
				<aug>
					<au>
						<snm>Davis</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Patel</snm>
						<fnm>NH</fnm>
					</au>
				</aug>
				<source>Ann Rev Entomol</source>
				<pubdate>2002</pubdate>
				<volume>47</volume>
				<fpage>669</fpage>
				<lpage>699</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1146/annurev.ento.47.091201.145251</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Different combinations of gap repressors for common stripes in Anopheles and Drosophila embryos.</p>
				</title>
				<aug>
					<au>
						<snm>Goltsev</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Hsiong</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Lanzaro</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Levine</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Dev Biol</source>
				<pubdate>2004</pubdate>
				<volume>275</volume>
			</bibl>
			<bibl id="B45">
				<title>
					<p>A draft sequence for the genome of the domesticated silkworm (Bombyx mori).</p>
				</title>
				<aug>
					<au>
						<snm>Xia</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Lu</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Cheng</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Dai</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Zha</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Cheng</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Chai</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Pan</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Lin</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Qian</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Pan</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Shen</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Lan</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Yuan</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Wan</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zhu</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>M</snm>
						<fnm>MY</fnm>
					</au>
					<au>
						<snm>Shen</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Xiang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Shi</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Su</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>QW</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Wei</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Sun</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Dong</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Meng</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Lan</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Fang</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Sun</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Xi</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Qi</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>He</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Cao</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ye</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ye</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ji</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ni</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zheng</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Mao</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Ye</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wong</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2004</pubdate>
				<volume>306</volume>
				<fpage>1937</fpage>
				<lpage>1940</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1102210</pubid>
						<pubid idtype="pmpid" link="fulltext">15591204</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Origin of the metazoan phyla: Molecular clocks confirm paleontological estimates.</p>
				</title>
				<aug>
					<au>
						<snm>Ayala</snm>
						<fnm>FJ</fnm>
					</au>
					<au>
						<snm>Rzhetsky</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Ayala</snm>
						<fnm>FJ</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1998</pubdate>
				<volume>95</volume>
				<fpage>606</fpage>
				<lpage>611</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">18467</pubid>
						<pubid idtype="pmpid" link="fulltext">9435239</pubid>
						<pubid idtype="doi">10.1073/pnas.95.2.606</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>Annotation of the Drosophila melanogaster euchromatic genome sequence: a systematic review.</p>
				</title>
				<aug>
					<au>
						<snm>Misra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Crosby</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Mungall</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Matthews</snm>
						<fnm>BB</fnm>
					</au>
					<au>
						<snm>Campbell</snm>
						<fnm>KS</fnm>
					</au>
					<au>
						<snm>Hradecky</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Kaminker</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Millburn</snm>
						<fnm>GH</fnm>
					</au>
					<au>
						<snm>Prochnik</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Tupy</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Whitfield</snm>
						<fnm>EJ</fnm>
					</au>
					<au>
						<snm>Bayraktaroglu</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Berman</snm>
						<fnm>BP</fnm>
					</au>
					<au>
						<snm>Bettencourt</snm>
						<fnm>BR</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>de Grey</snm>
						<fnm>AD</fnm>
					</au>
					<au>
						<snm>Drysdale</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>NL</fnm>
					</au>
					<au>
						<snm>Richter</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Russo</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Schroeder</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Shu</snm>
						<fnm>SQ</fnm>
					</au>
					<au>
						<snm>Stapleton</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Yamada</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Ashburner</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Gelbart</snm>
						<fnm>WM</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Lewis</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>RESEARCH0083</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">151185</pubid>
						<pubid idtype="pmpid" link="fulltext">12537572</pubid>
						<pubid idtype="doi">10.1186/gb-2002-3-12-research0083</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence.</p>
				</title>
				<aug>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Wheeler</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Kronmiller</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Carlson</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Halpern</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Patel</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Champe</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Dugan</snm>
						<fnm>SP</fnm>
					</au>
					<au>
						<snm>Frise</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Hodgson</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>George</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Hoskins</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Laverty</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Muzny</snm>
						<fnm>DM</fnm>
					</au>
					<au>
						<snm>Nelson</snm>
						<fnm>CR</fnm>
					</au>
					<au>
						<snm>Pacleb</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Pfeiffer</snm>
						<fnm>BD</fnm>
					</au>
					<au>
						<snm>Richards</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sodergren</snm>
						<fnm>EJ</fnm>
					</au>
					<au>
						<snm>Svirskas</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Tabor</snm>
						<fnm>PE</fnm>
					</au>
					<au>
						<snm>Wan</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Stapleton</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Sutton</snm>
						<fnm>GG</fnm>
					</au>
					<au>
						<snm>Venter</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Weinstock</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Scherer</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Gibbs</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>RESEARCH0079</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">151181</pubid>
						<pubid idtype="pmpid" link="fulltext">12537568</pubid>
						<pubid idtype="doi">10.1186/gb-2002-3-12-research0079</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<title>
					<p>FlyBase: genes and gene models.</p>
				</title>
				<aug>
					<au>
						<snm>Drysdale</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Crosby</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<cnm>The FlyBase Consortium</cnm>
					</au>
				</aug>
				<source>Nucleic Acids Research</source>
				<pubdate>2005</pubdate>
				<volume>33</volume>
				<fpage>390</fpage>
				<lpage>395</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gki046</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>D.pseudoobscura sequencing project: http://www.hgsc.bcm.tmc.edu/projects/drosophila/</p>
				</title>
			</bibl>
			<bibl id="B51">
				<title>
					<p>The genome sequence of the malaria mosquito Anopheles gambiae</p>
				</title>
				<aug>
					<au>
						<snm>Holt</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Subramanian</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Halpern</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Sutton</snm>
						<fnm>GG</fnm>
					</au>
					<au>
						<snm>Charlab</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Nusskern</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Wincker</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Clark</snm>
						<fnm>AG</fnm>
					</au>
					<au>
						<snm>Ribeiro</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Wides</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Salzberg</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Loftus</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Yandell</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Majoros</snm>
						<fnm>WH</fnm>
					</au>
					<au>
						<snm>Rusch</snm>
						<fnm>DB</fnm>
					</au>
					<au>
						<snm>Lai</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Kraft</snm>
						<fnm>CL</fnm>
					</au>
					<au>
						<snm>Abril</snm>
						<fnm>JF</fnm>
					</au>
					<au>
						<snm>Anthouard</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Arensburger</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Atkinson</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Baden</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Berardinis</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Baldwin</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Benes</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Biedler</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Blass</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Bolanos</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Boscus</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Barnstead</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Cai</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Center</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Chaturvedi</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Christophides</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Chrystal</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Clamp</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Cravchik</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Curwen</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Dana</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Delcher</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Dew</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Evans</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Flanigan</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Grundschober-Freimoser</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Friedli</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Gu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Guan</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Guigo</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Hillenmeyer</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Hladun</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Hogan</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Hong</snm>
						<fnm>YS</fnm>
					</au>
					<au>
						<snm>Hoover</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Jaillon</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Ke</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Kodira</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Kokoza</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Koutsos</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Letunic</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Levitsky</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Liang</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Lin</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Lobo</snm>
						<fnm>NF</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Malek</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>McIntosh</snm>
						<fnm>TC</fnm>
					</au>
					<au>
						<snm>Meister</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Mobarry</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Mongin</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Murphy</snm>
						<fnm>SD</fnm>
					</au>
					<au>
						<snm>O'Brochta</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Pfannkoch</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Qi</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Regier</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Remington</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Shao</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Sharakhova</snm>
						<fnm>MV</fnm>
					</au>
					<au>
						<snm>Sitter</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Shetty</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Strong</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Sun</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Thomasova</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Ton</snm>
						<fnm>LQ</fnm>
					</au>
					<au>
						<snm>Topalis</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Tu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Unger</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>Walenz</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Woodford</snm>
						<fnm>KJ</fnm>
					</au>
					<au>
						<snm>Wortman</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Yao</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Zdobnov</snm>
						<fnm>EM</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Zhu</snm>
						<fnm>SC</fnm>
					</au>
					<au>
						<snm>Zhimulev</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Coluzzi</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Torre</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Roth</snm>
						<fnm>CW</fnm>
					</au>
					<au>
						<snm>Louis</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Kalush</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Mural</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>HO</fnm>
					</au>
					<au>
						<snm>Broder</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gardner</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Fraser</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Brey</snm>
						<fnm>PT</fnm>
					</au>
					<au>
						<snm>Venter</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Weissenbach</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Kafatos</snm>
						<fnm>FC</fnm>
					</au>
					<au>
						<snm>Collins</snm>
						<fnm>FH</fnm>
					</au>
					<au>
						<snm>Hoffman</snm>
						<fnm>SL</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>298</volume>
				<fpage>129</fpage>
				<lpage>149</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1076181</pubid>
						<pubid idtype="pmpid" link="fulltext">12364791</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B52">
				<title>
					<p>Mosquito genome browser: http://www.ensembl.org/Anopheles_gambiae/</p>
				</title>
			</bibl>
			<bibl id="B53">
				<title>
					<p>Mosquito gene index: http://www.tigr.org/tigr-scripts/tgi/T_index.cgi?species=mosquito</p>
				</title>
			</bibl>
			<bibl id="B54">
				<title>
					<p>A. mellifera sequencing project: http://www.hgsc.bcm.tmc.edu/projects/honeybee/</p>
				</title>
			</bibl>
			<bibl id="B55">
				<title>
					<p>Honey Bee EST Project: http://titan.biotec.uiuc.edu/bee/honeybee_project.htm</p>
				</title>
			</bibl>
			<bibl id="B56">
				<title>
					<p>Annotated expressed sequence tags and cDNA microarrays for studies of brain and behaviour in the Honey Bee.</p>
				</title>
				<aug>
					<au>
						<snm>Whitfield</snm>
						<fnm>CW</fnm>
					</au>
					<au>
						<snm>Band</snm>
						<fnm>MR</fnm>
					</au>
					<au>
						<snm>Bonaldo</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>Kumar</snm>
						<fnm>CG</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Pardinas</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Robertson</snm>
						<fnm>HM</fnm>
					</au>
					<au>
						<snm>Soares</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Robinson</snm>
						<fnm>GE</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>555</fpage>
				<lpage>566</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.5302</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B57">
				<title>
					<p>Nematode.net: http://www.nematode.net/BLAST</p>
				</title>
			</bibl>
			<bibl id="B58">
				<title>
					<p>Basic local alignment search tool.</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Gish</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1990</pubdate>
				<volume>215</volume>
				<fpage>403</fpage>
				<lpage>410</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.1990.9999</pubid>
						<pubid idtype="pmpid" link="fulltext">2231712</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>Berkeley Drosophila Genome Project: http://www.fruitfly.org</p>
				</title>
			</bibl>
			<bibl id="B60">
				<title>
					<p>Artemis: http://www.sanger.ac.uk/Software/Artemis/</p>
				</title>
			</bibl>
			<bibl id="B61">
				<title>
					<p>Artemis: sequence visualisation and annotation.</p>
				</title>
				<aug>
					<au>
						<snm>Rutherford</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Parkhill</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Crook</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Horsnell</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Rice</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Rajandream</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Barrell</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2000</pubdate>
				<volume>16</volume>
				<fpage>944</fpage>
				<lpage>945</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/16.10.944</pubid>
						<pubid idtype="pmpid" link="fulltext">11120685</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B62">
				<title>
					<p>Multiple sequence alignment with the Clustal series of programs</p>
				</title>
				<aug>
					<au>
						<snm>Chenna</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Sugawara</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Koike</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
				</aug>
				<source>Nucl Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>3497</fpage>
				<lpage>3500</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">168907</pubid>
						<pubid idtype="pmpid" link="fulltext">12824352</pubid>
						<pubid idtype="doi">10.1093/nar/gkg500</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B63">
				<title>
					<p>Boxshade: http://www.ch.embnet.org/software/BOX_form.html</p>
				</title>
			</bibl>
			<bibl id="B64">
				<title>
					<p>OWEN: aligning long collinear regions of genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Ogurtsov</snm>
						<fnm>AY</fnm>
					</au>
					<au>
						<snm>Roytberg</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Shabalina</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Kondrashov</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2002</pubdate>
				<volume>18</volume>
				<fpage>1703</fpage>
				<lpage>1704</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/18.12.1703</pubid>
						<pubid idtype="pmpid" link="fulltext">12490463</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B65">
				<title>
					<p>Molecular Cloning: a laboratory manual.</p>
				</title>
				<aug>
					<au>
						<snm>Sambrook</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Russell</snm>
						<fnm>DW</fnm>
					</au>
				</aug>
				<publisher>New York, Cold Spring Harbor Laboratory Press</publisher>
				<pubdate>2001</pubdate>
			</bibl>
			<bibl id="B66">
				<title>
					<p>A non-radioactive in situ hybridisation method for the localisation of specific RNAs in Drosophila embryos reveals translational control of the segmentation gene hunchback.</p>
				</title>
				<aug>
					<au>
						<snm>Tautz</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Pfeifle</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Chromosoma</source>
				<pubdate>1989</pubdate>
				<volume>98</volume>
				<fpage>81</fpage>
				<lpage>85</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/BF00291041</pubid>
						<pubid idtype="pmpid">2476281</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
