<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-7-48</ui>
   <ji>1471-2164</ji>
   <fm>
		<dochead>Database</dochead>
		<bibl>
			<title>
				<p>NovelFam3000 &#8211; Uncharacterized human protein domains conserved across model organisms</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Kemmer</snm>
					<fnm>Danielle</fnm>
					<insr iid="I1"/>
					<email>danielle@cmmt.ubc.ca</email>
				</au>
				<au id="A2">
					<snm>Podowski</snm>
					<mi>M</mi>
					<fnm>Raf</fnm>
					<insr iid="I1"/>
					<email>rpodowski@cmmt.ubc.ca</email>
				</au>
				<au id="A3">
					<snm>Arenillas</snm>
					<fnm>David</fnm>
					<insr iid="I2"/>
					<email>dave@cmmt.ubc.ca</email>
				</au>
				<au id="A4">
					<snm>Lim</snm>
					<fnm>Jonathan</fnm>
					<insr iid="I2"/>
					<email>jlim@cmmt.ubc.ca</email>
				</au>
				<au id="A5">
					<snm>Hodges</snm>
					<fnm>Emily</fnm>
					<insr iid="I1"/>
					<email>emily.hodges@ki.se</email>
				</au>
				<au id="A6">
					<snm>Roth</snm>
					<fnm>Peggy</fnm>
					<insr iid="I3"/>
					<email>peggy.roth@devbio.su.se</email>
				</au>
				<au id="A7">
					<snm>Sonnhammer</snm>
					<mi>LL</mi>
					<fnm>Erik</fnm>
					<insr iid="I1"/>
					<email>Erik.Sonnhammer@cgb.ki.se</email>
				</au>
				<au id="A8">
					<snm>H&#246;&#246;g</snm>
					<fnm>Christer</fnm>
					<insr iid="I1"/>
					<email>christer.hoog@ki.se</email>
				</au>
				<au id="A9" ca="yes">
					<snm>Wasserman</snm>
					<mi>W</mi>
					<fnm>Wyeth</fnm>
					<insr iid="I2"/>
					<insr iid="I4"/>
					<email>wyeth@cmmt.ubc.ca</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden</p>
				</ins>
				<ins id="I2">
					<p>Centre for Molecular Medicine and Therapeutics, University of British Columbia, Vancouver, Canada</p>
				</ins>
				<ins id="I3">
					<p>Department of Developmental Biology, Stockholm University, Stockholm, Sweden</p>
				</ins>
				<ins id="I4">
					<p>Department of Medical Genetics, University of British Columbia, Vancouver, Canada</p>
				</ins>
			</insg>
			<source>BMC Genomics</source>
			<issn>1471-2164</issn>
			<pubdate>2006</pubdate>
			<volume>7</volume>
			<issue>1</issue>
			<fpage>48</fpage>
			<url>http://www.biomedcentral.com/1471-2164/7/48</url>
			<xrefbib>
				<pubidlist>
					<pubid idtype="pmpid">16533400</pubid>
					<pubid idtype="doi">10.1186/1471-2164-7-48</pubid>
				</pubidlist>
			</xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>18</day>
					<month>11</month>
					<year>2005</year>
				</date>
			</rec>
			<acc>
				<date>
					<day>13</day>
					<month>3</month>
					<year>2006</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>13</day>
					<month>3</month>
					<year>2006</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2006</year>
			<collab>Kemmer et al; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Despite significant efforts from the research community, a large proportion of the proteins encoded by human genes lacks an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, many of which appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or of the protein within which it arises, can facilitate the analysis of an entire set of proteins.</p>
				</sec>
				<sec>
					<st>
						<p>Description</p>
					</st>
					<p>From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>Consistent experimental results across multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of sparsely annotated domain families.</p>
				</sec>
			</sec>
		</abs>
	</fm>
   <meta>
		<classifications>
			<classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
		</classifications>
	</meta>
   <bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>The number of protein-encoding human genes identified has reached a plateau <abbrgrp>
					<abbr bid="B1">1</abbr>
				</abbrgrp>, leaving researchers with the challenging task of ascribing biochemical function(s) to each protein <abbrgrp>
					<abbr bid="B2">2</abbr>
				</abbrgrp>. Broad genome sequencing and functional genomics studies, partially motivated by the goal of discovering the functions of uncharacterized proteins, have provided a distributed set of data collections that can catalyze the inference of protein function. While gene predictions and high-throughput genomics data can be of variable quality, studies have demonstrated that consistent results for interactions between homologous genes in multiple organisms, so-called Interolog Analysis, can be more reliable <abbrgrp>
					<abbr bid="B3">3</abbr>
					<abbr bid="B4">4</abbr>
					<abbr bid="B5">5</abbr>
				</abbrgrp>. Therefore, human protein characterization efforts that focus on similar proteins across multiple organisms are expected to more effectively capitalize on the available genomics data.</p>
			<p>The genome sequence annotation and functional genomics data of <it>Caenorhabditis elegans</it>, <it>Drosophila melanogaster</it>, and <it>Homo sapiens</it> (hereafter referred to as worm, fly, and human) provide the basis for the study of proteins conserved across metazoan species. In pursuing comparative genomics approaches for the inference of protein function, the initial selection of related proteins separated by great evolutionary distances can be a challenge. A choice must often be made between the study of homologous and orthologous proteins. In addition to the technical difficulties and controversies that can arise in ortholog identification, a conservative focus on orthologs greatly limits the number of proteins available for study. For homolog studies, grouping full-length protein sequences by similarity is not always feasible. The modular evolution of proteins presents a systematic complication: unrelated pairs of proteins can be linked through additional proteins sharing a domain with each pair (e.g. a protein with domains A and B may be linked to a protein with domains C and D via an intermediary protein with domains B and C). This problem is ameliorated by placing the focus on modular protein domain families, in which proteins are linked by the presence of a common domain <abbrgrp>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. Well-established resources describe protein domain families, including Pfam, InterPro, and Panther <abbrgrp>
					<abbr bid="B7">7</abbr>
					<abbr bid="B8">8</abbr>
					<abbr bid="B9">9</abbr>
				</abbrgrp>. Those domains observed in proteins from multiple species are likely to be most reliable <abbrgrp>
					<abbr bid="B10">10</abbr>
				</abbrgrp>.</p>
			<p>Characterization of protein function remains a fundamental challenge in functional genomics research. We have created the NovelFam3000 data centre to accelerate the study of uncharacterized domains conserved across worm, fly, and human. Building on domains identified in Pfam <abbrgrp>
					<abbr bid="B7">7</abbr>
				</abbrgrp>, we systematically link domain-containing proteins to functional genomics data in online databases. The NovelFam3000 system allows users to post both comments and experimental data. For a selected subset of the uncharacterized domain-containing families, we generate and post expression profiles and proteomic sub-cellular localization images. Specific examples are presented showing how a combination of experimental approaches and bioinformatics resources may elucidate functional characteristics of uncharacterized domains.</p>
		</sec>
		<sec>
			<st>
				<p>Construction and content</p>
			</st>
			<sec>
				<st>
					<p>Selection of uncharacterized domain families</p>
				</st>
				<p>The characterization state of each protein domain is dynamic, dependent both on the available experimental literature and the perspective of the observing scientist. Using the Pfam database <abbrgrp>
						<abbr bid="B7">7</abbr>
					</abbrgrp>, we extracted approximately 3000 protein domain families for which we judged minimal biochemical annotation to be available (hence the name NovelFam3000). We limited our search to protein families present in genes from three metazoan genomes (worm, fly, and human), for which there were multiple human protein members. Applying these criteria, we extracted 2785 Pfam-B domain families and 127 families of Domains of Unknown Function (DUFs). The Pfam-B and DUF classes are distinguished by the level of human curation: Pfam-B domains represent purely computational analysis, whereas DUFs have been subjected to curator review. Of these domains, 892 (32%) of the selected Pfam-B domains and 59 (46%) of the selected DUFs included at least one yeast protein member.</p>
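				<p>The proportions quoted above can be re-derived from the reported family counts. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative assumptions and are not part of the NovelFam3000 system.</p>
				<p>
```python
# Family counts as reported in the text; all names here are illustrative.
SELECTED = {"Pfam-B": 2785, "DUF": 127}
WITH_YEAST_MEMBER = {"Pfam-B": 892, "DUF": 59}


def yeast_percentage(family_class):
    """Percentage of selected families with at least one yeast member."""
    return 100.0 * WITH_YEAST_MEMBER[family_class] / SELECTED[family_class]


# Total number of selected families across both classes.
total_families = sum(SELECTED.values())

if __name__ == "__main__":
    print("total selected families:", total_families)
    for family_class in SELECTED:
        print(family_class, round(yeast_percentage(family_class)), "%")
```
				</p>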
			</sec>
			<sec>
				<st>
					<p>NovelFam3000 system overview</p>
				</st>
				<p>For the selected domains, we constructed a database and an annotation system that unites links to bioinformatics resources with user-submitted experimental data to accelerate inference of domain function <abbrgrp>
						<abbr bid="B11">11</abbr>
					</abbrgrp> (Figure <figr fid="F1">1</figr>) [see <supplr sid="S1">Additional file 1</supplr>]. Users may query the database either with identifiers (for genes or domains) or sequence. A submitted protein sequence is analyzed with a Hidden Markov Model (HMM) search <abbrgrp>
						<abbr bid="B12">12</abbr>
					</abbrgrp> to identify matches to DUFs included in NovelFam3000. Since there are currently no HMMs for Pfam-B domains, BLAST <abbrgrp>
						<abbr bid="B13">13</abbr>
					</abbrgrp> searches are performed against the ProDom protein sequence database <abbrgrp>
						<abbr bid="B14">14</abbr>
						<abbr bid="B15">15</abbr>
					</abbrgrp>, and the detected ProDom identifiers are mapped to corresponding Pfam-B accessions. For clarity, Pfam-B domains are derived from a subset of domains present in the ProDom database. The HMM-detected DUF matches and the BLAST-detected Pfam-B matches are displayed as search results. Based on the input, the user is taken to a "domain page" from which all reported family members can be perused.</p>
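				<p>The combination of the two search routes described above can be sketched as follows. This is a hypothetical illustration, not the NovelFam3000 implementation: the function name, the argument names, and the placeholder identifiers are all assumptions, and the HMM and BLAST searches themselves are taken as already run. Skipping ProDom hits that have no Pfam-B counterpart is likewise an assumption of this sketch.</p>
				<p>
```python
def collect_domain_matches(hmm_duf_hits, blast_prodom_hits, prodom_to_pfamb):
    """Merge HMM-detected DUF hits with BLAST-detected Pfam-B hits.

    hmm_duf_hits: DUF accessions reported by the HMM search.
    blast_prodom_hits: ProDom identifiers reported by the BLAST search.
    prodom_to_pfamb: dict mapping ProDom identifiers to Pfam-B accessions.
    """
    matches = [("DUF", accession) for accession in hmm_duf_hits]
    for prodom_id in blast_prodom_hits:
        pfamb_accession = prodom_to_pfamb.get(prodom_id)
        # Assumption: a ProDom hit with no Pfam-B counterpart is skipped,
        # since Pfam-B is derived from only a subset of ProDom domains.
        if pfamb_accession is not None:
            matches.append(("Pfam-B", pfamb_accession))
    return matches


# Placeholder identifiers for illustration; these are not real accessions.
example_mapping = {"PRODOM_X": "PFAMB_X"}
example = collect_domain_matches(
    ["DUF_X"], ["PRODOM_X", "PRODOM_Y"], example_mapping
)
```
				</p>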
				<suppl id="S1">
					<title>
						<p>Additional File 1</p>
					</title>
					<text>
						<p>Supplementary Figure. This drawing represents the NovelFam3000 database schema. Each rectangle, labeled with the table name at the top, represents a table in the database. The field names for each table are listed with symbols to the left. Primary Keys are denoted by a yellow key. Foreign Keys are denoted by a red diamond and "(FK)" after the field name. Regular Fields are denoted by a blue diamond. Relations between the tables are indicated by blue lines, with the diamond-end of the line at the referenced table and the other end at the referencing table. The relations are as follows: Rel_01: Comments can be made about a gene; Rel_02: News items can be associated with a gene; Rel_03: Resources can be associated with a gene; Rel_04: Experiments can be associated with a gene; Rel_05: Pfam sequences can be associated with a gene; Rel_06: Pfam sequences can be associated with a Pfam family; Rel_07: An experiment can be associated with multiple instances of experimental data.</p>
					</text>
					<file name="1471-2164-7-48-S1.eps">
						<p>Click here for file</p>
					</file>
				</suppl>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Screenshots of the NovelFam3000 interface</p>
					</caption>
					<text>
						<p>Screenshots of the NovelFam3000 interface. (a) The protein domain page displays protein members across model organisms for a given protein domain family. Detailed information about each gene is available via various hyperlinks including news, resource links, experimental evidence, and comments. Each link takes the user to a separate page with available link-specific information and the option to submit new data and comments. (b) The resource link displays hyperlinks to a set of bioinformatics databases containing information for a specific domain family member. For this human example, resources are divided into genomic resources, molecular interactions, and protein networks and resources. The user has the option to submit new resource links to the system.</p>
					</text>
					<graphic file="1471-2164-7-48-1"/>
				</fig>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Utility</p>
			</st>
			<sec>
				<st>
					<p>The NovelFam3000 annotation system</p>
				</st>
				<p>Users may view and post detailed information about each gene using four categories: i) resource links, linking to major bioinformatics resources, ii) news, highlighting the latest annotations submitted to the system, iii) comments, giving users the opportunity to view and post general comments regarding the domain-containing protein of interest, and iv) experimental evidence, displaying results submitted by individual researchers. At the bottom of each page displaying gene-specific information for one of the four categories, the user is prompted to submit new information. Submitted data are rendered accessible through the system within 24 hours, after brief editorial review to confirm relevance (i.e. to prevent posting of unrelated material).</p>
				<p>For each gene, links to both diverse external resources and user-submitted experimental results and comments are provided via a "gene page". Organism-centric resource links for each gene include WormBase <abbrgrp>
						<abbr bid="B16">16</abbr>
					</abbrgrp>, Flybase <abbrgrp>
						<abbr bid="B17">17</abbr>
					</abbrgrp>, and SGD <abbrgrp>
						<abbr bid="B18">18</abbr>
					</abbrgrp>. For human proteins, links are provided to genome browsers <abbrgrp>
						<abbr bid="B19">19</abbr>
						<abbr bid="B20">20</abbr>
					</abbrgrp> and the meta-database GeneLynx <abbrgrp>
						<abbr bid="B21">21</abbr>
					</abbrgrp>. For each protein, we provide links to the Biomolecular Interaction Network Database (BIND) <abbrgrp>
						<abbr bid="B22">22</abbr>
					</abbrgrp>, as well as to the Interolog Analysis system Ulysses <abbrgrp>
						<abbr bid="B5">5</abbr>
						<abbr bid="B23">23</abbr>
					</abbrgrp> that displays protein-protein interactions observed for homologous proteins across fly, worm, human, and yeast.</p>
				<p>Within the NovelFam3000 system, we report the Gene Characterization Index (GCI) for each human gene, providing users with a measure of the current knowledge of the gene's function. The GCI is a continuous score ranging from one (uncharacterized) to ten (fully characterized). The GCI system (Podowski <it>et al</it>., in preparation) is based on the results of a global survey of research biologists. Each participating scientist was given a sample of ten genes and returned an opinion on the characterization status of each. The survey covered a total of 100 genes with at least three-fold redundancy. A machine learning procedure was then used to create a scoring function that automatically predicts the GCI score for every gene in the human genome. In this step, a Support Vector Machine was trained on the survey results, together with the number of links to common databases (e.g. links to abstracts in PubMed or domains in Pfam).</p>
				<p>Both the gene-specific news and user comment features allow researchers to highlight recent publications and observations. The experimental evidence pages enable the user to view and submit experimental results for individual proteins. The option to post and view comments related to protein-specific information forms a basis for a general discussion forum and motivates scientific exchange and discussion between researchers.</p>
			</sec>
			<sec>
				<st>
					<p>Posting of laboratory results to the NovelFam3000 system</p>
				</st>
				<sec>
					<st>
						<p>Selection of sample set of genes</p>
					</st>
					<p>To demonstrate the capacity of the NovelFam3000 system to facilitate the inference of protein domain functions, we selected a set of 39 domain families for targeted experimental studies (Table <tblr tid="T1">1</tblr>). For 25 genes belonging to the target domain families, we confirmed expression in a panel of cell lines, cloned full-length cDNAs, and performed sub-cellular localization analysis [see <supplr sid="S2">Additional file 2</supplr>].</p>
					<suppl id="S2">
						<title>
							<p>Additional File 2</p>
						</title>
						<text>
							<p>Materials and methods. This file contains detailed materials and methods for both the database implementation and the experimental data.</p>
						</text>
						<file name="1471-2164-7-48-S2.doc">
							<p>Click here for file</p>
						</file>
					</suppl>
					<tbl id="T1">
						<title>
							<p>Table 1</p>
						</title>
						<caption>
							<p>List of selected domain family members for experimental validation</p>
						</caption>
						<tblbdy cols="4">
							<r>
								<c ca="left">
									<p>
										<b>Gene name</b>
									</p>
								</c>
								<c ca="center">
									<p>
										<b>GeneLynx</b>
									</p>
								</c>
								<c ca="center">
									<p>
										<b>Pfam domain family</b>
									</p>
								</c>
								<c ca="center">
									<p>
										<b>Domain name</b>
									</p>
								</c>
							</r>
							<r>
								<c cspan="4">
									<hr/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>NP_057480 (HSPC129)</p>
								</c>
								<c ca="center">
									<p>7247</p>
								</c>
								<c ca="center">
									<p>PF03031</p>
								</c>
								<c ca="center">
									<p>NIF</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>CTDSPL</p>
								</c>
								<c ca="center">
									<p>7767</p>
								</c>
								<c ca="center">
									<p>PF03031</p>
								</c>
								<c ca="center">
									<p>NIF</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>DULLARD</p>
								</c>
								<c ca="center">
									<p>3595</p>
								</c>
								<c ca="center">
									<p>PF03031</p>
								</c>
								<c ca="center">
									<p>NIF</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>SH3GL1</p>
								</c>
								<c ca="center">
									<p>5401</p>
								</c>
								<c ca="center">
									<p>PF03114</p>
								</c>
								<c ca="center">
									<p>BAR</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>SH3BP1</p>
								</c>
								<c ca="center">
									<p>14928</p>
								</c>
								<c ca="center">
									<p>PF03114</p>
								</c>
								<c ca="center">
									<p>BAR</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>PAPD1(FLJ10486)</p>
								</c>
								<c ca="center">
									<p>13623</p>
								</c>
								<c ca="center">
									<p>PF03828</p>
								</c>
								<c ca="center">
									<p>PAP_assoc</p>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB001357</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>BRIX_HUMAN</p>
								</c>
								<c ca="center">
									<p>13847</p>
								</c>
								<c ca="center">
									<p>PF04427</p>
								</c>
								<c ca="center">
									<p>Brix</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>IMP4</p>
								</c>
								<c ca="center">
									<p>3684</p>
								</c>
								<c ca="center">
									<p>PF04427</p>
								</c>
								<c ca="center">
									<p>Brix</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>NP_115675 (MGC2714)</p>
								</c>
								<c ca="center">
									<p>22343</p>
								</c>
								<c ca="center">
									<p>PF03556</p>
								</c>
								<c ca="center">
									<p>DUF298</p>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB003548</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>SUPT16H (MGC48972)</p>
								</c>
								<c ca="center">
									<p>10874</p>
								</c>
								<c ca="center">
									<p>PB025336</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB006003</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB005785</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB005762</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB005265</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>NP_055268 (CHMP2A, BC-2)</p>
								</c>
								<c ca="center">
									<p>8520</p>
								</c>
								<c ca="center">
									<p>PF03357</p>
								</c>
								<c ca="center">
									<p>SNF7</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>VPS24</p>
								</c>
								<c ca="center">
									<p>11651</p>
								</c>
								<c ca="center">
									<p>PF03357</p>
								</c>
								<c ca="center">
									<p>SNF7</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>DYM (FLJ20071)</p>
								</c>
								<c ca="center">
									<p>11668</p>
								</c>
								<c ca="center">
									<p>PB011701</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB013707</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB035957</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>KBTBD10 (SARCOSIN)</p>
								</c>
								<c ca="center">
									<p>364</p>
								</c>
								<c ca="center">
									<p>PF07707</p>
								</c>
								<c ca="center">
									<p>BACK</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>BTBD1</p>
								</c>
								<c ca="center">
									<p>13364</p>
								</c>
								<c ca="center">
									<p>PB006072</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>NP_060536 (FLJ10349)</p>
								</c>
								<c ca="center">
									<p>13579</p>
								</c>
								<c ca="center">
									<p>PF03029</p>
								</c>
								<c ca="center">
									<p>ATP_bind_1</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>NP_057385 (Protein x 0004)</p>
								</c>
								<c ca="center">
									<p>12221</p>
								</c>
								<c ca="center">
									<p>PF03029</p>
								</c>
								<c ca="center">
									<p>ATP_bind_1</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>NCOA7</p>
								</c>
								<c ca="center">
									<p>24012</p>
								</c>
								<c ca="center">
									<p>PF07534</p>
								</c>
								<c ca="center">
									<p>TLD</p>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB007099</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>SBNO1</p>
								</c>
								<c ca="center">
									<p>13705</p>
								</c>
								<c ca="center">
									<p>PB012709</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB006622</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB007895</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB008212</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB008801</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB017731</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB017508</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>PLCG2</p>
								</c>
								<c ca="center">
									<p>3287</p>
								</c>
								<c ca="center">
									<p>PB010400</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>PELI1_HUMAN</p>
								</c>
								<c ca="center">
									<p>17429</p>
								</c>
								<c ca="center">
									<p>PF04710</p>
								</c>
								<c ca="center">
									<p>Pellino</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>MCTS1</p>
								</c>
								<c ca="center">
									<p>11883</p>
								</c>
								<c ca="center">
									<p>PF01472</p>
								</c>
								<c ca="center">
									<p>PUA</p>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB003960</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>FAM49B (BM-009)</p>
								</c>
								<c ca="center">
									<p>11348</p>
								</c>
								<c ca="center">
									<p>PF07159</p>
								</c>
								<c ca="center">
									<p>DUF1394</p>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>ANKRD13</p>
								</c>
								<c ca="center">
									<p>9528</p>
								</c>
								<c ca="center">
									<p>PB004630</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c ca="left">
									<p>RABGAP1L (HHL protein, EVI-5 homolog)</p>
								</c>
								<c ca="center">
									<p>14943</p>
								</c>
								<c ca="center">
									<p>PF00566</p>
								</c>
								<c ca="center">
									<p>TBC</p>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB008569</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB010722</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB011629</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
							<r>
								<c>
									<p/>
								</c>
								<c>
									<p/>
								</c>
								<c ca="center">
									<p>PB008876</p>
								</c>
								<c>
									<p/>
								</c>
							</r>
						</tblbdy>
					</tbl>
				</sec>
				<sec>
					<st>
						<p>Sub-cellular localization</p>
					</st>
					<p>The function of proteins is, in part, defined by the cellular compartment within which they reside. Sub-cellular localization can be determined by visualization of recombinant proteins in amenable cell lines <abbrgrp>
							<abbr bid="B24">24</abbr>
							<abbr bid="B25">25</abbr>
						</abbrgrp>. We began the sub-cellular localization study by verifying that a set of predicted human genes was endogenously expressed in human cells. For this purpose, we screened the expression of the 25 selected genes in three human cell lines by reverse transcription polymerase chain reaction (RT-PCR) analysis [see <supplr sid="S3">Additional file 3</supplr>]. The human cell lines, chosen for their suitability for microscopy studies, included the hepatocarcinoma cell line PLC/PRF/5, the glial cell line U333CG/343 MG, and the fibroblast line HF-SV80. Of the 25 candidate genes, 20 were expressed in all three cell lines, three were expressed in two of the three cell lines, and transcripts for two genes were detected in only a single cell line. These observations confirmed the physiological expression of the predicted human genes. For sub-cellular screening, full-length human cDNAs were amplified from mRNA and cloned in-frame with an N-terminal FLAG tag. The 25 cloned, FLAG epitope-tagged recombinant proteins were analyzed by immunofluorescence microscopy. Individual transfection of each construct into mammalian cells, followed by expression and immunolocalization with monoclonal FLAG-specific antibodies, revealed the sub-cellular localization of the fusion proteins.</p>
					<suppl id="S3">
						<title>
							<p>Additional File 3</p>
						</title>
						<text>
							<p>Primer sequences. This file contains nucleotide sequences for the gene-specific primers used for RT-PCR amplification of predicted human genes.</p>
						</text>
						<file name="1471-2164-7-48-S3.doc">
							<p>Click here for file</p>
						</file>
					</suppl>
					<p>We performed an initial screen to distinguish between cytoplasmic and nuclear localization. This initial classification was followed by counterstaining experiments with multiple sub-cellular markers. Each marker was specific to a sub-cellular compartment, thus allowing a refined interpretation of the previously determined coarse staining patterns. During the primary analysis, we observed six fusion proteins localized to the nucleus and nine in the cytoplasm, while six appeared diffusely distributed over the entire cell. Four of the recombinant proteins did not give rise to any detectable staining pattern. All constructs were expressed in the three cell lines to confirm that the localization pattern observed for a given construct was identical irrespective of the cell type. In the second round of screening, this time limited to PLC/PRF/5 cells, we re-transfected those constructs that had previously given rise to distinct cellular localization patterns, and stained using either antibody markers or specific dyes for cellular structures to confirm co-localization.</p>
					<p>All of the expression data and microscopy images from the sub-cellular localization profiling were posted through the laboratory results service of the NovelFam3000 system.</p>
				</sec>
				<sec>
					<st>
						<p>Inference of potential domain properties</p>
					</st>
					<p>Within the targeted domain families, we sought to identify intra-family consistencies.</p>
					<p>For protein domain family PF04427 (Brix domain), human proteins BRIX_HUMAN and IMP4 localized to the nucleoli (Figure <figr fid="F2">2</figr>). The localization was confirmed by complete co-localization with fibrillarin, a nucleolar-specific marker. To test whether additional Brix domain family members localize to the nucleolus and to confirm consistency in sub-cellular localization across organisms, we isolated three <it>Drosophila </it>homologs of this domain family (CG32253, CG11920, and CG6712). The cDNAs of the fly genes were cloned into both fly and mammalian expression vectors. All three fly proteins localized to the nucleus in fly cells, displaying a consistent nucleolar staining pattern (CG6712 not shown). The expression of the <it>Drosophila </it>proteins in human HEK 293 cells was monitored by a C-terminal in-frame GFP tag (CG11920 not shown). The <it>Drosophila </it>proteins were found to accumulate in the nucleoli of the human cells, suggesting that the evolutionarily conserved protein domain might be implicated in the targeting of these proteins to nucleoli. These results complement published observations for family members in model organisms <abbrgrp>
							<abbr bid="B26">26</abbr>
						</abbrgrp> and, taken together, suggest that proteins carrying the domain perform specific functions in the nucleoli.</p>
					<fig id="F2">
						<title>
							<p>Figure 2</p>
						</title>
						<caption>
							<p>Immunolocalization of protein domain family PF04427 members</p>
						</caption>
						<text>
							<p>Immunolocalization of protein domain family PF04427 members. Human FLAG-tagged recombinant proteins detected with FLAG-specific antibodies in human PLC/PRF/5 cells (a: BRIX_HUMAN; e: IMP4). Fly His-tagged recombinant proteins detected with His-specific antibodies in fly cells (i: CG11920; l: CG32253). Fly GFP-tagged recombinant proteins detected in human HEK293 cells (n: CG32253; r: CG6712). Fibrillarin staining of nucleoli (b, f, o, s); composite images of the preceding stains (c, g, p, t); DAPI staining of nuclei (d, h, k, m, q, u).</p>
						</text>
						<graphic file="1471-2164-7-48-2"/>
					</fig>
					<p>Intra-family consistency was also observed for protein domain family PF03114 (BAR domain). Member proteins SH3BP1 and SH3GL1 both localized to cytoplasmic vesicles which appeared to merge with the cellular membrane forming protrusions (Figure <figr fid="F3">3</figr>). The familial consistency in the staining patterns observed suggests that the BAR domain is linked to vesicle transport and/or metabolism. The conserved domain might be part of a localization signal that directs the proteins to the observed locations.</p>
					<fig id="F3">
						<title>
							<p>Figure 3</p>
						</title>
						<caption>
							<p>Immunolocalization of protein domain family PF03114 members</p>
						</caption>
						<text>
							<p>Immunolocalization of protein domain family PF03114 members. FLAG-tagged recombinant proteins detected with FLAG-specific antibodies (a: SH3GL1; c: SH3BP1); DAPI staining of nuclei (b, d). Both proteins show similar vesicular localization patterns. The pattern is distinct from those obtained with multiple cellular markers for vesicle structures.</p>
						</text>
						<graphic file="1471-2164-7-48-3"/>
					</fig>
					<p>We observed an example in which members of the same domain family displayed different, distinct cellular localization patterns (Figure <figr fid="F4">4</figr>). Over-expression of NP_057480 (HSPC129) of domain family PF03031 (NLI interacting factor-like phosphatase domain) in PLC/PRF/5 cells gave a clear and strong staining of the nuclear envelope presenting budding structures. DULLARD of that same domain family displayed a cytoplasmic staining pattern localizing to the endoplasmic reticulum (ER), as confirmed by calnexin counterstaining. Furthermore, family member CTDSPL co-localized with MitoTracker, a mitochondrion-specific cell-permeant fluorescent dye. These results indicate that the function of this domain is not linked to a specific sub-cellular location.</p>
					<fig id="F4">
						<title>
							<p>Figure 4</p>
						</title>
						<caption>
							<p>Diverse localization patterns observed by immunolocalization of FLAG-tagged recombinant proteins detected with FLAG-specific antibodies</p>
						</caption>
						<text>
							<p>Diverse localization patterns observed by immunolocalization of FLAG-tagged recombinant proteins detected with FLAG-specific antibodies. NP_057480 (HSPC129) of protein domain family PF03031 localizes to the nuclear rim (a), DULLARD is found in the ER (c) as confirmed by calnexin counterstaining (d, e), and CTDSPL is present in mitochondria (g) as confirmed by staining with MitoTracker Red CMX dye (h, i). DAPI staining for nuclei (b, f, k).</p>
						</text>
						<graphic file="1471-2164-7-48-4"/>
					</fig>
				</sec>
				<sec>
					<st>
						<p>Combining results from multiple sources via NovelFam3000</p>
					</st>
					<p>NP_055268 (CHMP2A, BC-2), a member of protein family PF03357 (SNF7 domain, previously DUF279), gave rise to a unique cytoplasmic staining pattern (Figure <figr fid="F5">5</figr>). We tested hypothetical co-localization with the Golgi, the ER, and mitochondria by counterstaining using corresponding markers (data not shown), but could not attribute the pattern of NP_055268 (CHMP2A, BC-2) to any previously defined sub-cellular location. Linking from NovelFam3000 to the Ulysses system, conserved networks in the model organisms suggest that NP_055268 (CHMP2A, BC-2) is a protein involved in pre-vacuolar endosome protein sorting and transport, a hypothesis supported by a previous study <abbrgrp>
							<abbr bid="B27">27</abbr>
						</abbrgrp>. CHMP2A has also been shown to be expressed in the nucleus, possibly having a role in gene silencing <abbrgrp>
							<abbr bid="B28">28</abbr>
						</abbrgrp>. This dual expression pattern is reminiscent of the expression pattern of a related gene, CHMP1, that has been postulated to have a cytoplasmic role in vesicle trafficking, but also a role within the nuclear matrix <abbrgrp>
							<abbr bid="B29">29</abbr>
							<abbr bid="B30">30</abbr>
						</abbrgrp>.</p>
					<fig id="F5">
						<title>
							<p>Figure 5</p>
						</title>
						<caption>
							<p>Novel immunolocalization pattern of FLAG-tagged recombinant protein detected with FLAG-specific antibodies (a)</p>
						</caption>
						<text>
							<p>Novel immunolocalization pattern of FLAG-tagged recombinant protein detected with FLAG-specific antibodies (a). DAPI staining for nuclei (b). NP_055268 (CHMP2A, BC-2) of protein family PF03357 forms distinct cytoplasmic structures in PLC/PRF/5 cells.</p>
						</text>
						<graphic file="1471-2164-7-48-5"/>
					</fig>
					<p>In addition to the analysis of paralogous human genes (derived by duplication), similarities between family members can be considered across species (ortholog analysis). For those selected proteins present in yeast, we extracted and reviewed sub-cellular localization and interacting protein partners. We show in two examples how the integration of functional data from studies of homologous yeast proteins reveals the broad conservation of function.</p>
					<p>Yeast proteins containing the Brix domain (PF04427) and their interacting partners have been localized to the nucleolus <abbrgrp>
							<abbr bid="B31">31</abbr>
						</abbrgrp>. Imp4p is a specific component of the U3 snoRNP and is required for pre-18S rRNA processing. Brx1p is implicated in the biogenesis of the 60S ribosomal subunit. The functional differences of human homologs, BRIX_HUMAN and IMP4, are reflected in their observed nucleolar, yet distinct localization patterns (Figure <figr fid="F2">2</figr>).</p>
					<p>Protein localization and interaction data from yeast studies complement the observed localization of human NP_057480 (HSPC129) and DULLARD, both from protein family PF03031 (Figure <figr fid="F4">4</figr>). A yeast homolog containing the NIF domain, nem1, is described as a trans-membrane protein localizing to the membranes of the ER and the nucleus <abbrgrp>
							<abbr bid="B32">32</abbr>
						</abbrgrp>. Nem1's specific molecular function is unknown. Protein interaction studies with nem1 have identified three interacting partners (nup84, nup85, nup120), all components of the yeast nuclear pore complex (NPC) <abbrgrp>
							<abbr bid="B33">33</abbr>
						</abbrgrp>. Despite the strong links to the NPC and the localization to the nuclear membrane, we are not convinced that NP_057480 (HSPC129) is a direct component of the vertebrate NPC, since its nuclear rim staining does not show a punctate pattern &#8211; a general feature of NPC elements <abbrgrp>
							<abbr bid="B34">34</abbr>
						</abbrgrp>. Based on the consistency among yeast network members, we identified the human orthologs for the interacting partners. Human NUP107 (related to yeast nup84) supports the NPC link, as this protein is required for the assembly of a subset of "Nup" proteins into the NPC <abbrgrp>
							<abbr bid="B35">35</abbr>
						</abbrgrp>. From the analysis of NP_057480 (HSPC129), its homologs and interacting partners, we hypothesize that this protein is an uncharacterized NPC-associated protein.</p>
				</sec>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion and conclusion</p>
			</st>
			<p>Based on comparative genome analysis across multiple organisms, protein families have been identified containing domains for which minimal functional annotation is available. From the Pfam database <abbrgrp>
					<abbr bid="B7">7</abbr>
				</abbrgrp> we extracted uncharacterized domain families conserved across vast evolutionary distance, suggesting a well-defined and important cellular role. To elucidate the cellular function of individual proteins, we created the NovelFam3000 system to integrate links to diverse resources, provide an interface for scientific discourse and comments, and house relevant experimental data. As a demonstration, we explored the properties of several domain families and used the NovelFam3000 system to develop data-based inferences.</p>
			<p>Existing data mining tools <abbrgrp>
					<abbr bid="B36">36</abbr>
					<abbr bid="B37">37</abbr>
					<abbr bid="B38">38</abbr>
				</abbrgrp> collect information and provide ample annotation for predicted genes and gene products from scattered resources. These tools are generally species-specific or concentrate on specific gene properties such as gene expression <abbrgrp>
					<abbr bid="B39">39</abbr>
				</abbrgrp> or gene associations <abbrgrp>
					<abbr bid="B40">40</abbr>
				</abbrgrp>. The NovelFam3000 system is a powerful tool for internet-based information exchange and is unique in its focus on active community participation.</p>
			<p>There are aspects of the NovelFam3000 system which are reminiscent of the popular WIKI group communication systems <abbrgrp>
					<abbr bid="B41">41</abbr>
				</abbrgrp>. In comparison, the BioWiki project <abbrgrp>
					<abbr bid="B42">42</abbr>
				</abbrgrp> promises to provide a system for shared content editing, which may be well suited for ontology development projects. While WIKI systems are predicated on user editing of posted information, NovelFam3000 was implemented without the community editing functions, as laboratory data should only be subject to corrections from the source investigator. However, NovelFam3000 does allow critiques of experimental results to be posted (subject to editorial review to ensure the relevance of postings). In combining a WIKI-like interface with a broad collection of hyperlinks to gene-centric and experimental databases, NovelFam3000 is a unique tool to facilitate inference of protein domain functions.</p>
			<p>As the structure of the NovelFam3000 data centre is suitable for any number of projects predicated on the collaborative analysis of sets of genes, the underlying software has been made available on the website &#8211; provided as an open-source program with no restrictions on the use or redistribution of the code. Already the software has been revised for use in a large genomics project (the Pleiades Project <abbrgrp>
					<abbr bid="B43">43</abbr>
				</abbrgrp>), with only modest software revision required. Thus, the NovelFam3000 software stands as an important product of this research effort.</p>
			<p>We populated the NovelFam3000 experimental data service with an initial panel of results for 25 genes from 39 domain families. The transcripts were detected by RT-PCR and cloned, confirming active transcription. We assigned the proteins to distinct sub-cellular compartments by epitope tagging followed by immunolocalization of the fusion proteins. Consistent localization across members of a protein domain family suggests that the function of the domain is directly linked to location. In some cases, the experimental localization data was complemented by the properties of interacting partners of model organism family members.</p>
			<p>The race to functional annotation runs at full speed and the level of cellular characterization of genes is constantly changing. The Gene Characterization Index score displayed in NovelFam3000 provides a dynamic indicator of the status of annotation for each gene. As upward shifts in GCI scores are indicative of advances in the elucidation of the functions of genes in NovelFam3000, dramatic changes will be highlighted on the homepage of the system.</p>
			<p>The NovelFam3000 system facilitates community-based curation of gene information.</p>
		</sec>
		<sec>
			<st>
				<p>Availability</p>
			</st>
			<p>NovelFam3000 is publicly available and can be accessed at <url>http://www.cisreg.ca/novelfam3000/</url>. The NovelFam3000 software is available for download without restrictions on the website.</p>
		</sec>
		<sec>
			<st>
				<p>Authors' contributions</p>
			</st>
			<p>DK participated in the design of the study and generated experimental data. DK, CH, and WWW drafted the manuscript. RMP developed the Gene Characterization Index and contributed to the database design. DA and JL carried out the database development. EH and PR contributed experimental data for model organisms. CH coordinated the generation of the experimental data for the database. ELLS supervised the initial compilation of a collection of novel-domain containing proteins. WWW conceived of the NovelFam3000 database and assisted in the interface design. All authors read and approved the final manuscript.</p>
		</sec>
	</bdy>
   <bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>Thanks to Mark Gurney who helped shape our approach to the study of novelty in the human genome. We are grateful for support and suggestions from Claes Wahlestedt, Luis Parodi, Ismail Kola, Michael Hsing, Qiaolin Deng, and Lars Arvestad. This project was funded with financial support from the Pharmacia Corp. to the Center for Genomics and Bioinformatics, and the software development was partially supported by funds from Merck-Frost to the Centre for Molecular Medicine and Therapeutics. W.W.W. acknowledges the support of the Canadian Institutes of Health Research and the Michael Smith Foundation for Health Research.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Has the yo-yo stopped? An assessment of human protein-coding gene number</p>
				</title>
				<aug>
					<au>
						<snm>Southan</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Proteomics</source>
				<pubdate>2004</pubdate>
				<volume>4</volume>
				<fpage>1712</fpage>
				<lpage>1726</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/pmic.200300700</pubid>
						<pubid idtype="pmpid" link="fulltext">15174140</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Annotating the human proteome</p>
				</title>
				<aug>
					<au>
						<snm>Orchard</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hermjakob</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Mol Cell Proteomics</source>
				<pubdate>2005</pubdate>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">15691850</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>A Gene Coexpression Network for Global Discovery of Conserved Genetic Modules</p>
				</title>
				<aug>
					<au>
						<snm>Stuart</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Segal</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Koller</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>SK</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2003</pubdate>
				<volume>21</volume>
				<fpage>21</fpage>
			</bibl>
			<bibl id="B4">
				<title>
					<p>SGP-1: prediction and validation of homologous genes based on sequence alignments</p>
				</title>
				<aug>
					<au>
						<snm>Wiehe</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Gebauer-Jung</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Mitchell-Olds</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Guigo</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>1574</fpage>
				<lpage>1583</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">311140</pubid>
						<pubid idtype="pmpid" link="fulltext">11544202</pubid>
						<pubid idtype="doi">10.1101/gr.177401</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Ulysses - an application for the projection of molecular interactions across species</p>
				</title>
				<aug>
					<au>
						<snm>Kemmer</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Shah</snm>
						<fnm>SP</fnm>
					</au>
					<au>
						<snm>Lim</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Brumm</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Yuen</snm>
						<fnm>MM</fnm>
					</au>
					<au>
						<snm>Ling</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Wasserman</snm>
						<fnm>WW</fnm>
					</au>
					<au>
						<snm>Ouellette</snm>
						<fnm>BF</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<fpage>R106</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1186/gb-2005-6-12-r106</pubid>
						<pubid idtype="pmpid" link="fulltext">16356269</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Protein domain analysis in the era of complete genomes</p>
				</title>
				<aug>
					<au>
						<snm>Copley</snm>
						<fnm>RR</fnm>
					</au>
					<au>
						<snm>Doerks</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Letunic</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>FEBS Lett</source>
				<pubdate>2002</pubdate>
				<volume>513</volume>
				<fpage>129</fpage>
				<lpage>134</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0014-5793(01)03289-6</pubid>
						<pubid idtype="pmpid" link="fulltext">11911892</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>The Pfam protein families database</p>
				</title>
				<aug>
					<au>
						<snm>Bateman</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Coin</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Durbin</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Finn</snm>
						<fnm>RD</fnm>
					</au>
					<au>
						<snm>Hollich</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Griffiths-Jones</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Khanna</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Marshall</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Moxon</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sonnhammer</snm>
						<fnm>EL</fnm>
					</au>
					<au>
						<snm>Studholme</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Yeats</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Eddy</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<fpage>D138</fpage>
				<lpage>41</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308855</pubid>
						<pubid idtype="pmpid" link="fulltext">14681378</pubid>
						<pubid idtype="doi">10.1093/nar/gkh121</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>The InterPro Database, 2003 brings increased coverage and new features</p>
				</title>
				<aug>
					<au>
						<snm>Mulder</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Attwood</snm>
						<fnm>TK</fnm>
					</au>
					<au>
						<snm>Bairoch</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Barrell</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Bateman</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Binns</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Biswas</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Bradley</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bucher</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Copley</snm>
						<fnm>RR</fnm>
					</au>
					<au>
						<snm>Courcelle</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Das</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Durbin</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Falquet</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Fleischmann</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Griffiths-Jones</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Haft</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Harte</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Hulo</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Kahn</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Kanapin</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Krestyaninova</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Letunic</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Lonsdale</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Silventoinen</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Orchard</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Pagni</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Peyruc</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Ponting</snm>
						<fnm>CP</fnm>
					</au>
					<au>
						<snm>Selengut</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Servant</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Sigrist</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Vaughan</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Zdobnov</snm>
						<fnm>EM</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>315</fpage>
				<lpage>318</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165493</pubid>
						<pubid idtype="pmpid" link="fulltext">12520011</pubid>
						<pubid idtype="doi">10.1093/nar/gkg046</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>The PANTHER database of protein families, subfamilies, functions and pathways</p>
				</title>
				<aug>
					<au>
						<snm>Mi</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Lazareva-Ulitsky</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Loo</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Kejariwal</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Vandergriff</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Rabkin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Guo</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Muruganujan</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Doremieux</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Campbell</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Kitano</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Thomas</snm>
						<fnm>PD</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2005</pubdate>
				<volume>33</volume>
				<fpage>D284</fpage>
				<lpage>8</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">540032</pubid>
						<pubid idtype="pmpid" link="fulltext">15608197</pubid>
						<pubid idtype="doi">10.1093/nar/gki078</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Comparative genomics of the eukaryotes</p>
				</title>
				<aug>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Yandell</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Wortman</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Gabor Miklos</snm>
						<fnm>GL</fnm>
					</au>
					<au>
						<snm>Nelson</snm>
						<fnm>CR</fnm>
					</au>
					<au>
						<snm>Hariharan</snm>
						<fnm>IK</fnm>
					</au>
					<au>
						<snm>Fortini</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Fleischmann</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Cherry</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Henikoff</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Skupski</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Misra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ashburner</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Boguski</snm>
						<fnm>MS</fnm>
					</au>
					<au>
						<snm>Brody</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Brokstein</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Chervitz</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Coates</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Cravchik</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Gabrielian</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Galle</snm>
						<fnm>RF</fnm>
					</au>
					<au>
						<snm>Gelbart</snm>
						<fnm>WM</fnm>
					</au>
					<au>
						<snm>George</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Goldstein</snm>
						<fnm>LS</fnm>
					</au>
					<au>
						<snm>Gong</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Guan</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>NL</fnm>
					</au>
					<au>
						<snm>Hay</snm>
						<fnm>BA</fnm>
					</au>
					<au>
						<snm>Hoskins</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Hynes</snm>
						<fnm>RO</fnm>
					</au>
					<au>
						<snm>Jones</snm>
						<fnm>SJ</fnm>
					</au>
					<au>
						<snm>Kuehl</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Lemaitre</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Littleton</snm>
						<fnm>JT</fnm>
					</au>
					<au>
						<snm>Morrison</snm>
						<fnm>DK</fnm>
					</au>
					<au>
						<snm>Mungall</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>O'Farrell</snm>
						<fnm>PH</fnm>
					</au>
					<au>
						<snm>Pickeral</snm>
						<fnm>OK</fnm>
					</au>
					<au>
						<snm>Shue</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Vosshall</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Zheng</snm>
						<fnm>XH</fnm>
					</au>
					<au>
						<snm>Lewis</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2000</pubdate>
				<volume>287</volume>
				<fpage>2204</fpage>
				<lpage>2215</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.287.5461.2204</pubid>
						<pubid idtype="pmpid" link="fulltext">10731134</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>The NovelFam3000 Data Center</p>
				</title>
				<url>http://www.cisreg.ca/novelfam3000/</url>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Profile hidden Markov models</p>
				</title>
				<aug>
					<au>
						<snm>Eddy</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>1998</pubdate>
				<volume>14</volume>
				<fpage>755</fpage>
				<lpage>763</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/14.9.755</pubid>
						<pubid idtype="pmpid" link="fulltext">9918945</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Basic local alignment search tool</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Gish</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1990</pubdate>
				<volume>215</volume>
				<fpage>403</fpage>
				<lpage>410</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.1990.9999</pubid>
						<pubid idtype="pmpid" link="fulltext">2231712</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>ProDom: automated clustering of homologous domains</p>
				</title>
				<aug>
					<au>
						<snm>Servant</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Bru</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Carrere</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Courcelle</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Gouzy</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Peyruc</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Kahn</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Brief Bioinform</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>246</fpage>
				<lpage>251</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bib/3.3.246</pubid>
						<pubid idtype="pmpid" link="fulltext">12230033</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>ProDom</p>
				</title>
				<url>http://protein.toulouse.inra.fr/prodom.html</url>
			</bibl>
			<bibl id="B16">
				<title>
					<p>WormBase: a comprehensive data resource for Caenorhabditis biology and genomics</p>
				</title>
				<aug>
					<au>
						<snm>Chen</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>TW</fnm>
					</au>
					<au>
						<snm>Antoshechkin</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Bastiani</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Bieri</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Blasiar</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Bradnam</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Canaran</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Chan</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>CK</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>WJ</fnm>
					</au>
					<au>
						<snm>Cunningham</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Kenny</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Kishore</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Lawson</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Lee</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Muller</snm>
						<fnm>HM</fnm>
					</au>
					<au>
						<snm>Nakamura</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Pai</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ozersky</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Petcherski</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Rogers</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Sabo</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Schwarz</snm>
						<fnm>EM</fnm>
					</au>
					<au>
						<snm>Van Auken</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Durbin</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Spieth</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Sternberg</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Stein</snm>
						<fnm>LD</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2005</pubdate>
				<volume>33 Database Issue</volume>
				<fpage>D383</fpage>
				<lpage>D389</lpage>
			</bibl>
			<bibl id="B17">
				<title>
					<p>FlyBase: genes and gene models</p>
				</title>
				<aug>
					<au>
						<snm>Drysdale</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Crosby</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Gelbart</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Campbell</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Emmert</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Matthews</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Russo</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Schroeder</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Smutniak</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Zytkovicz</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Ashburner</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>de Grey</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Foulger</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Millburn</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Sutherland</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Yamada</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Kaufman</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Matthews</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>DeAngelo</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Cook</snm>
						<fnm>RK</fnm>
					</au>
					<au>
						<snm>Gilbert</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Goodman</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Grumbling</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Sheth</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Strelets</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Lewis</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Misra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Shu</snm>
						<fnm>SQ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2005</pubdate>
				<volume>33 Database Issue</volume>
				<fpage>D390</fpage>
				<lpage>D395</lpage>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms</p>
				</title>
				<aug>
					<au>
						<snm>Christie</snm>
						<fnm>KR</fnm>
					</au>
					<au>
						<snm>Weng</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Balakrishnan</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Costanzo</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Dolinski</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Dwight</snm>
						<fnm>SS</fnm>
					</au>
					<au>
						<snm>Engel</snm>
						<fnm>SR</fnm>
					</au>
					<au>
						<snm>Feierbach</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Fisk</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Hirschman</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Hong</snm>
						<fnm>EL</fnm>
					</au>
					<au>
						<snm>Issel-Tarver</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Nash</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Sethuraman</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Starr</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Theesfeld</snm>
						<fnm>CL</fnm>
					</au>
					<au>
						<snm>Andrada</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Binkley</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Dong</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Lane</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Schroeder</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Botstein</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Cherry</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32 Database Issue</volume>
				<fpage>D311</fpage>
				<lpage>D314</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1093/nar/gkh033</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>The human genome browser at UCSC</p>
				</title>
				<aug>
					<au>
						<snm>Kent</snm>
						<fnm>WJ</fnm>
					</au>
					<au>
						<snm>Sugnet</snm>
						<fnm>CW</fnm>
					</au>
					<au>
						<snm>Furey</snm>
						<fnm>TS</fnm>
					</au>
					<au>
						<snm>Roskin</snm>
						<fnm>KM</fnm>
					</au>
					<au>
						<snm>Pringle</snm>
						<fnm>TH</fnm>
					</au>
					<au>
						<snm>Zahler</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Haussler</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>996</fpage>
				<lpage>1006</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">186604</pubid>
						<pubid idtype="pmpid" link="fulltext">12045153</pubid>
						<pubid idtype="doi">10.1101/gr.229102</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>The Ensembl automatic gene annotation system</p>
				</title>
				<aug>
					<au>
						<snm>Curwen</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Eyras</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Andrews</snm>
						<fnm>TD</fnm>
					</au>
					<au>
						<snm>Clarke</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Mongin</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Searle</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Clamp</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>942</fpage>
				<lpage>950</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">479124</pubid>
						<pubid idtype="pmpid" link="fulltext">15123590</pubid>
						<pubid idtype="doi">10.1101/gr.1858004</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>GeneLynx: a gene-centric portal to the human genome</p>
				</title>
				<aug>
					<au>
						<snm>Lenhard</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Hayes</snm>
						<fnm>WS</fnm>
					</au>
					<au>
						<snm>Wasserman</snm>
						<fnm>WW</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>2151</fpage>
				<lpage>2157</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">311228</pubid>
						<pubid idtype="pmpid" link="fulltext">11731507</pubid>
						<pubid idtype="doi">10.1101/gr.199801</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>BIND: the Biomolecular Interaction Network Database</p>
				</title>
				<aug>
					<au>
						<snm>Bader</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Betel</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Hogue</snm>
						<fnm>CW</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>248</fpage>
				<lpage>250</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165503</pubid>
						<pubid idtype="pmpid" link="fulltext">12519993</pubid>
						<pubid idtype="doi">10.1093/nar/gkg056</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Ulysses</p>
				</title>
				<url>http://www.cisreg.ca/ulysses</url>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing</p>
				</title>
				<aug>
					<au>
						<snm>Simpson</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Wellenreuther</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Poustka</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Pepperkok</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Wiemann</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>EMBO Reports</source>
				<pubdate>2000</pubdate>
				<volume>1</volume>
				<fpage>287</fpage>
				<lpage>292</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1083732</pubid>
						<pubid idtype="pmpid" link="fulltext">11256614</pubid>
						<pubid idtype="doi">10.1093/embo-reports/kvd058</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>A visual intracellular classification strategy for uncharacterized human proteins</p>
				</title>
				<aug>
					<au>
						<snm>Hoja</snm>
						<fnm>MR</fnm>
					</au>
					<au>
						<snm>Wahlestedt</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>H&#246;&#246;g</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Exp Cell Res</source>
				<pubdate>2000</pubdate>
				<volume>259</volume>
				<fpage>239</fpage>
				<lpage>246</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/excr.2000.4948</pubid>
						<pubid idtype="pmpid" link="fulltext">10942595</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>The Brix domain protein family &#8211; a key to the ribosomal biogenesis pathway?</p>
				</title>
				<aug>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Wechselberger</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Kreil</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Trends Biochem Sci</source>
				<pubdate>2001</pubdate>
				<volume>26</volume>
				<fpage>345</fpage>
				<lpage>347</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0968-0004(01)01851-5</pubid>
						<pubid idtype="pmpid" link="fulltext">11406393</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Mammalian class E Vps proteins, SBP1 and mVps2/CHMP2A, interact with and regulate the function of an AAA-ATPase SKD1/Vps4B</p>
				</title>
				<aug>
					<au>
						<snm>Fujita</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Umezuki</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Imamura</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Ishikawa</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Uchimura</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Nara</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Yoshimori</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hayashizaki</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Kawai</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ishidoh</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Tanaka</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Himeno</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>J Cell Sci</source>
				<pubdate>2004</pubdate>
				<volume>117</volume>
				<fpage>2997</fpage>
				<lpage>3009</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1242/jcs.01170</pubid>
						<pubid idtype="pmpid" link="fulltext">15173323</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Accelerated discovery of novel protein function in cultured human cells</p>
				</title>
				<aug>
					<au>
						<snm>Hodges</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Redelius</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>H&#246;&#246;g</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Mol Cell Proteomics</source>
				<pubdate>2005</pubdate>
				<volume>4</volume>
				<fpage>1319</fpage>
				<lpage>1327</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/mcp.M500117-MCP200</pubid>
						<pubid idtype="pmpid" link="fulltext">15965266</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>CHMP1 functions as a member of a newly defined family of vesicle trafficking proteins</p>
				</title>
				<aug>
					<au>
						<snm>Howard</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Stauffer</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Degnin</snm>
						<fnm>CR</fnm>
					</au>
					<au>
						<snm>Hollenberg</snm>
						<fnm>SM</fnm>
					</au>
				</aug>
				<source>J Cell Sci</source>
				<pubdate>2001</pubdate>
				<volume>114</volume>
				<fpage>2395</fpage>
				<lpage>2404</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11559748</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>CHMP1 is a novel nuclear matrix protein affecting chromatin structure and cell-cycle progression</p>
				</title>
				<aug>
					<au>
						<snm>Stauffer</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Howard</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Nyun</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hollenberg</snm>
						<fnm>SM</fnm>
					</au>
				</aug>
				<source>J Cell Sci</source>
				<pubdate>2001</pubdate>
				<volume>114</volume>
				<fpage>2383</fpage>
				<lpage>2393</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11559747</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Functional analysis in yeast of the Brix protein superfamily involved in the biogenesis of ribosomes</p>
				</title>
				<aug>
					<au>
						<snm>Bogengruber</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Briza</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Doppler</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Wimmer</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Koller</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Fasiolo</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Senger</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Hegemann</snm>
						<fnm>JH</fnm>
					</au>
					<au>
						<snm>Breitenbach</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>FEMS Yeast Res</source>
				<pubdate>2003</pubdate>
				<volume>3</volume>
				<fpage>35</fpage>
				<lpage>43</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12702244</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>A novel complex of membrane proteins required for formation of a spherical nucleus</p>
				</title>
				<aug>
					<au>
						<snm>Siniossoglou</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Santos-Rosa</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Rappsilber</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Mann</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hurt</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>1998</pubdate>
				<volume>17</volume>
				<fpage>6449</fpage>
				<lpage>6464</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1170993</pubid>
						<pubid idtype="pmpid" link="fulltext">9822591</pubid>
						<pubid idtype="doi">10.1093/emboj/17.22.6449</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Structure and assembly of the Nup84p complex</p>
				</title>
				<aug>
					<au>
						<snm>Siniossoglou</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Lutzmann</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Santos-Rosa</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Leonard</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Mueller</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Aebi</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Hurt</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>J Cell Biol</source>
				<pubdate>2000</pubdate>
				<volume>149</volume>
				<fpage>41</fpage>
				<lpage>54</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1083/jcb.149.1.41</pubid>
						<pubid idtype="pmpid" link="fulltext">10747086</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Proteomic analysis of the mammalian nuclear pore complex</p>
				</title>
				<aug>
					<au>
						<snm>Cronshaw</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Krutchinsky</snm>
						<fnm>AN</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Chait</snm>
						<fnm>BT</fnm>
					</au>
					<au>
						<snm>Matunis</snm>
						<fnm>MJ</fnm>
					</au>
				</aug>
				<source>J Cell Biol</source>
				<pubdate>2002</pubdate>
				<volume>158</volume>
				<fpage>915</fpage>
				<lpage>927</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1083/jcb.200206106</pubid>
						<pubid idtype="pmpid" link="fulltext">12196509</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>The conserved Nup107-160 complex is critical for nuclear pore complex assembly</p>
				</title>
				<aug>
					<au>
						<snm>Walther</snm>
						<fnm>TC</fnm>
					</au>
					<au>
						<snm>Alves</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Pickersgill</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Loiodice</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Hetzer</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Galy</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Hulsmann</snm>
						<fnm>BB</fnm>
					</au>
					<au>
						<snm>Kocher</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Wilm</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Allen</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Mattaj</snm>
						<fnm>IW</fnm>
					</au>
					<au>
						<snm>Doye</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2003</pubdate>
				<volume>113</volume>
				<fpage>195</fpage>
				<lpage>206</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(03)00235-6</pubid>
						<pubid idtype="pmpid" link="fulltext">12705868</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>The Universal Protein Resource (UniProt)</p>
				</title>
				<aug>
					<au>
						<snm>Bairoch</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>CH</fnm>
					</au>
					<au>
						<snm>Barker</snm>
						<fnm>WC</fnm>
					</au>
					<au>
						<snm>Boeckmann</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Ferro</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gasteiger</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Magrane</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Natale</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>O'Donovan</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Redaschi</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Yeh</snm>
						<fnm>LS</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2005</pubdate>
				<volume>33 Database Issue</volume>
				<fpage>D154</fpage>
				<lpage>D159</lpage>
			</bibl>
			<bibl id="B37">
				<title>
					<p>GeneCards: a novel functional genomics compendium with automated data mining and query reformulation support</p>
				</title>
				<aug>
					<au>
						<snm>Rebhan</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Chalifa-Caspi</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Prilusky</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Lancet</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>1998</pubdate>
				<volume>14</volume>
				<fpage>656</fpage>
				<lpage>664</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/14.8.656</pubid>
						<pubid idtype="pmpid" link="fulltext">9789091</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Entrez Gene: gene-centered information at NCBI</p>
				</title>
				<aug>
					<au>
						<snm>Maglott</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Ostell</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Pruitt</snm>
						<fnm>KD</fnm>
					</au>
					<au>
						<snm>Tatusova</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2005</pubdate>
				<volume>33</volume>
				<fpage>D54</fpage>
				<lpage>D58</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">539985</pubid>
						<pubid idtype="pmpid" link="fulltext">15608257</pubid>
						<pubid idtype="doi">10.1093/nar/gki031</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Large-scale analysis of the human and mouse transcriptomes</p>
				</title>
				<aug>
					<au>
						<snm>Su</snm>
						<fnm>AI</fnm>
					</au>
					<au>
						<snm>Cooke</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Ching</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Hakak</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Walker</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Wiltshire</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Orth</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Vega</snm>
						<fnm>RG</fnm>
					</au>
					<au>
						<snm>Sapinoso</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Moqrich</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Patapoutian</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hampton</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Schultz</snm>
						<fnm>PG</fnm>
					</au>
					<au>
						<snm>Hogenesch</snm>
						<fnm>JB</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci U S A</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>4465</fpage>
				<lpage>4470</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">123671</pubid>
						<pubid idtype="pmpid" link="fulltext">11904358</pubid>
						<pubid idtype="doi">10.1073/pnas.012025199</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>PreBIND and Textomy &#8211; mining the biomedical literature for protein-protein interactions using a support vector machine</p>
				</title>
				<aug>
					<au>
						<snm>Donaldson</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>de Bruijn</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Wolting</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Lay</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Tuekam</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Baskin</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bader</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Michalickova</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Pawson</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hogue</snm>
						<fnm>CW</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2003</pubdate>
				<volume>4</volume>
				<fpage>11</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">153503</pubid>
						<pubid idtype="pmpid" link="fulltext">12689350</pubid>
						<pubid idtype="doi">10.1186/1471-2105-4-11</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>"Blogs" and "wikis" are valuable software tools for communication within research groups</p>
				</title>
				<aug>
					<au>
						<snm>Sauer</snm>
						<fnm>IM</fnm>
					</au>
					<au>
						<snm>Bialek</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Efimova</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Schwartlander</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Pless</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Neuhaus</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Artif Organs</source>
				<pubdate>2005</pubdate>
				<volume>29</volume>
				<fpage>82</fpage>
				<lpage>83</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1111/j.1525-1594.2004.29005.x</pubid>
						<pubid idtype="pmpid" link="fulltext">15644088</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>BioWiki</p>
				</title>
				<url>http://www.biowiki.org</url>
			</bibl>
			<bibl id="B43">
				<title>
					<p>The Pleiades Project</p>
				</title>
				<url>http://www.cisreg.ca/pleiades/</url>
			</bibl>
		</refgrp>
	</bm>
</art>
