<?xml version = '1.0' encoding = 'UTF-8'?>
<?xml-stylesheet href="/rss/styledrssBMC.css" type="text/css"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:extra="http://www.biomedcentral.com/xml/schemas/extra/" xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/" xmlns:cc="http://web.resource.org/cc/">
	<channel rdf:about="http://www.biomedcentral.com/rss">
		<extra:info rdf:parseType="Literal">
			<html:div xmlns:html="http://www.w3.org/1999/xhtml" style="font:14px Verdana, Geneva, Arial, Helvetica, sans-serif">
				<html:span style="font-weight:bold">This is an RSS newsfeed from BioMed Central</html:span>
				<html:br/>
				<html:span style="font-size: 12px;">It is intended to be used with an RSS reader. For more information about RSS newsfeeds from BioMed Central, visit <html:br/><html:a href="http://www.biomedcentral.com/info/about/rss/" style="color:#3333CC; font-size:12px;">http://www.biomedcentral.com/info/about/rss/</html:a><html:br/>
				</html:span>
			</html:div>
		</extra:info>
		<title>BMC Bioinformatics - Latest articles</title>
		<link>http://www.biomedcentral.com/bmcbioinformatics/</link>
		<description>The latest articles from BMC Bioinformatics (ISSN 1471-2105) published by BioMed Central</description>
        <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        <items>
            <rdf:Seq>
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/233"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/232"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/231"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/230"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/229"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/228"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/227"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/226"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/225"/>			    
            
				    <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/9/224"/>			    
            
            </rdf:Seq>
        </items>
    </channel>  
    
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/233">
            
            <title>Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes</title>
			<description>Background:
Bacterial promoters, which increase the efficiency of gene expression, differ from other promoters by several characteristics. This difference, not yet widely exploited in bioinformatics, looks promising for the development of relevant computational tools to search for strong promoters in bacterial genomes. 
Results:
We describe a new triad pattern algorithm that predicts strong promoter candidates in annotated bacterial genomes by matching specific patterns for the group I sigma 70 factors of Escherichia coli RNA polymerase. It detects promoter-specific motifs by consecutively matching three patterns, consisting of an UP-element, required for interaction with the alpha subunit, and then optimally-separated patterns of -35 and -10 boxes, required for interaction with the sigma 70 subunit of RNA polymerase. Analysis of 43 bacterial genomes revealed that the frequency of candidate sequences depends on the A+T content of the DNA under examination. The accuracy of in silico prediction was experimentally validated for the genome of a hyperthermophilic bacterium, Thermotoga maritima, by applying a cell-free expression assay using the predicted strong promoters. In this organism, the strong promoters govern genes for translation, energy metabolism, transport, cell movement, and other as-yet unidentified functions. 
Conclusions:
The triad pattern algorithm developed for predicting strong bacterial promoters is well suited for analyzing bacterial genomes with an A+T content of less than 62%. This computational tool opens new prospects for investigating global gene expression, and individual strong promoters in bacteria of medical and/or economic significance.  </description>
			<link>http://www.biomedcentral.com/1471-2105/9/233</link>
			
			 	<dc:creator>Michael Dekhtyar, Amelie Morin and Vehary Sakanyan</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:233</dc:source>
			<dc:date>2008-05-09</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-233</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>233</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-09</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/232">
            
            <title>Bioinformatic analyses of mammalian 5'-UTR sequence properties of mRNAs predicts alternative translation initiation sites</title>
			<description>Background:
Utilization of alternative initiation sites for protein translation directed by non-AUG codons in mammalian mRNAs is observed with increasing frequency.  Alternative initiation sites are utilized for the synthesis of important regulatory proteins that control distinct biological functions.  It is, therefore, of high significance to define the parameters that allow accurate bioinformatic prediction of alternative translation initiation sites (aTIS).  This study has investigated 5'-UTR regions of mRNAs to define consensus sequence properties and structural features that allow identification of alternative initiation sites for protein translation.  
Results:
Bioinformatic evaluation of 5'-UTR sequences of mammalian mRNAs was conducted for classification and identification of alternative translation initiation sites for a group of mRNA sequences that have been experimentally demonstrated to utilize alternative non-AUG initiation sites for protein translation. These are represented by the codons CUG, GUG, UUG, AUA, and ACG for aTIS. The first phase of this bioinformatic analysis implements a classification tree that evaluates 5'-UTRs for unique consensus sequence features near the initiation codon, characteristics of 5'-UTR nucleotide sequences, and secondary structural features in a decision tree that categorizes mRNAs into those with potential aTIS, and those without. The second phase addresses identification of the aTIS codon and its location. Critical parameters of 5'-UTRs were assessed by an Artificial Neural Network (ANN) for identification of the aTIS codon and its location. ANNs have previously been used for the purpose of AUG start site prediction and are applicable in complex. ANN analyses demonstrated that multiple properties were required for predicting aTIS codons; these properties included unique consensus nucleotide sequences at positions -7 and -6 combined with positions -3 and +4, 5'-UTR length, ORF length, predicted secondary structures, free energy features, upstream AUGs, and G/C ratio. Importantly, combined results of the classification tree and the ANN analyses provided highly accurate bioinformatic predictions of alternative translation initiation sites.
Conclusions:
This study has defined the unique properties of 5'-UTR sequences of mRNAs for successful bioinformatic prediction of alternative initiation sites utilized in protein translation.  The ability to define aTIS through the described bioinformatic analyses can be of high importance for genomic analyses to provide full predictions of translated mammalian and human gene products required for cellular functions in health and disease.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/232</link>
			
			 	<dc:creator>Jill L Wegrzyn, Thomas M Drudge, Faramarz Valafar and Vivian Hook</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:232</dc:source>
			<dc:date>2008-05-08</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-232</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>232</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-08</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/231">
            
            <title>Automating dChip: toward reproducible sharing of microarray data analysis</title>
			<description>Background:
During the past decade, many software packages have been developed for analysis and visualization of various types of microarrays. We have developed and maintained the widely used dChip as a microarray analysis software package accessible to both biologists and data analysts. However, challenges arise when dChip users want to analyze a large number of arrays automatically and share data analysis procedures and parameters. Improvement is also needed when the dChip user support team tries to identify the causes of analysis errors or bugs reported by users.
Results:
We report here implementation and application of the dChip automation module. Through this module, dChip automation files can be created to include menu steps, parameters, and data viewpoints to run automatically. A data-packaging function allows convenient transfer from one user to another of the dChip software, microarray data, and analysis procedures, so that the second user can reproduce the entire analysis session of the first user. An analysis report file can also be generated during an automated run, including analysis logs, user comments, and viewpoint screenshots. 
Conclusions:
The dChip automation module is a step toward reproducible research, and it can promote a more convenient and reproducible mechanism for sharing microarray software, data, and analysis procedures and results. Automation data packages can also be used as publication supplements. Similar automation mechanisms could be valuable to the research community if implemented in other genomics and bioinformatics software packages.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/231</link>
			
			 	<dc:creator>Cheng Li</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:231</dc:source>
			<dc:date>2008-05-08</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-231</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>231</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-08</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/230">
            
            <title>CPSP-tools - exact and complete algorithms for high-throughput 3D lattice protein studies</title>
			<description>Background:
The principles of protein folding and evolution pose problems of very high inherent complexity. Often these problems are tackled using simplified protein models, e.g. lattice proteins. The CPSP-tools package provides programs to solve exactly and completely the problems typical of studies using 3D lattice protein models. Among the tasks addressed are the prediction of (all) globally optimal and/or suboptimal structures as well as sequence design and neutral network exploration.
Results:
In contrast to stochastic approaches, which are not capable of answering many fundamental questions, our methods are based on fast, non-heuristic techniques. The resulting tools are designed for high-throughput studies of 3D-lattice proteins utilising the Hydrophobic-Polar (HP) model. The source bundle is freely available at http://www.bioinf.uni-freiburg.de/sw/cpsp/
Conclusions:
The CPSP-tools package is the first set of exact and complete methods for extensive, high-throughput studies of non-restricted 3D-lattice protein models. In particular, our package deals with cubic and face centered cubic (FCC) lattices.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/230</link>
			
			 	<dc:creator>Martin Mann, Sebastian Will and Rolf Backofen</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:230</dc:source>
			<dc:date>2008-05-07</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-230</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>230</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-07</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/229">
            
            <title>A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences</title>
			<description>Background:
The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant.
Results:
We have developed a method for scoring the conservation of linear motif instances. It requires only primary sequence-derived information (e.g. multiple alignment and sequence tree) and takes into account the degenerate nature of linear motif patterns. On our benchmarking, the method accurately scores 86% of the known positive instances, while distinguishing them from random matches in 78% of the cases. The conservation score is implemented as a real time application designed to be integrated into other tools. It is currently accessible via a Web Service or through a graphical interface.
Conclusions:
The conservation score improves the prediction of linear motifs, by discarding those matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences. It is especially useful for instances in non-structured regions of the proteins, where a domain masking filtering strategy is not applicable.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/229</link>
			
			 	<dc:creator>Claudia Chica, Alberto Labarga, Cathryn M Gould, Rodrigo Lopez and Toby J Gibson</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:229</dc:source>
			<dc:date>2008-05-06</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-229</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>229</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-06</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/228">
            
            <title>Inferring the role of transcription factors in regulatory networks</title>
			<description>Background:
Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays.
Results:
We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions.
Conclusions:
Our approach requires neither accurate expression levels nor time series. Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data is noisy.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/228</link>
			
			 	<dc:creator>Philippe Veber, Carito Guziolowski, Michel Le Borgne, Ovidiu Radulescu and Anne Siegel</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:228</dc:source>
			<dc:date>2008-05-06</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-228</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>228</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-06</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/227">
            
            <title>The pairwise disconnectivity index as a new metric for the topological analysis of regulatory networks</title>
			<description>Background:
Currently, there is a gap between purely theoretical studies of the topology of large bioregulatory networks and the practical traditions and interests of experimentalists. While the theoretical approaches emphasize the global characterization of regulatory systems, the practical approaches focus on the role of distinct molecules and genes in regulation. To bridge the gap between these opposite approaches, one needs to combine 'general' with 'particular' properties and translate abstract topological features of large systems into testable functional characteristics of individual components. Here, we propose a new topological parameter - the pairwise disconnectivity index of a network's element - that is capable of such bridging.
Results:
The pairwise disconnectivity index quantifies how crucial an individual element is for sustaining the communication ability between connected pairs of vertices in a network that is displayed as a directed graph. Such an element might be a vertex (i.e., molecules, genes), an edge (i.e., reactions, interactions), as well as a group of vertices and/or edges. The index can be viewed as a measure of topological redundancy of regulatory paths which connect different parts of a given network and as a measure of sensitivity (robustness) of this network to the presence (absence) of each individual element. Accordingly, we introduce the notion of a path-degree of a vertex in terms of its corresponding incoming, outgoing and mediated paths, respectively. The pairwise disconnectivity index has been applied to the analysis of several regulatory networks from various organisms. The importance of an individual vertex or edge for the coherence of the network is determined by the particular position of the given element in the whole network. 
Conclusions:
Our approach makes it possible to evaluate the effect of removing each element (i.e., vertex, edge, or their combinations) from a network. The greatest potential value of this approach is its ability to systematically analyze the role of every element, as well as groups of elements, in a regulatory network.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/227</link>
			
			 	<dc:creator>Anatolij P. Potapov, Bjorn Goemann and Edgar Wingender</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:227</dc:source>
			<dc:date>2008-05-02</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-227</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>227</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-02</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/226">
            
            <title>SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences</title>
			<description>Background:
Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds, and thus determining structural similarity without sequence similarity would be desirable for structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose the SCPRED method, which improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction.
Results:
SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. 
Conclusions:
SCPRED can accurately find similar structures for sequences that share low identity with the sequences used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that SCPRED's predictions can be successfully used as a post-processing filter to improve performance of modern fold classification methods.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/226</link>
			
			 	<dc:creator>Lukasz Kurgan, Krzysztof Cios and Ke Chen</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:226</dc:source>
			<dc:date>2008-05-01</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-226</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>226</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-01</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/225">
            
            <title>A copula method for modeling directional dependence of genes</title>
			<description>Background:
Genes interact with each other as basic building blocks of life, forming a complicated network. The relationship between groups of genes with different functions can be represented as gene networks. With the deposition of huge microarray data sets in public domains, study on gene networking is now possible. In recent years, there has been an increasing interest in the reconstruction of gene networks from gene expression data. Recent work includes linear models, Boolean network models, and Bayesian networks. Among them, Bayesian networks seem to be the most effective in constructing gene networks. A major problem with the Bayesian network approach is the excessive computational time. This problem is due to the iterative feature of the method, which requires a large search space. Since fitting a model by using copulas does not require iterations, elicitation of the priors, or complicated calculations of posterior distributions, the need for reference to extensive search spaces can be eliminated, leading to manageable computational effort. The Bayesian network approach produces a discrete expression of conditional probabilities. Discreteness of the characteristics is not required in the copula approach, which involves use of a uniform representation of the continuous random variables. Our method is able to overcome the limitation of the Bayesian network method for gene-gene interaction, i.e. information loss due to binary transformation.
Results:
We analyzed the gene interactions for two gene data sets (one group is eight histone genes and the other group is 19 genes which include DNA polymerases, DNA helicase, type B cyclin genes, DNA primases, radiation sensitive genes, repair-related genes, replication protein A encoding gene, DNA replication initiation factor, securin gene, nucleosome assembly factor, and a subunit of the cohesin complex) by adopting a measure of directional dependence based on a copula function. We have compared our results with those from other methods in the literature. Although microarray results show a transcriptional co-regulation pattern and do not imply that the gene products are physically interactive, this tight genetic connection may suggest that each gene product has either direct or indirect connections with the other gene products. Indeed, recent comprehensive analysis of a protein interaction map revealed that those histone genes are physically connected with each other, supporting the results obtained by our method.
Conclusions:
The results illustrate that our method can be an alternative to Bayesian networks in modeling gene interactions. One advantage of our approach is that dependence between genes is not assumed to be linear. Another advantage is that our approach can detect directional dependence. We expect that our study may help to design artificial drug candidates, which can block or activate biologically meaningful pathways. Moreover, our copula approach can be extended to investigate the effects of local environments on protein-protein interactions. The copula mutual information approach will help to propose a new variant of ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks): an algorithm for the reconstruction of gene regulatory networks.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/225</link>
			
			 	<dc:creator>Jong-Min Kim, Yoon-Sung Jung, Engin A Sungur, Kap-Hoon Han, Changyi Park and Insuk Sohn</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:225</dc:source>
			<dc:date>2008-05-01</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-225</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>225</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-05-01</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
	
		<item rdf:about="http://www.biomedcentral.com/1471-2105/9/224">
            
            <title>Implementing EM and Viterbi algorithms for Hidden Markov Model in linear memory</title>
			<description>Background:
The Baum-Welch learning procedure for Hidden Markov Models (HMMs) provides a powerful tool for tailoring HMM topologies to data for use in knowledge discovery and clustering. A linear memory procedure recently proposed by Miklos, I. and Meyer, I.M. describes a memory sparse version of the Baum-Welch algorithm with modifications to the original probabilistic table topologies to make memory use independent of sequence length (and linearly dependent on state number). The original description of the technique has some errors that we amend. We then compare the corrected implementation on a variety of data sets with conventional and checkpointing implementations.
Results:
We provide a correct recurrence relation for the emission parameter estimate and extend it to parameter estimates of the Normal distribution. To accelerate estimation of the prior state probabilities, and decrease memory use, we reverse the originally proposed forward sweep. We describe different scaling strategies necessary in all real implementations of the algorithm to prevent underflow. In this paper we also describe our approach to a linear memory implementation of the Viterbi decoding algorithm (with linearity in the sequence length, while memory use is approximately independent of state number). We demonstrate the use of the linear memory implementation on an extended Duration Hidden Markov Model (DHMM) and on an HMM with a spike detection topology. Comparing the various implementations of the Baum-Welch procedure we find that the checkpointing algorithm produces the best overall tradeoff between memory use and speed. In cases where sequence length is very large (for Baum-Welch), or state number is very large (for Viterbi), the linear memory methods outlined may offer some utility.
Conclusions:
Our performance-optimized Java implementations of the Baum-Welch algorithm are available at http://logos.cs.uno.edu/~achurban. The described method and implementations will aid sequence alignment, gene structure prediction, HMM profile training, nanopore ionic flow blockades analysis and many other domains that require efficient HMM training with EM.</description>
			<link>http://www.biomedcentral.com/1471-2105/9/224</link>
			
			 	<dc:creator>Alexander Churbanov and Stephen Winters-Hilt</dc:creator>
			
			<dc:source>BMC Bioinformatics 2008, 9:224</dc:source>
			<dc:date>2008-04-30</dc:date>
			<dc:identifier>doi:10.1186/1471-2105-9-224</dc:identifier>
			
			
							
					<prism:publicationName>BMC Bioinformatics</prism:publicationName>
					
			
							
					<prism:issn>1471-2105</prism:issn>
					
			
							
					<prism:volume>9</prism:volume>
					
			
							
					<prism:startingPage>224</prism:startingPage>
					
			
							
					<prism:publicationDate>2008-04-30</prism:publicationDate>
					

            <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
        </item>
		
    <cc:License rdf:about="http://creativecommons.org/licenses/by/2.0/">
         <cc:permits rdf:resource="http://creativecommons.org/ns#Reproduction"/>
         <cc:permits rdf:resource="http://creativecommons.org/ns#Distribution"/>
         <cc:permits rdf:resource="http://creativecommons.org/ns#DerivativeWorks"/>
	</cc:License>
</rdf:RDF>
