<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/rss.css" type="text/css"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/"
    xmlns:cc="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:extra="http://www.w3.org/1999/xhtml"
    xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <channel rdf:about="http://www.biomedcentral.com/feeds/editorspicks?journal=bmcbioinformatics&amp;quantity=">
        <title>Editor's picks</title>
        <link>http://www.biomedcentral.com/bmcbioinformatics/</link>
        <description>The editor's pick of recent articles published by BMC Bioinformatics</description>
        <dc:date>2012-05-08T00:00:00Z</dc:date>
        <items>
            <rdf:Seq>
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/87" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/79" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/65" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/48" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/42" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/33" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/18" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/8" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/13/7" />
                                <rdf:li rdf:resource="http://www.biomedcentral.com/1471-2105/12/480" />
                            </rdf:Seq>
        </items>
                 <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </channel>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/87">
        <title>PHYLOViZ: Phylogenetic Inference and Data Visualization for Sequence Based Typing Methods</title>
        <description>Background:
With the decrease of DNA sequencing costs, sequence-based typing methods are rapidly becoming the gold standard for epidemiological surveillance. These methods provide reproducible and comparable results needed for a global scale bacterial population analysis, while retaining their usefulness for local epidemiological surveys. Online databases that collect the generated allelic profiles and associated epidemiological data are available, but this wealth of data remains underused and is frequently poorly annotated since no user-friendly tool exists to analyze and explore it.
Results:
PHYLOViZ is platform-independent Java software that allows the integrated analysis of sequence-based typing methods, including SNP data generated from whole genome sequence approaches, and associated epidemiological data. goeBURST and its Minimum Spanning Tree expansion are used for visualizing the possible evolutionary relationships between isolates. The results can be displayed as an annotated graph overlaying the query results of any other epidemiological data available.
Conclusions:
PHYLOViZ is user-friendly software that allows the combined analysis of multiple data sources for microbial epidemiological and population studies. It is freely available at http://www.phyloviz.net.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/87</link>
                <dc:creator>Alexandre P Francisco</dc:creator>
                <dc:creator>Cátia Vaz</dc:creator>
                <dc:creator>Pedro T Monteiro</dc:creator>
                <dc:creator>José Melo-Cristino</dc:creator>
                <dc:creator>Mário Ramirez</dc:creator>
                <dc:creator>João André Carriço</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:87</dc:source>
        <dc:date>2012-05-08T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-87</dc:identifier>
                            <dc:title>Java Application for sequence based typing</dc:title>
                            <dc:description>Phyloviz is exciting free software for visualizing genotype datasets, isolating populations for epidemiological analysis, visualizing potential evolutionary relationships between bacteria, and providing access to publicly available data, plus the potential to export high-resolution images.</dc:description>
                <prism:require>/content/figures/1471-2105-13-87-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>87</prism:startingPage>
        <prism:publicationDate>2012-05-08T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/79">
        <title>Inferring high-confidence human protein-protein interactions</title>
        <description>Background:
As numerous experimental factors drive the acquisition, identification, and interpretation of protein-protein interactions (PPIs), aggregated assemblies of human PPI data invariably contain experiment-dependent noise. Ascertaining the reliability of PPIs collected from these diverse studies and scoring them to infer high-confidence networks is a non-trivial task. Moreover, a large number of PPIs share the same number of reported occurrences, making it impossible to distinguish the reliability of these PPIs and rank-order them. For example, for the data analyzed here, we found that the majority (&gt;83 %) of currently available human PPIs have been reported only once.
Results:
In this work, we proposed an unsupervised statistical approach to score a set of diverse, experimentally identified PPIs from nine primary databases to create subsets of high-confidence human PPI networks. We evaluated this ranking method by comparing it with other methods and assessing their ability to retrieve protein associations from a number of diverse and independent reference sets. These reference sets contain known biological data that are either directly or indirectly linked to interactions between proteins. We quantified the average effect of using ranked protein interaction data to retrieve this information and showed that, when compared to randomly ranked interaction data sets, the proposed method created a larger enrichment (~134 %) than either ranking based on the hypergeometric test (~109 %) or occurrence ranking (~46 %).
Conclusions:
From our evaluations, it was clear that ranked interactions were always of value because higher-ranked PPIs had a higher likelihood of retrieving high-confidence experimental data. Reducing the noise inherent in aggregated experimental PPIs via our ranking scheme further increased the accuracy and enrichment of PPIs derived from a number of biologically relevant data sets. These results suggest that using our high-confidence protein interactions at different levels of confidence will help clarify the topological and biological properties associated with human protein networks.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/79</link>
                <dc:creator>Xueping Yu</dc:creator>
                <dc:creator>Anders Wallqvist</dc:creator>
                <dc:creator>Jaques Reifman</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:79</dc:source>
        <dc:date>2012-05-04T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-79</dc:identifier>
                            <dc:title>Ranking protein-protein interactions</dc:title>
                            <dc:description>A novel unsupervised statistical approach for ranking protein-protein interactions called interaction detection based on shuffling reduces inherent experiment-dependent noise and increases the likelihood of detecting high confidence protein interactions.</dc:description>
                <prism:require>/content/figures/1471-2105-13-79-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>79</prism:startingPage>
        <prism:publicationDate>2012-05-04T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/65">
        <title>The MULTICOM Toolbox for Protein Structure Prediction</title>
        <description>Background:
As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e. ~75,000 or &lt; 0.07%) have solved tertiary structures determined by experimental techniques. The gap between protein sequence and structure continues to enlarge rapidly as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial to make use of this vast repository of protein resources.
Results:
To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction.
Conclusions:
These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near state-of-the-art performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/65</link>
                <dc:creator>Jianlin Cheng</dc:creator>
                <dc:creator>Jilong Li</dc:creator>
                <dc:creator>Zheng Wang</dc:creator>
                <dc:creator>Jesse Eickholt</dc:creator>
                <dc:creator>Xin Deng</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:65</dc:source>
        <dc:date>2012-04-30T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-65</dc:identifier>
                            <dc:title>CASP tested toolbox for protein structure prediction</dc:title>
                            <dc:description>MULTICOM Toolbox contains an array of extensively tested, high-performance tools, including both secondary and tertiary structure predictors, aimed at reducing the growing gap between derived protein sequences and structural determination.</dc:description>
                <prism:require>/content/figures/1471-2105-13-65-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>65</prism:startingPage>
        <prism:publicationDate>2012-04-30T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/48">
        <title>Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer</title>
        <description>Background:
The analysis of next-generation sequencing data from large genomes is a timely research topic. Sequencers are producing billions of short sequence fragments from newly sequenced organisms. Computational methods for reconstructing whole genomes/transcriptomes (de novo assemblers) are typically employed to process such data. However, these methods require large memory resources and computation time. Many basic biological questions could be answered targeting specific information in the reads, thus avoiding complete assembly.
Results:
We present Mapsembler, an iterative micro and targeted assembler which processes large datasets of reads on commodity hardware. Mapsembler checks for the presence of given regions of interest that can be constructed from reads and builds a short assembly around it, either as a plain sequence or as a graph, showing contextual structure. We introduce new algorithms to retrieve approximate occurrences of a sequence from reads and construct an extension graph. Among other results presented in this paper, Mapsembler enabled the retrieval of previously described human breast cancer candidate fusion genes and the detection of new ones not previously known.
Conclusions:
Mapsembler is the first software that enables de novo discovery around a region of interest of repeats, SNPs, exon skipping, gene fusion, as well as other structural events, directly from raw sequencing reads. As indexing is localized, the memory footprint of Mapsembler is negligible. Mapsembler is released under the CeCILL license and can be freely downloaded from http://alcovna.genouest.org/mapsembler/.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/48</link>
                <dc:creator>Pierre Peterlongo</dc:creator>
                <dc:creator>Rayan Chikhi</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:48</dc:source>
        <dc:date>2012-03-23T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-48</dc:identifier>
                            <dc:title>A novel greedy algorithm for NGS assembly</dc:title>
                            <dc:description>Mapsembler, a sequence assembly tool for desktop computers, uses NGS reads from newly sequenced, non-assembled genomes or transcriptomes to provide targeted assemblies and has retrieved known gene fusions in human cancer and discovered new ones</dc:description>
                <prism:require>/content/figures/1471-2105-13-48-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>48</prism:startingPage>
        <prism:publicationDate>2012-03-23T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/42">
        <title>Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community</title>
        <description>Background:
A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure.
Results:
Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool&apos;s functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance are also publicly available for download and use by researchers with access to private clouds.
Conclusions:
Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application-specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/42</link>
                <dc:creator>Konstantinos Krampis</dc:creator>
                <dc:creator>Tim Booth</dc:creator>
                <dc:creator>Brad Chapman</dc:creator>
                <dc:creator>Bela Tiwari</dc:creator>
                <dc:creator>Mesude Bicak</dc:creator>
                <dc:creator>Dawn Field</dc:creator>
                <dc:creator>Karen Nelson</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:42</dc:source>
        <dc:date>2012-03-19T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-42</dc:identifier>
                            <dc:title>Virtual machine for high-performance bioinformatics</dc:title>
                            <dc:description>Cloud BioLinux includes over 135 pre-configured bioinformatics tools for sequence alignment, clustering, assembly, display and phylogeny, aimed at reducing costs to researchers of maintaining and configuring hardware, and encouraging sharing of codebase.</dc:description>
                <prism:require>/content/figures/1471-2105-13-42-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>42</prism:startingPage>
        <prism:publicationDate>2012-03-19T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/33">
        <title>BLANNOTATOR: enhanced homology-based function prediction of bacterial proteins</title>
        <description>Background:
Automated function prediction has played a central role in determining the biological functions of bacterial proteins. Typically, protein function annotation relies on homology, and function is inferred from other proteins with similar sequences. This approach has become popular in bacterial genomics because it is one of the few methods that is practical for large datasets and because it does not require additional functional genomics experiments. However, the existing solutions produce erroneous predictions in many cases, especially when query sequences have low levels of identity with the annotated source protein. This problem has created a pressing need for improvements in homology-based annotation.
Results:
We present an automated method for the functional annotation of bacterial protein sequences. Based on sequence similarity searches, BLANNOTATOR accurately annotates query sequences with one-line summary descriptions of protein function. It groups sequences identified by BLAST into subsets according to their annotation and bases its prediction on a set of sequences with consistent functional information. We show the results of BLANNOTATOR&apos;s performance in sets of bacterial proteins with known functions. We simulated the annotation process for 3090 SWISS-PROT proteins using a database in its state preceding the functional characterisation of the query protein. For this dataset, our method outperformed the five others that we tested, and the improved performance was maintained even in the absence of highly related sequence hits. We further demonstrate the value of our tool by analysing the putative proteome of Lactobacillus crispatus strain ST1.
Conclusions:
BLANNOTATOR is an accurate method for bacterial protein function prediction. It is practical for genome-scale data and does not require pre-existing sequence clustering; thus, this method suits the needs of bacterial genome and metagenome researchers. The method and a web-server are available at http://ekhidna.biocenter.helsinki.fi/poxo/blannotator/.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/33</link>
                <dc:creator>Matti Kankainen</dc:creator>
                <dc:creator>Teija Ojala</dc:creator>
                <dc:creator>Liisa Holm</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:33</dc:source>
        <dc:date>2012-02-15T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-33</dc:identifier>
                            <dc:title>A Bacterial genome protein function predictor</dc:title>
                            <dc:description>BLANNOTATOR, an automated, accurate method for bacterial genome annotation, applicable to metagenomics research as it does not require pre-existing sequence clustering, has outperformed five tested methods even in the absence of highly similar sequence hits.</dc:description>
                <prism:require>/content/figures/1471-2105-13-33-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>33</prism:startingPage>
        <prism:publicationDate>2012-02-15T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>PDF</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/18">
        <title>Propagating semantic information in biochemical network models</title>
        <description>Background:
To enable automatic searches, alignments, and model combination, the elements of systems biology models need to be compared and matched across models. Elements can be identified by machine-readable biological annotations, but assigning such annotations and matching non-annotated elements is tedious work and calls for automation.
Results:
A new method called &quot;semantic propagation&quot; allows the comparison of model elements based not only on their own annotations, but also on annotations of surrounding elements in the network. One may either propagate feature vectors, describing the annotations of individual elements, or quantitative similarities between elements from different models. Based on semantic propagation, we align partially annotated models and find annotations for non-annotated model elements.
Conclusions:
Semantic propagation and model alignment are included in the open-source library semanticSBML, available on sourceforge. Online services for model alignment and for annotation prediction can be used at http://www.semanticsbml.org.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/18</link>
                <dc:creator>Marvin Schulz</dc:creator>
                <dc:creator>Edda Klipp</dc:creator>
                <dc:creator>Wolfram Liebermeister</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:18</dc:source>
        <dc:date>2012-01-30T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-18</dc:identifier>
                            <dc:title>Semantic annotation of network models</dc:title>
                            <dc:description>Semantic propagation is a new method that applies annotation to systems biology models, enabling comparisons based on both their own annotations, and those of surrounding elements in the network, facilitating the alignment of models with missing annotations</dc:description>
                <prism:require>/content/figures/1471-2105-13-18-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>18</prism:startingPage>
        <prism:publicationDate>2012-01-30T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>XML</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/8">
        <title>An integrative variant analysis suite for whole exome next-generation sequencing data</title>
        <description>Background:
Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data.
Results:
Using statistical models trained on validated whole-exome capture sequencing data, the Atlas2 Suite is an integrative variant analysis pipeline optimized for variant discovery on all three of the widely used next generation sequencing platforms (SOLiD, Illumina, and Roche 454). The suite employs logistic regression models in conjunction with user-adjustable cutoffs to accurately separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity (96.7%).
Conclusion:
We have implemented the Atlas2 Suite and applied it to 92 whole exome samples from the 1000 Genomes Project. The Atlas2 Suite is available for download at http://sourceforge.net/projects/atlas2/. In addition to a command line version, the suite has been integrated into the Genboree Workbench, allowing biomedical scientists with minimal informatics expertise to remotely call, view, and further analyze variants through a simple web interface. The existing genomic databases displayed via the Genboree browser also streamline the process from variant discovery to functional genomics analysis, resulting in an off-the-shelf toolkit for the broader community.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/8</link>
                <dc:creator>Danny Challis</dc:creator>
                <dc:creator>Jin Yu</dc:creator>
                <dc:creator>Uday S Evani</dc:creator>
                <dc:creator>Andrew R Jackson</dc:creator>
                <dc:creator>Sameer Paithankar</dc:creator>
                <dc:creator>Cristian Coarfa</dc:creator>
                <dc:creator>Aleksandar Milosavljevic</dc:creator>
                <dc:creator>Richard A Gibbs</dc:creator>
                <dc:creator>Fuli Yu</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:8</dc:source>
        <dc:date>2012-01-12T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-8</dc:identifier>
                            <dc:title>Integrated analysis of exome sequencing data</dc:title>
                            <dc:description>Atlas2 is an analysis suite for variant calling in whole exome sequencing that combines regression models and user-adjustable cutoffs in order to separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity.</dc:description>
                <prism:require>/content/figures/1471-2105-13-8-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>8</prism:startingPage>
        <prism:publicationDate>2012-01-12T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>XML</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/13/7">
        <title>Protein docking prediction using predicted protein-protein interface</title>
        <description>Background:
Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations.
Results:
We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies from case to case, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by a second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering.
Conclusion:
We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.</description>
        <link>http://www.biomedcentral.com/1471-2105/13/7</link>
                <dc:creator>Bin Li</dc:creator>
                <dc:creator>Daisuke Kihara</dc:creator>
                <dc:source>BMC Bioinformatics 2012, 13:7</dc:source>
        <dc:date>2012-01-10T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-13-7</dc:identifier>
                            <dc:title>A better predictor of protein interaction</dc:title>
                            <dc:description>A novel pairwise protein docking algorithm, PI-LZerD, performs better than alternatives in benchmark experiments; it is an important development applicable to multiple protein docking and valuable for providing pictures of interactions in network analyses.</dc:description>
                <prism:require>/content/figures/1471-2105-13-7-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>13</prism:volume>
        <prism:startingPage>7</prism:startingPage>
        <prism:publicationDate>2012-01-10T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>XML</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <item rdf:about="http://www.biomedcentral.com/1471-2105/12/480">
        <title>GC-Content Normalization for RNA-Seq Data</title>
        <description>Background:
Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof.
Results:
We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq.
Conclusions:
Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential expression. Such results are crucial for the biological interpretation of RNA-Seq experiments, where downstream analyses can be sensitive to the supplied lists of genes.</description>
        <link>http://www.biomedcentral.com/1471-2105/12/480</link>
                <dc:creator>Davide Risso</dc:creator>
                <dc:creator>Katja Schwartz</dc:creator>
                <dc:creator>Gavin Sherlock</dc:creator>
                <dc:creator>Sandrine Dudoit</dc:creator>
                <dc:source>BMC Bioinformatics 2011, 12:480</dc:source>
        <dc:date>2011-12-17T00:00:00Z</dc:date>
        <dc:identifier>10.1186/1471-2105-12-480</dc:identifier>
                            <dc:title>Bias correction for RNA-seq data</dc:title>
                            <dc:description>The combination of three different strategies for GC-content normalization of RNA-seq data leads to more accurate estimations of gene expression levels and fold-changes, making statistical inference of differential expression less prone to false discoveries.</dc:description>
                <prism:require>/content/figures/1471-2105-12-480-toc.gif</prism:require>
                <prism:publicationName>BMC Bioinformatics</prism:publicationName>
        <prism:issn>1471-2105</prism:issn>
        <prism:volume>12</prism:volume>
        <prism:startingPage>480</prism:startingPage>
        <prism:publicationDate>2011-12-17T00:00:00Z</prism:publicationDate>
                <prism:versionidentifier>XML</prism:versionidentifier>
                <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
    </item>
        <cc:License rdf:about="http://creativecommons.org/licenses/by/2.0/">
        <cc:permits rdf:resource="http://creativecommons.org/ns#Reproduction" />
        <cc:permits rdf:resource="http://creativecommons.org/ns#Distribution" />
        <cc:permits rdf:resource="http://creativecommons.org/ns#DerivativeWorks" />
    </cc:License>
</rdf:RDF>

